#151: Renamed some fields for future expansion#150
Conversation
4da3d9c to
5add1c6
Compare
Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
5add1c6 to
ea46c9a
Compare
|
| Filename | Overview |
|---|---|
| include/daqiri/types.h | Adds config_engine_from_string/config_engine_to_string, new helper predicates (is_explicit_manager_type, manager_type_supports_stream_type, manager_type_supports_socket_protocol), adds engine and default-initialises manager_type on CommonConfig, and adds local_addr_/remote_addr_ to SocketConfig. Updates get_rdma_configs_enabled to detect RDMA via URI schemes. |
| include/daqiri/common.h | Major refactor of YAML decode: drops the required top-level protocol: field in favour of URI schemes inferred from local_addr/remote_addr. Adds resolve_manager_type lambda called twice (before and after per-interface parse_socket_config). Backward compat via legacy local_ip/local_port + top-level protocol: preserved. rdma_.port_ assignment for RoCE clients changed — see comment. |
| src/common.cpp | Adds URI endpoint parser (parse_endpoint_addr, apply_endpoint_addr, parse_endpoint_query, make_endpoint_addr) in an anonymous namespace; extends parse_socket_config to accept the protocol by ref and infer it from URI schemes. Validation logic is thorough; RoCE client remote_addr is explicitly rejected with a clear error. |
| src/manager.cpp | Extends get_manager_type to check the new engine: field first, then fall back to protocol inference from local_addr/remote_addr URI schemes. config_engine_from_string can still throw inside the existing catch(std::exception) that silently returns default (pre-existing issue flagged in a prior review thread). |
| src/managers/socket/daqiri_socket_mgr.cpp | Minimal change: reformats is_roce_protocol() and updates an error message string to reference roce:// endpoints instead of protocol=roce. |
| python/daqiri_common_pybind.cpp | Exposes new engine field on CommonConfig and local_addr/remote_addr on SocketConfig to Python bindings. Straightforward additive change. |
| examples/daqiri_bench_rdma_tx_rx.yaml | Converts all socket configs from local_ip/local_port/remote_ip/remote_port + top-level protocol: "roce" to local_addr: "roce://host[:port]" URI form. Also fills in rdma_bench_client.client_address which was empty in the old file. |
| docs/api-reference/configuration.md | Replaces protocol: key documentation with engine: and URI scheme guidance; documents socket_config.local_addr/remote_addr alongside legacy fields; marks the RoCE example using the new URI form. |
Reviews (3): Last reviewed commit: "Renamed some fields for future expansion" | Re-trigger Greptile
| if (is_known_but_unavailable) { | ||
| throw std::invalid_argument( | ||
| "Engine '" + str + "' is not available in this build. " | ||
| "Available engines: " + available_engines + ". " | ||
| "To enable '" + str + "', rebuild with CMake option: " | ||
| "-DDAQIRI_MGR=\"" + available_engines + " " + | ||
| (str == DAQIRI_ENGINE_STR__IBVERBS ? DAQIRI_MGR_STR__RDMA : str) + | ||
| "\""); | ||
| } |
There was a problem hiding this comment.
CMake rebuild suggestion uses YAML engine name instead of CMake module name
The available_engines string is built from YAML-layer names ("dpdk", "socket", "ibverbs"), but it is then embedded verbatim inside a -DDAQIRI_MGR="..." CMake suggestion. The YAML name for the RDMA engine is "ibverbs", but the corresponding DAQIRI_MGR value is "rdma". On a build that has SOCKET and RDMA compiled in but not DPDK, a user who writes engine: "dpdk" will see the instruction to rebuild with -DDAQIRI_MGR="socket ibverbs dpdk". Passing "ibverbs" to DAQIRI_MGR is not a valid CMake token; the correct command is -DDAQIRI_MGR="socket rdma dpdk". The (str == DAQIRI_ENGINE_STR__IBVERBS ? DAQIRI_MGR_STR__RDMA : str) expression on line 459 fixes the requested engine, but leaves the already-present engines in available_engines with their YAML names. A separate available_cmake_names string (substituting DAQIRI_MGR_STR__RDMA for DAQIRI_ENGINE_STR__IBVERBS) should be used in the CMake suggestion.
| if (node["engine"].IsDefined()) { | ||
| input_spec.common_.engine = | ||
| daqiri::config_engine_from_string(node["engine"].as<std::string>()); |
There was a problem hiding this comment.
config_engine_from_string throws but the YAML decode path has no try/catch here
Every other validation in this decode function follows the pattern DAQIRI_LOG_ERROR(…); return false;. config_engine_from_string breaks that contract by throwing std::invalid_argument for any unrecognised or unavailable engine value. The call on line 952 is not wrapped in a try/catch, so an invalid engine: field propagates as an unhandled exception out of the YAML decode function — bypassing the structured error log and producing an opaque crash instead of a clean parse failure. Either catch the exception here (logging it via DAQIRI_LOG_ERROR and returning false), or change config_engine_from_string to return a sentinel (ManagerType::UNKNOWN) and let the caller emit the diagnostic in the usual way.
| const std::string engine_str = node["engine"].template as<std::string>(""); | ||
| if (!engine_str.empty() && engine_str != DAQIRI_MGR_STR__DEFAULT) { | ||
| return config_engine_from_string(engine_str); | ||
| } |
There was a problem hiding this comment.
Invalid
engine: is silently swallowed here, returning the wrong manager type
config_engine_from_string throws std::invalid_argument on an unknown or unavailable engine string. This call sits inside the try { … } catch (const std::exception& e) { return get_default_manager_type(); } block at line 157, so an invalid YAML engine: field silently returns the default manager type instead of surfacing an error. The caller proceeds to instantiate and initialize a manager that does not match the user's intent, and the root cause — the bad engine string — is never logged. Consider returning ManagerType::UNKNOWN on error (rather than throwing) in config_engine_from_string, or re-throw/log the exception before returning the default.
| | --- | --- | | ||
| | `manager_type_from_string(str)` / `manager_type_to_string(type)` | Convert manager types. | | ||
| | `stream_type_from_string(str)` / `stream_type_to_string(type)` | Convert stream types. | | ||
| | `socket_protocol_from_string(str)` / `socket_protocol_to_string(protocol)` | Convert socket protocols. | | ||
| | `reorder_data_type_from_string(str)` / `reorder_data_type_to_string(type)` | Convert reorder data types. | | ||
| | `reorder_endianness_from_string(str)` / `reorder_endianness_to_string(endianness)` | Convert reorder endianness values. | | ||
| | `log_level_from_string(str)` / `log_level_to_string(level)` | Convert log levels. | |
There was a problem hiding this comment.
Python docs remove
SocketProtocol / socket_protocol_from_string / IPPROTO_UDP, but the pybind code still exposes them
python/daqiri_common_pybind.cpp still binds the SocketProtocol enum (lines 348-352), IPPROTO_UDP (line 665), and socket_protocol_from_string/socket_protocol_to_string (lines 712-713). The CommonConfig.protocol attribute (which this PR preserves) returns a SocketProtocol value, so the enum remains necessary for any code that inspects parsed config programmatically. Removing it from the docs while keeping it in the bindings is a confusing API contract: callers inspecting .protocol on a CommonConfig returned by parse_network_config will encounter values of an undocumented type. Either restore the SocketProtocol table entry (treating the old protocol-based API as deprecated), or remove the corresponding pybind bindings and deprecate CommonConfig.protocol access in the same PR.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
aa63c1e to
ea46c9a
Compare
…l field Port the field-expansion model (originally ea46c9a / PR #150) onto the engine-rename branch, expressed in engine naming: - Endpoints carry the protocol as a URI scheme (tcp://, udp://, roce://) in socket_config.local_addr / remote_addr. The separate protocol: YAML field is removed; protocol is derived from the address scheme (protocol_from_endpoint_addr in engine.cpp). - Optional engine: override field (config_engine_from_string); accepts dpdk / socket / ibverbs, rejects 'rdma' in favor of 'ibverbs'. When unset, engine is auto-resolved from stream_type + protocol. - CommonConfig gains 'engine' (override) alongside 'engine_type' (resolved). pybind exposes both. - Example socket/RDMA YAMLs converted to URI endpoints. Docs updated: configuration.md drops protocol and documents engine; concepts.md, configuration-walkthrough.md, README, benchmarks, socket_benchmarking use URI schemes. - Fix a rename-introduced bug: the legacy-key check now rejects the old 'manager:' key (the mechanical rename had made it wrongly reject the new 'engine:' field). Verified: full container rebuild (DAQIRI_ENGINE="dpdk ibverbs", exit 0); socket UDP run parses udp:// endpoints and derives protocol from the scheme; raw sw-loopback TX==RX. Signed-off-by: Cliff Burdick <cburdick@nvidia.com>
No description provided.