12 changes: 6 additions & 6 deletions README.md
@@ -29,13 +29,13 @@ DAQIRI provides direct NIC hardware access in userspace, bypassing the Linux ker
- **Optional OpenTelemetry metrics** — Expose per-interface or per-queue packet,
byte, and drop counters when built with `DAQIRI_ENABLE_OTEL_METRICS=ON`.

### Backends
### Engines

| Backend | Config value | Description |
| Engine | Config selector | Description |
|---------|-------------|-------------|
| DPDK | `dpdk` | Userspace packet processing with DPDK mbufs and rings. |
| RDMA | `rdma` | RDMA verbs via libibverbs over RoCE or InfiniBand (client/server model). |
| Socket | `socket` | Linux kernel sockets (UDP/TCP), plus a RoCE path that delegates to the RDMA backend. Selecting `socket` automatically builds `rdma`. |
| DPDK | `stream_type: "raw"` (optional `engine: "dpdk"`) | Userspace packet processing with DPDK mbufs and rings. |
| Socket | `stream_type: "socket"` with `tcp://` or `udp://` endpoints | Linux kernel sockets for TCP/UDP. |
| RDMA | `stream_type: "socket"` and `roce://` endpoints | RDMA verbs via libibverbs over RoCE or InfiniBand (client/server model). |

### Limitations

@@ -112,7 +112,7 @@ Reference material for the DAQIRI codebase:
- [Getting Started](https://nvidia.github.io/daqiri/getting-started/) — System requirements, build/install instructions, and CMake options
- [Concepts](https://nvidia.github.io/daqiri/concepts/) — Glossary of DAQIRI terminology (kernel bypass, GPUDirect, packet/burst/segment, flow/queue, memory region, zero-copy ownership, RX reorder). Meant to be opened in parallel with the rest of the docs.
- [API Guide](https://nvidia.github.io/daqiri/api-reference/) — Six-step DAQIRI application lifecycle and configuration-first model
- [Configuration YAML Reference](https://nvidia.github.io/daqiri/api-reference/configuration/) — Full YAML config reference for all backends
- [Configuration YAML Reference](https://nvidia.github.io/daqiri/api-reference/configuration/) — Full YAML config reference for all engines
- [C++ API Usage](https://nvidia.github.io/daqiri/api-reference/cpp/) — C++ RX/TX workflows, buffer lifecycle, file writing, utilities, and status codes
- [Python API Usage](https://nvidia.github.io/daqiri/api-reference/python/) — Python bindings, workflow examples, enums, config classes, and helper functions
- [Contributing](CONTRIBUTING.md) — Contribution guidelines, coding standards, DCO sign-off
47 changes: 31 additions & 16 deletions docs/api-reference/configuration.md
@@ -30,11 +30,12 @@ These settings apply globally to both TX and RX:
- **`stream_type`**: Packet I/O stream class.
- type: `string`
- values: `raw`, `socket`
- **`protocol`**: Socket stream protocol. Required when `stream_type: "socket"` and invalid
for `stream_type: "raw"`.
- **`engine`**: Optional implementation engine for the selected stream type. Omit this
unless you need a specific implementation override. RoCE configs infer `ibverbs`
from `roce://` endpoint URIs by default.
- type: `string`
- values: `tcp`, `udp`, `roce`
- **`log_level`**: Backend log level.
- values: `dpdk`, `socket`, `ibverbs`
- **`log_level`**: Engine log level.
- type: `string`
- values: `trace`, `debug`, `info`, `warn` (default), `error`, `critical`, `off`
- **`loopback`**: Enable software loopback for testing without a physical link.
@@ -111,21 +112,37 @@ memory_regions:
- **`name`**: Interface name. Used to look up port IDs at runtime via `get_port_id()`.
- type: `string`
- **`address`**: PCIe BDF address (from `lspci`) or Linux interface name for Raw Ethernet
(`stream_type: "raw"`), or IP address for RoCE (`stream_type: "socket"`,
`protocol: "roce"`).
(`stream_type: "raw"`), or IP address for RoCE (`stream_type: "socket"` with a
`roce://` endpoint).
- type: `string`

### RDMA Configuration
### Socket and RDMA Endpoint Configuration

When using RDMA, set `stream_type: "socket"` and `protocol: "roce"`. Each interface then
uses a `socket_config` block for endpoint role/addressing plus a `roce_config` block for
RDMA transport settings:
Socket-style streams use a `socket_config` block for endpoint role and addressing.
Endpoint addresses are URI strings. Supported schemes are `tcp://`, `udp://`, and
`roce://` (`rdma://` is still accepted as a legacy alias).

- **`socket_config.mode`**: Connection role.
- type: `string`
- values: `client`, `server`
- **`socket_config.local_ip`** / **`socket_config.local_port`**: Server bind address/port.
- **`socket_config.remote_ip`** / **`socket_config.remote_port`**: Client peer address/port.
- **`socket_config.local_addr`**: Local bind endpoint, for example
`tcp://127.0.0.1:6001`, `roce://10.100.3.1:4096`, or
`roce://10.100.1.1` for a RoCE client whose source port is chosen by RDMA CM.
Required for server mode and RoCE client mode.
- **`socket_config.remote_addr`**: Remote peer endpoint, for example
`udp://10.250.0.2:5021`. Required for TCP/UDP client mode. RoCE clients choose
the peer in application code (for example by calling `rdma_connect_to_server`),
not in DAQIRI config.
- **`socket_config.local_ip`** / **`socket_config.local_port`** and
**`socket_config.remote_ip`** / **`socket_config.remote_port`**: Legacy endpoint
fields accepted for older configs when a top-level engine override provides the
transport.

When using RoCE, set `stream_type: "socket"` and use `roce://` endpoint addresses
plus a `roce_config` block for transport settings. A RoCE URI may include
`?engine=ibverbs`; when omitted, `ibverbs` is the default and only supported RoCE
engine.

- **`roce_config.transport_mode`**: RDMA transport type.
- type: `string`
- values: `RC` (Reliable Connected), `UC` (Unreliable Connected)
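
The endpoint rules described in this hunk can be sketched in Python with the standard `urllib.parse` module. This is a minimal illustration under stated assumptions, not DAQIRI's parser: `parse_endpoint` and `check_socket_config` are hypothetical helpers, and the configuration is modelled as a plain dict with the documented `mode`, `local_addr`, and `remote_addr` keys.

```python
from urllib.parse import parse_qs, urlsplit

# Schemes named in the documentation; rdma:// is the legacy alias for roce://.
SUPPORTED_SCHEMES = {"tcp", "udp", "roce", "rdma"}


def parse_endpoint(uri):
    """Split an endpoint URI into (scheme, host, port, engine)."""
    parts = urlsplit(uri)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported endpoint scheme in {uri!r}")
    if not parts.hostname:
        raise ValueError(f"endpoint {uri!r} has no host")
    scheme = "roce" if parts.scheme == "rdma" else parts.scheme
    # Optional ?engine=... query parameter; None when absent.
    engine = parse_qs(parts.query).get("engine", [None])[0]
    if scheme == "roce":
        engine = engine or "ibverbs"  # documented default for RoCE
        if engine != "ibverbs":
            raise ValueError("ibverbs is the only supported RoCE engine")
    # parts.port is None when the URI omits the port (RoCE client case).
    return scheme, parts.hostname, parts.port, engine


def check_socket_config(cfg):
    """Return a list of problems for a socket_config-like dict (empty if none)."""
    mode = cfg.get("mode")
    if mode not in ("client", "server"):
        return [f"mode must be 'client' or 'server', got {mode!r}"]
    local, remote = cfg.get("local_addr"), cfg.get("remote_addr")
    schemes = {parse_endpoint(addr)[0] for addr in (local, remote) if addr}
    problems = []
    if mode == "server" and not local:
        problems.append("server mode requires local_addr")
    if mode == "client":
        if not schemes:
            problems.append("client mode requires an endpoint URI")
        if "roce" in schemes and not local:
            problems.append("RoCE client mode requires local_addr")
        if schemes & {"tcp", "udp"} and not remote:
            problems.append("TCP/UDP client mode requires remote_addr")
    return problems


if __name__ == "__main__":
    print(parse_endpoint("roce://10.100.3.1:4096"))
    print(check_socket_config({"mode": "client", "local_addr": "tcp://127.0.0.1:6001"}))
```

Returning a list of problems rather than raising on the first one is a design choice for the sketch: it lets a config check report every missing field in one pass.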
@@ -243,7 +260,7 @@ v1 batch-size requirement:

- **`name`**: Reorder config name. Must be unique per interface.
- type: `string`
- **`reorder_type`**: Reorder backend implementation.
- **`reorder_type`**: Reorder implementation.
- type: `string`
- values: `gpu`, `cpu`
- **`memory_region`**: Output memory region where reordered payload is written.
@@ -406,7 +423,6 @@ daqiri:
cfg:
version: 1
stream_type: "socket"
protocol: "roce"
master_core: 3
debug: false
log_level: "info"
@@ -428,8 +444,7 @@ daqiri:
address: 10.100.3.1
socket_config:
mode: server
local_ip: 10.100.3.1
local_port: 4096
local_addr: "roce://10.100.3.1:4096"
roce_config:
transport_mode: RC
rx:
4 changes: 2 additions & 2 deletions docs/api-reference/index.md
@@ -20,7 +20,7 @@ For the terminology and conceptual background it relies on

A DAQIRI application starts from a YAML configuration file (or an
equivalent `NetworkConfig` struct built in code). The configuration
defines the active stream type and protocol, NIC interfaces, RX and TX
defines the active stream type, optional engine, endpoint URIs, NIC interfaces, RX and TX
queues, memory regions, flow steering rules, flow isolation,
header-data split, and optional reorder plans. After initialization,
the language API operates on those configured ports, queues, buffers,
@@ -30,7 +30,7 @@ The language APIs do **not** discover queues, memory, or flow steering
rules on their own. They are runtime handles over the topology declared
in the configuration (YAML file or `NetworkConfig` struct). The
configuration is the source of truth for queue IDs, memory placement,
stream-type / protocol selection, and flow routing.
stream-type / engine / endpoint selection, and flow routing.

The configuration schema lives in the
[Configuration YAML Reference](configuration.md). For an annotated
13 changes: 5 additions & 8 deletions docs/api-reference/python.md
@@ -132,10 +132,10 @@ Parse without starting the manager:
```python
status, parsed = daqiri.parse_network_config("daqiri_bench_raw_tx_rx.yaml")
if status == daqiri.Status.SUCCESS:
print(parsed.common.manager_type)
print(parsed.common.engine)
```

If GPU RX `reorder_configs` are configured for the DPDK backend, set one CUDA
If GPU RX `reorder_configs` are configured for the DPDK engine, set one CUDA
stream per GPU reorder plan before pulling reordered bursts. Pass the CUDA
stream as an integer address; pass `0` to use the default stream. See the
[Configuration YAML Reference](configuration.md#rx-reorder-configs)
@@ -198,7 +198,7 @@ flow = daqiri.get_packet_flow_id(burst, idx)
status, rx_ts_ns = daqiri.get_packet_rx_timestamp(burst, idx)
```

RX hardware timestamps are available only when the DPDK backend is configured
RX hardware timestamps are available only when the DPDK engine is configured
with `rx.hardware_timestamps: true` and the NIC supports
`RTE_ETH_RX_OFFLOAD_TIMESTAMP`. See
[C++ API Usage → Receiving Packets](cpp.md#receiving-packets) for the clock
@@ -491,7 +491,6 @@ The workflow sections above show the common call order and ownership rules.
| --- | --- |
| `manager_type_from_string(str)` / `manager_type_to_string(type)` | Convert manager types. |
| `stream_type_from_string(str)` / `stream_type_to_string(type)` | Convert stream types. |
| `socket_protocol_from_string(str)` / `socket_protocol_to_string(protocol)` | Convert socket protocols. |
| `reorder_data_type_from_string(str)` / `reorder_data_type_to_string(type)` | Convert reorder data types. |
| `reorder_endianness_from_string(str)` / `reorder_endianness_to_string(endianness)` | Convert reorder endianness values. |
| `log_level_from_string(str)` / `log_level_to_string(level)` | Convert log levels. |
@@ -599,7 +598,6 @@ The workflow sections above show the common call order and ownership rules.
| `MEM_ACCESS_LOCAL` | Local memory access flag. |
| `MEM_ACCESS_RDMA_WRITE` | RDMA write memory access flag. |
| `MEM_ACCESS_RDMA_READ` | RDMA read memory access flag. |
| `IPPROTO_UDP` | UDP IP protocol number used by header helpers. |

## Enums

@@ -613,7 +611,6 @@ The workflow sections above show the common call order and ownership rules.
| `BufferLocation` | `CPU`, `GPU`, `CPU_GPU_SPLIT` |
| `MemoryKind` | `HOST`, `HOST_PINNED`, `HUGE`, `DEVICE`, `INVALID` |
| `StreamType` | `RAW`, `SOCKET`, `INVALID` |
| `SocketProtocol` | `TCP`, `UDP`, `ROCE`, `INVALID` |
| `LoopbackType` | `DISABLED`, `LOOPBACK_TYPE_SW` |
| `RDMAMode` | `CLIENT`, `SERVER`, `INVALID` |
| `RDMATransportMode` | `RC`, `UC`, `UD`, `INVALID` |
@@ -641,7 +638,7 @@ names that mostly omit the trailing underscore from the C++ member name (e.g.
| `BurstHeaderParams` | Burst metadata: packet count, port, queue, segment count, byte totals, and reorder flags. |
| `ReorderBurstInfo` | Metadata for reordered aggregate bursts. |
| `NetworkConfig` | Top-level parsed DAQIRI configuration. |
| `CommonConfig` | Global manager, direction, stream, protocol, loopback, and core settings. |
| `CommonConfig` | Global manager, engine, direction, stream, loopback, and core settings. |
| `InterfaceConfig` | Per-interface address, socket/RoCE/RDMA, RX, and TX configuration. |
| `RxConfig` | RX flow isolation, timestamps, queues, flows, flex items, and reorder configs. |
| `TxConfig` | TX accurate-send flag, queues, and flows. |
@@ -654,7 +651,7 @@ names that mostly omit the trailing underscore from the C++ member name (e.g.
| `FlowConfig` | Named flow rule combining action and match. |
| `FlexItemConfig` | Flexible parser item configuration. |
| `FlexItemMatch` | Flexible parser match value and mask. |
| `SocketConfig` | Socket client/server endpoint and timing settings. |
| `SocketConfig` | Socket client/server endpoint URI, legacy IP/port, and timing settings. |
| `RoCEConfig` | RoCE transport settings. |
| `RDMAConfig` | RDMA mode, transport mode, and port. |
| `ReorderConfig` | Reorder name, type, memory region, payload offset, flows, method, and data type conversion. |
18 changes: 9 additions & 9 deletions docs/benchmarks/benchmarks.md
@@ -5,31 +5,31 @@ hide:

# Benchmarking

DAQIRI ships with several backends to handle different types of incoming and outgoing streams. Choosing the stream type depends on the type of sensor being used and its capabilities. The `stream_type` is decided from the decision tree below:
DAQIRI ships with several stream implementations for incoming and outgoing data. Choose `stream_type` for the stream family, then use endpoint URI schemes when that family has multiple implementations.

![DAQIRI networking backend decision tree](../images/backend-decision-tree.svg)
![DAQIRI networking engine decision tree](../images/backend-decision-tree.svg)

## Choose a backend
## Choose a Stream

| Use case | DAQIRI config | Benchmark | Start here |
|---|---|---|---|
| Ingest from or egress to a programmable PCIe sensor, such as an FPGA on the PCIe bus. | `stream_type: "pcie"` | Coming soon | PCIe benchmarking docs are coming soon. |
| Compare against normal Linux networking, run on a non-NVIDIA NIC, or test a peer that speaks TCP/UDP sockets. | `stream_type: "socket"` with `protocol: "tcp"` or `protocol: "udp"` | `daqiri_bench_socket` | [Socket and RDMA Benchmarking](socket_benchmarking.md) |
| Test a peer that already implements RDMA verbs over RoCE. | `stream_type: "socket"` with `protocol: "roce"` | `daqiri_bench_rdma` | [Socket and RDMA Benchmarking](socket_benchmarking.md#run-the-rdma-roce-benchmark) |
| Compare against normal Linux networking, run on a non-NVIDIA NIC, or test a peer that speaks TCP/UDP sockets. | `stream_type: "socket"` with `tcp://` or `udp://` endpoints | `daqiri_bench_socket` | [Socket and RDMA Benchmarking](socket_benchmarking.md) |
| Test a peer that already implements RDMA verbs over RoCE. | `stream_type: "socket"` and `roce://` endpoints | `daqiri_bench_rdma` | [Socket and RDMA Benchmarking](socket_benchmarking.md#run-the-rdma-roce-benchmark) |
| Drive raw Ethernet packets directly from an NVIDIA NIC under DAQIRI control. | `stream_type: "raw"` | `daqiri_bench_raw_gpudirect` and the other `raw_*` benches | [Raw Ethernet Benchmarking](raw_benchmarking.md) |

!!! note "PCIe backend status"
!!! note "PCIe engine status"

The PCIe programmable-sensor path is under development. Once completed, it will allow third-party PCIe devices
to read from and write to the GPU's BAR1 memory.

!!! note "Why RDMA is listed under socket"

The RoCE benchmark uses the connection-oriented socket/RDMA configuration model. The executable is named `daqiri_bench_rdma` to show the RDMA-specific API calls.
The RoCE benchmark uses the connection-oriented socket/RDMA configuration model. The executable is named `daqiri_bench_rdma` to show the RDMA-specific API calls.

## Common benchmark workflow

1. Build the examples with the backend you plan to test. The default container build enables all three:
1. Build the examples with the managers you plan to test. The default container build enables all three current managers:

```bash
BASE_TARGET=dpdk DAQIRI_MGR="dpdk socket rdma" scripts/build-container.sh
@@ -47,7 +47,7 @@ DAQIRI ships with several backends to handle different types of incoming and out

- [Socket and RDMA Benchmarking](socket_benchmarking.md) covers Linux TCP/UDP and RoCE/RDMA runs with matching client/server namespace setup.
- [Raw Ethernet Benchmarking](raw_benchmarking.md) covers the DPDK/raw Ethernet examples, hugepage sizing, physical loopback configuration, and raw benchmark troubleshooting.
- [Understanding the Configuration File](../tutorials/configuration-walkthrough.md) explains the YAML fields once you have selected the backend and example config.
- [Understanding the Configuration File](../tutorials/configuration-walkthrough.md) explains the YAML fields once you have selected the stream and example config.

---
**Previous:** [System Configuration](../tutorials/system_configuration.md)<br>
2 changes: 1 addition & 1 deletion docs/benchmarks/raw_benchmarking.md
@@ -9,7 +9,7 @@ DAQIRI provides raw Ethernet benchmark applications that use DPDK to drive an NV

Make sure to [build the DAQIRI library](../getting-started.md#build-the-daqiri-library) beforehand.

**Not sure which backend to benchmark?** Start with the [Benchmarking overview](benchmarks.md). Use this page after you have chosen the raw Ethernet backend. Use [Socket and RDMA Benchmarking](socket_benchmarking.md) for TCP, UDP, and RoCE/RDMA runs.
**Not sure which engine to benchmark?** Start with the [Benchmarking overview](benchmarks.md). Use this page after you have chosen Raw Ethernet. Use [Socket and RDMA Benchmarking](socket_benchmarking.md) for TCP, UDP, and RoCE/RDMA runs.

!!! note "Prerequisites"
