docs/sphinx/examples_rst/qec/realtime_decoding.rst (18 additions & 72 deletions)
@@ -11,12 +11,12 @@ The real-time decoding framework supports two primary deployment scenarios:
 Key Features
 ------------
 
-* **Low-Latency Decoding**: Syndrome processing and correction calculation within coherence time constraints
-* **Streaming Syndrome Interface**: Continuous syndrome enqueueing from quantum circuits
-* **Multiple Decoder Support**: Concurrent management of multiple logical qubits, each with independent decoder instances
-* **Flexible Configuration**: YAML-based decoder configuration supporting various decoder types and parameters
-* **Device-Agnostic API**: Unified API that works across simulation and hardware backends
-* **GPU Acceleration**: Leverages CUDA for high-performance syndrome decoding
+* **Low-Latency Decoding**: Syndrome processing and correction calculation within coherence time constraints.
+* **Streaming Syndrome Interface**: Continuous syndrome enqueueing from quantum circuits.
+* **Multiple Decoder Support**: Concurrent management of multiple logical qubits, each with independent decoder instances.
+* **Flexible Configuration**: YAML-based decoder configuration supporting various decoder types and parameters.
+* **Device-Agnostic API**: Unified API that works across simulation and hardware backends.
+* **GPU Acceleration**: Leverages CUDA for high-performance syndrome decoding.
 
 Workflow Overview
 -----------------
@@ -64,7 +64,7 @@ The examples above showcase the main components of the real-time decoding workfl
 
 - Decoder finalization: Frees up resources after circuit execution.
 
-The API is designed to be called from within quantum kernels (marked with ``@cudaq.kernel`` in Python or ``__qpu__`` in C++). The runtime automatically routes these calls to the appropriate backend—whether a simulation environment on your local machine or a low-latency connection to quantum hardware. The API is device-agnostic, so the same kernel code works across different deployment scenarios.
+The API is designed to be called from within quantum kernels (marked with ``@cudaq.kernel`` in Python or ``__qpu__`` in C++). The runtime automatically routes these calls to the appropriate backend—whether a simulation environment on the local machine or a low-latency connection to quantum hardware. The API is device-agnostic, so the same kernel code works across different deployment scenarios.
 
 The user is required to provide a configuration file or generate one if it is not present. The generation process depends on the decoder type and the detector error model studied in other sections of the documentation. Moreover, the user must write an appropriate kernel that describes the correct syndrome extraction and correction application logic.
 
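To make the configuration-file requirement above concrete, here is a sketch of what such a file might look like. The layout and field names are hypothetical: they simply mirror the ``decoder_config`` attributes used elsewhere on this page (``id``, ``type``, ``block_size``, ``syndrome_size``, ``lut_error_depth``), not a documented schema. In practice the file should come from the generation process described in the referenced sections.

```yaml
# Hypothetical sketch only: field names mirror the decoder_config
# attributes shown elsewhere on this page, not a documented schema.
decoders:
  - id: 0                     # one entry per logical qubit / decoder instance
    type: multi_error_lut     # decoder type selects the backend implementation
    block_size: 10            # number of error mechanisms (columns of H)
    syndrome_size: 5          # number of detectors (rows of H)
    custom_args:
      lut_error_depth: 2      # maximum simultaneous errors in the lookup table
```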
@@ -250,7 +250,7 @@ With decoders configured and initialized, they can be used within quantum kernel
 
 These functions are designed to be called from within quantum kernels (marked with ``@cudaq.kernel`` in Python or ``__qpu__`` in C++). The runtime automatically routes these calls to the appropriate backend - whether that is a simulation environment on the local machine or a low-latency connection to quantum hardware. The API is device-agnostic, so the same kernel code works across different deployment scenarios.
 
-The typical pattern is: reset the decoder at the start of each shot, enqueue syndromes after each stabilizer measurement round, then get corrections before measuring the logical observables. Decoders process syndromes asynchronously, so by the time ``get_corrections`` is called, the decoder has usually finished its analysis. If decoding takes longer than expected, ``get_corrections`` will block until results are available.
+The typical procedure is: reset the decoder at the start of each shot, enqueue syndromes after each stabilizer measurement round, then get corrections before measuring the logical observables. Decoders process syndromes asynchronously, so by the time ``get_corrections`` is called, the decoder has usually finished its analysis. If decoding takes longer than expected, ``get_corrections`` will block until results are available.
 
 Here is how to use the real-time decoding API in quantum kernels:
 
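The reset/enqueue/get-corrections loop above relies on the decoder running asynchronously while the kernel keeps executing. As a library-free illustration of that producer/consumer behaviour (plain Python, no CUDA-Q APIs; the "decoder" here is a stand-in that just XOR-accumulates syndrome rounds):

```python
import queue
import threading

syndromes = queue.Queue()    # syndrome rounds streamed in by the "kernel"
corrections = queue.Queue()  # results produced by the "decoder"

def decoder_worker():
    # Stand-in for the real decoder: XOR-accumulate the syndrome bits.
    acc = 0
    for _ in range(3):             # three stabilizer measurement rounds
        acc ^= syndromes.get()     # blocks until a round is enqueued
    corrections.put(acc)           # publish the "correction"

t = threading.Thread(target=decoder_worker)
t.start()

for round_bits in (0b101, 0b011, 0b110):
    syndromes.put(round_bits)      # analogue of enqueueing syndromes

result = corrections.get()         # analogue of get_corrections: blocks if needed
t.join()
print(result)                      # 0b101 ^ 0b011 ^ 0b110 = 0
```

As in the real API, the consumer only blocks when the decoder has not finished yet; otherwise the result is already waiting in the queue.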
@@ -363,7 +363,7 @@ For most practical scenarios with distance-5 to distance-9 codes and error rates
 This decoder works well up to moderate code distances because the lookup table size scales combinatorially with the number of error locations and the error depth. Beyond distance 9, or when higher error rates need to be handled, belief propagation decoders like the NV-QLDPC decoder should be considered.
 
 * **Best for**: Small to medium codes (distance 5-9), moderate error rates (0.1-1%), good balance of speed and accuracy
-* **Parameters**:
+* **Configuration Parameters**:
 
   * ``lut_error_depth`` (int): Maximum number of simultaneous errors to consider (typically 2-3). Higher values improve accuracy but increase memory usage.
 
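The combinatorial growth mentioned above is easy to quantify: a table covering every pattern of at most ``lut_error_depth`` simultaneous errors over ``n`` error mechanisms needs roughly the sum of C(n, k) for k up to the depth. A quick check with illustrative mechanism counts (not tied to any specific code):

```python
from math import comb

def lut_entries(n_mechanisms: int, depth: int) -> int:
    # Number of error patterns with 0..depth simultaneous errors.
    return sum(comb(n_mechanisms, k) for k in range(depth + 1))

for n in (100, 500, 2000):   # illustrative error-mechanism counts
    print(n, lut_entries(n, 2), lut_entries(n, 3))
# Depth 3 at n=2000 already exceeds a billion entries, which is why
# belief propagation decoders take over beyond moderate distances.
```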
@@ -393,7 +393,7 @@ This decoder excels when working with codes beyond distance 9, where lookup tabl
 The decoder offers extensive tunability. The number of BP iterations can be adjusted to trade off latency for accuracy, the user can choose between sum-product and min-sum BP variants, and OSD search depth can be controlled. For real-time applications, conservative settings (50 iterations, OSD order 7) are a good starting point, with tuning based on observed error rates and latency requirements.
 
 * **Best for**: Medium to large codes (distance ≥ 7), moderate to high error rates, scenarios where GPU acceleration is available
-* **Key Parameters**:
+* **Configuration Parameters**:
 
   * ``error_rate_vec`` (list/vector of floats): Per-mechanism error probabilities - crucial for BP convergence. These should match the DEM's error rates.
   * ``max_iterations`` (int): Maximum BP iterations (typically 50-100). More iterations improve accuracy but increase latency.
@@ -432,7 +432,7 @@ and then combining the results to form a global correction.
 This approach reduces memory and computational requirements while still capturing most local error correlations.
 
 * **Best for**: Very long circuits, memory-constrained systems
-* **Parameters**:
+* **Configuration Parameters**:
 
   * ``window_size``: Number of rounds per window
   * ``step_size``: Window advancement (equals window_size for non-overlapping)
@@ -782,7 +782,7 @@ Given that the user follows the structure of the examples provided, where each e
-The installation can be verified with this minimal test:
-
-.. tab:: Python
-
-   .. code-block:: python
-
-      import os
-      os.environ["CUDAQ_DEFAULT_SIMULATOR"] = "stim"
-
-      import cudaq
-      import cudaq_qec as qec
-
-      # Test decoder configuration
-      print("Testing real-time decoding setup...")
-
-      # Create minimal decoder config
-      config = qec.decoder_config()
-      config.id = 0
-      config.type = "multi_error_lut"
-      config.block_size = 10
-      config.syndrome_size = 5
-      config.H_sparse = [0, 1, -1, 1, 2, -1]  # Minimal test data
-      config.O_sparse = [0, -1]
-      config.D_sparse = [0, -1]
-
-      lut_config = qec.multi_error_lut_config()
-      lut_config.lut_error_depth = 1
-      config.set_decoder_custom_args(lut_config)
-
-      multi_config = qec.multi_decoder_config()
-      multi_config.decoders = [config]
-
-      status = qec.configure_decoders(multi_config)
-      print(f"Configuration status: {status}")
-
-      qec.finalize_decoders()
-      print("Setup verified!")
-
-.. tab:: C++
-
-   .. code-block:: bash
-
-      # Compile test
-      nvq++ --target stim test_setup.cpp \
-          -lcudaq-qec \
-          -lcudaq-qec-realtime-decoding \
-          -lcudaq-qec-realtime-decoding-simulation
-
-      # Run
-      ./a.out
-
-If the test completes without errors, the setup is ready for real-time decoding experiments.
 
 Best Practices
 --------------
@@ -891,6 +836,7 @@ Successfully deploying real-time decoding requires attention to several key deta
 
 Decoder Selection
 ^^^^^^^^^^^^^^^^^
+The page `CUDA-Q QEC Decoders <https://nvidia.github.io/cudaqx/components/qec/introduction.html#pre-built-qec-decoders>`_ provides initial guidance on how to choose the right decoder for the target application.
 
 Choosing the right decoder is crucial for balancing accuracy, latency, and resource usage. The decision depends on multiple factors: the quantum code's distance, expected physical error rates, available computational resources, and latency requirements. This table provides initial guidance, but validation with the specific workload is always recommended: