
Conversation

@AlexanderGrissik
Collaborator

Description

Based on PR 321.
Correct the cleanup of special WQEs.

What

Fix possible issues and syndromes due to incorrect WQEBB cleanup.

Why?

Functionality.

How?

Always nullify the whole first WQEBB.

Change type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Tests
  • Other

Check list

  • Code follows the de facto style guidelines of this project
  • Comments have been inserted in hard to understand places
  • Documentation has been updated (if necessary)
  • Test has been added (if possible)

XLIO does a prefetch that zeroes the next WQEBB after a doorbell. It
also presets inline_hdr_sz as legacy behavior.

However, in a corner case where the producer fills the tail WQE while the
consumer is still processing the head WQE, the prefetch code corrupts the
head WQE.

Remove the prefetch to avoid the SQ corruption, and clear each WQE before
filling it.
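A minimal sketch of the resulting pattern, assuming WQEBB is the 64-byte basic-block size; the helper name is hypothetical, and the real call sites simply inline the memset as in the excerpts further down:

#include <algorithm>
#include <cstddef>
#include <cstring>

// Sketch only (assumed names): zero the full 64B basic block before building a
// special WQE, instead of relying on a post-doorbell prefetch/zeroing pass.
constexpr std::size_t WQEBB = 64;

template <typename WQE>
static WQE *begin_special_wqe(void *sq_wqe_hot)
{
    // Clear at least one WQEBB, or the whole WQE when it is larger than 64B.
    std::memset(sq_wqe_hot, 0, std::max<std::size_t>(WQEBB, sizeof(WQE)));
    return static_cast<WQE *>(sq_wqe_hot);
}

struct example_wqe { unsigned char raw[32]; }; // stand-in for a 32B special WQE

int main()
{
    alignas(64) unsigned char sq_slot[WQEBB];
    std::memset(sq_slot, 0x5a, sizeof(sq_slot)); // simulate stale WQE contents
    example_wqe *wqe = begin_special_wqe<example_wqe>(sq_slot);
    (void)wqe; // the whole 64B slot is zeroed, not just the 32B structure
    return 0;
}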

Signed-off-by: Dmytro Podgornyi <[email protected]>
@AlexanderGrissik AlexanderGrissik requested a review from pasis March 20, 2025 13:39
@galnoam
Collaborator

galnoam commented Mar 24, 2025

bot:retest

@galnoam galnoam added the postponed Postponed for further decisions label Mar 24, 2025
@galnoam
Collaborator

galnoam commented Apr 14, 2025

@pasis, can you review when you have free time?

auto *cseg = wqebb_get<xlio_mlx5_wqe_ctrl_seg *>(0U);
auto *ucseg = wqebb_get<xlio_mlx5_wqe_umr_ctrl_seg *>(0U, sizeof(*cseg));

memset(cseg, 0, sizeof(mlx5_eth_wqe));
Member

This is an extra memset, since nvme_fill_static_params_control() clears all 64 bytes inside. Hiding the memset inside the function seems confusing, so I'd prefer to either remove the extra memset or improve the solution consistently.

Collaborator Author

Done

void hw_queue_tx::nvme_set_progress_context(xlio_tis *tis, uint32_t tcp_seqno)
{
    auto *wqe = reinterpret_cast<mlx5e_set_nvmeotcp_progress_params_wqe *>(m_sq_wqe_hot);
    memset(wqe, 0, sizeof(mlx5_eth_wqe));
Member

Similarly to the nvme_set_static_context() comment, nvme_fill_progress_wqe() clears the WQE inside. However, in this case, the WQE is 32 bytes. I'd suggest:

  • remove the memset inside the nested function
  • replace sizeof(mlx5_eth_wqe) with the WQEBB constant. An Ethernet WQE can be confusing in the case of non-send operations. For more readability, this can even be constexpr max<size_t>(WQEBB, sizeof(struct_name)) - it is expected to be optimized at compile time (a sketch follows below).
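
A minimal sketch of the suggested compile-time constant (the WQEBB value and helper name here are assumptions, not part of the patch):

#include <algorithm>
#include <cstddef>

constexpr std::size_t WQEBB = 64; // assumed MLX5 send WQE basic-block size

// Hypothetical helper: clear at least one WQEBB, or the whole WQE if the
// structure is larger. std::max is constexpr since C++14, so the size folds
// to a constant at compile time.
template <typename WQE>
constexpr std::size_t wqe_clear_size()
{
    return std::max<std::size_t>(WQEBB, sizeof(WQE));
}

struct wqe_32b_stub { unsigned char raw[32]; }; // stand-in for a 32B special WQE
static_assert(wqe_clear_size<wqe_32b_stub>() == WQEBB,
              "a 32B special WQE still clears one full 64B WQEBB");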

Collaborator Author

Done

is_tx ? MLX5_OPC_MOD_TLS_TIS_PROGRESS_PARAMS : MLX5_OPC_MOD_TLS_TIR_PROGRESS_PARAMS;

memset(wqe, 0, sizeof(*wqe));
memset(wqe, 0, sizeof(mlx5_eth_wqe));
Member

Replace sizeof(mlx5_eth_wqe) with either WQEBB or max<size_t>(WQEBB, sizeof(mlx5_set_tls_progress_params_wqe)). See the comment above. Repeat the same change in all similar places.

Collaborator Author

Done

Nullify the whole WQEBB even if the special WQE is less than 64B.

Signed-off-by: Alexander Grissik <[email protected]>
@AlexanderGrissik AlexanderGrissik removed the postponed Postponed for further decisions label May 5, 2025
@AlexanderGrissik
Collaborator Author

bot:retest

@galnoam
Collaborator

galnoam commented May 8, 2025

/review

@pr-review-bot-app

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

The use of std::max<size_t> in the memset calls (e.g., lines 974-975, 980, 983, etc.) should be reviewed to ensure that the size calculations are correct and do not introduce unintended behavior or memory corruption.

memset(cseg, 0,
       std::max<size_t>(WQEBB,
                        sizeof(xlio_mlx5_wqe_ctrl_seg) + sizeof(xlio_mlx5_wqe_umr_ctrl_seg)));

nvme_fill_static_params_control(cseg, ucseg, m_sq_wqe_counter, m_mlx5_qp.qpn, tis->get_tisn(),
                                0);
memset(wqebb_get<void *>(1U), 0, std::max<size_t>(WQEBB, sizeof(mlx5_mkey_seg)));

auto *params = wqebb_get<mlx5_wqe_transport_static_params_seg *>(2U);
memset(params, 0, std::max<size_t>(WQEBB, sizeof(mlx5_wqe_transport_static_params_seg)));
nvme_fill_static_params_transport_params(params, config);
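
One way to make that check mechanical is a standalone sketch like the one below (not part of the patch; the structs are hypothetical stand-ins sized as described elsewhere in this PR: 16B ctrl, 48B UMR ctrl, 64B transport static params):

#include <algorithm>
#include <cstddef>

constexpr std::size_t WQEBB = 64;

// Hypothetical stand-ins for the mlx5 segment layouts referenced above.
struct ctrl_seg_stub { unsigned char raw[16]; };
struct umr_ctrl_seg_stub { unsigned char raw[48]; };
struct transport_static_params_stub { unsigned char raw[64]; };

// The control + UMR control segments must fit in the first WQEBB...
static_assert(sizeof(ctrl_seg_stub) + sizeof(umr_ctrl_seg_stub) <= WQEBB,
              "ctrl + UMR ctrl segments overflow one WQEBB");
// ...and each max(WQEBB, sizeof(seg)) clear must stay within the single WQEBB
// reserved for that segment.
static_assert(std::max<std::size_t>(WQEBB, sizeof(transport_static_params_stub)) == WQEBB,
              "transport static params clear spills past its WQEBB");

int main() { return 0; }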
Performance Concern

The frequent use of memset to zero out memory (e.g., lines 845, 993, 1351, etc.) may have performance implications. Consider whether this can be optimized or avoided in performance-critical paths.

    memset(m_sq_wqe_hot, 0, sizeof(mlx5_eth_wqe));

    /* Configure ctrl segment
     * qpn_ds or ctrl.data[1] is set inside fill_wqe()
     */
    ctrl->opmod_idx_opcode = htonl(((m_sq_wqe_counter & 0xffff) << 8) |
                                   (get_mlx5_opcode(xlio_send_wr_opcode(*p_send_wqe)) & 0xff));
    m_sq_wqe_hot->ctrl.data[2] = 0;
    ctrl->fm_ce_se = (request_comp ? (uint8_t)MLX5_WQE_CTRL_CQ_UPDATE : 0);
    ctrl->tis_tir_num = htobe32(tisn << 8);

    /* Configure eth segment
     * reset rsvd0, cs_flags, rsvd1, mss and rsvd2 fields
     * checksum flags are set here
     */
    *((uint64_t *)eseg) = 0;
    eseg->rsvd2 = 0;
    eseg->cs_flags = (uint8_t)(attr & (XLIO_TX_PACKET_L3_CSUM | XLIO_TX_PACKET_L4_CSUM) & 0xff);

    /* Store buffer descriptor */
    store_current_wqe_prop(reinterpret_cast<mem_buf_desc_t *>(p_send_wqe->wr_id), credits, tis);

    /* Complete WQE */
    int wqebbs = fill_wqe(p_send_wqe);
    assert(wqebbs > 0 && (unsigned)wqebbs <= credits);
    NOT_IN_USE(wqebbs);

    update_next_wqe_hot();

    hwqtx_logfunc(
        "m_sq_wqe_hot: %p m_sq_wqe_hot_index: %d wqe_counter: %d new_hot_index: %d wr_id: %llx",
        m_sq_wqe_hot, m_sq_wqe_hot_index, m_sq_wqe_counter, (m_sq_wqe_counter & (m_tx_num_wr - 1)),
        p_send_wqe->wr_id);
}

std::unique_ptr<xlio_tis> hw_queue_tx::create_tis(uint32_t flags)
{
    dpcp::adapter *adapter = m_p_ib_ctx_handler->get_dpcp_adapter();
    bool is_tls = flags & dpcp::TIS_ATTR_TLS, is_nvme = flags & dpcp::TIS_ATTR_NVMEOTCP;
    if (unlikely(!adapter || (is_tls && is_nvme))) {
        return nullptr;
    }

    dpcp::tis::attr tis_attr = {
        .flags = flags,
        .tls_en = is_tls,
        .nvmeotcp = is_nvme,
        .transport_domain = adapter->get_td(),
        .pd = adapter->get_pd(),
    };

    dpcp::tis *dpcp_tis = nullptr;
    if (unlikely(adapter->create_tis(tis_attr, dpcp_tis) != dpcp::DPCP_OK)) {
        hwqtx_logerr("Failed to create TIS with NVME enabled");
        return nullptr;
    }

    auto tis_type = is_tls ? xlio_ti::ti_type::TLS_TIS : xlio_ti::ti_type::NVME_TIS;
    return std::make_unique<xlio_tis>(this, std::unique_ptr<dpcp::tis>(dpcp_tis), tis_type);
}

static inline void nvme_fill_static_params_control(xlio_mlx5_wqe_ctrl_seg *cseg,
                                                   xlio_mlx5_wqe_umr_ctrl_seg *ucseg,
                                                   uint32_t producer_index, uint32_t qpn,
                                                   uint32_t tisn, uint8_t fence_flags)
{
    cseg->opmod_idx_opcode =
        htobe32(((producer_index & 0xffff) << 8) | MLX5_OPCODE_UMR |
                (MLX5_CTRL_SEGMENT_OPC_MOD_UMR_NVMEOTCP_TIS_STATIC_PARAMS << 24));
    size_t num_wqe_ds = 12U;
    cseg->qpn_ds = htobe32((qpn << MLX5_WQE_CTRL_QPN_SHIFT) | num_wqe_ds);
    cseg->fm_ce_se = fence_flags;
    cseg->tis_tir_num = htobe32(tisn << MLX5_WQE_CTRL_TIR_TIS_INDEX_SHIFT);

    ucseg->flags = MLX5_UMR_INLINE;
    ucseg->bsf_octowords = htobe16(MLX5E_TRANSPORT_STATIC_PARAMS_OCTWORD_SIZE);
}

static inline void nvme_fill_static_params_transport_params(
    mlx5_wqe_transport_static_params_seg *params, uint32_t config)

{
    void *ctx = params->ctx;

    DEVX_SET(transport_static_params, ctx, const_1, 1);
    DEVX_SET(transport_static_params, ctx, const_2, 2);
    DEVX_SET(transport_static_params, ctx, acc_type, MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_NVMETCP);
    DEVX_SET(transport_static_params, ctx, nvme_resync_tcp_sn, 0);
    DEVX_SET(transport_static_params, ctx, pda, static_cast<uint8_t>(config & XLIO_NVME_PDA_MASK));
    DEVX_SET(transport_static_params, ctx, ddgst_en, bool(config & XLIO_NVME_DDGST_ENABLE));
    DEVX_SET(transport_static_params, ctx, ddgst_offload_en,
             bool(config & XLIO_NVME_DDGST_OFFLOAD));
    DEVX_SET(transport_static_params, ctx, hddgst_en, bool(config & XLIO_NVME_HDGST_ENABLE));
    DEVX_SET(transport_static_params, ctx, hdgst_offload_en,
             bool(config & XLIO_NVME_HDGST_OFFLOAD));
    DEVX_SET(transport_static_params, ctx, ti, MLX5_TRANSPORT_STATIC_PARAMS_TI_INITIATOR);
    DEVX_SET(transport_static_params, ctx, const1, 1);
    DEVX_SET(transport_static_params, ctx, zero_copy_en, 0);
}

static inline void nvme_fill_progress_wqe(mlx5e_set_nvmeotcp_progress_params_wqe *wqe,
                                          uint32_t producer_index, uint32_t qpn, uint32_t tisn,
                                          uint32_t tcp_seqno, uint8_t fence_flags)
{
    auto cseg = &wqe->ctrl.ctrl;

    size_t progres_params_ds = DIV_ROUND_UP(sizeof(*wqe), MLX5_SEND_WQE_DS);
    cseg->opmod_idx_opcode =
        htobe32(((producer_index & 0xffff) << 8) | XLIO_MLX5_OPCODE_SET_PSV |
                (MLX5_CTRL_SEGMENT_OPC_MOD_UMR_NVMEOTCP_TIS_PROGRESS_PARAMS << 24));
    cseg->qpn_ds = htobe32((qpn << MLX5_WQE_CTRL_QPN_SHIFT) | progres_params_ds);
    cseg->fm_ce_se = fence_flags;

    mlx5_seg_nvmeotcp_progress_params *params = &wqe->params;
    params->tir_num = htobe32(tisn);
    void *ctx = params->ctx;

    DEVX_SET(nvmeotcp_progress_params, ctx, next_pdu_tcp_sn, tcp_seqno);
    DEVX_SET(nvmeotcp_progress_params, ctx, pdu_tracker_state,
             MLX5E_NVMEOTCP_PROGRESS_PARAMS_PDU_TRACKER_STATE_START);
    /* if (is_tx) offloading state == 0*/
    DEVX_SET(nvmeotcp_progress_params, ctx, offloading_state, 0);
}

void hw_queue_tx::nvme_set_static_context(xlio_tis *tis, uint32_t config)
{
    auto *cseg = wqebb_get<xlio_mlx5_wqe_ctrl_seg *>(0U);
    auto *ucseg = wqebb_get<xlio_mlx5_wqe_umr_ctrl_seg *>(0U, sizeof(*cseg));

    memset(cseg, 0,
           std::max<size_t>(WQEBB,
                            sizeof(xlio_mlx5_wqe_ctrl_seg) + sizeof(xlio_mlx5_wqe_umr_ctrl_seg)));

    nvme_fill_static_params_control(cseg, ucseg, m_sq_wqe_counter, m_mlx5_qp.qpn, tis->get_tisn(),
                                    0);
    memset(wqebb_get<void *>(1U), 0, std::max<size_t>(WQEBB, sizeof(mlx5_mkey_seg)));

    auto *params = wqebb_get<mlx5_wqe_transport_static_params_seg *>(2U);
    memset(params, 0, std::max<size_t>(WQEBB, sizeof(mlx5_wqe_transport_static_params_seg)));
    nvme_fill_static_params_transport_params(params, config);
    store_current_wqe_prop(nullptr, SQ_CREDITS_UMR, tis);
    ring_doorbell(MLX5E_TRANSPORT_SET_STATIC_PARAMS_WQEBBS);
    update_next_wqe_hot();
}

void hw_queue_tx::nvme_set_progress_context(xlio_tis *tis, uint32_t tcp_seqno)
{
    auto *wqe = reinterpret_cast<mlx5e_set_nvmeotcp_progress_params_wqe *>(m_sq_wqe_hot);
    memset(wqe, 0, std::max<size_t>(WQEBB, sizeof(mlx5e_set_nvmeotcp_progress_params_wqe)));
    nvme_fill_progress_wqe(wqe, m_sq_wqe_counter, m_mlx5_qp.qpn, tis->get_tisn(), tcp_seqno,
                           MLX5_FENCE_MODE_INITIATOR_SMALL);
    store_current_wqe_prop(nullptr, SQ_CREDITS_SET_PSV, tis);
    ring_doorbell(MLX5E_NVMEOTCP_PROGRESS_PARAMS_WQEBBS);
    update_next_wqe_hot();
}

#if defined(DEFINED_UTLS)
std::unique_ptr<dpcp::tls_dek> hw_queue_tx::get_new_tls_dek(const void *key,
                                                            uint32_t key_size_bytes)
{
    dpcp::tls_dek *_dek = nullptr;
    dpcp::adapter *adapter = m_p_ib_ctx_handler->get_dpcp_adapter();
    if (likely(adapter)) {
        dpcp::status status;
        struct dpcp::dek_attr dek_attr;
        memset(&dek_attr, 0, sizeof(dek_attr));
        dek_attr.key_blob = (void *)key;
        dek_attr.key_blob_size = key_size_bytes;
        dek_attr.key_size = key_size_bytes;
        dek_attr.pd_id = adapter->get_pd();
        status = adapter->create_tls_dek(dek_attr, _dek);
        if (unlikely(status != dpcp::DPCP_OK)) {
            hwqtx_logwarn("Failed to create new DEK, status: %d", status);
            if (_dek) {
                delete _dek;
                _dek = nullptr;
            }
        }
    }

    return std::unique_ptr<dpcp::tls_dek>(_dek);
}

std::unique_ptr<dpcp::tls_dek> hw_queue_tx::get_tls_dek(const void *key, uint32_t key_size_bytes)
{
    dpcp::status status;
    dpcp::adapter *adapter = m_p_ib_ctx_handler->get_dpcp_adapter();

    if (unlikely(!adapter)) {
        return std::unique_ptr<dpcp::tls_dek>(nullptr);
    }

    // If the number of available DEKs in m_tls_dek_put_cache is smaller than the
    // low watermark, we continue to create new DEKs. This is to avoid situations
    // where one DEK is returned and then fetched in a throttling manner, causing
    // too frequent crypto-sync.
    // It is also possible that crypto-sync may have a higher impact with a higher
    // number of active connections.
    if (unlikely(!m_p_ring->tls_sync_dek_supported()) ||
        (unlikely(m_tls_dek_get_cache.empty()) &&
         (m_tls_dek_put_cache.size() <= safe_mce_sys().utls_low_wmark_dek_cache_size))) {
        return get_new_tls_dek(key, key_size_bytes);
    }

    if (unlikely(m_tls_dek_get_cache.empty())) {
        hwqtx_logdbg("Empty DEK get cache. Swapping caches and do Sync-Crypto. Put-Cache size: %zu",
                     m_tls_dek_put_cache.size());

        status = adapter->sync_crypto_tls();
        if (unlikely(status != dpcp::DPCP_OK)) {
            hwqtx_logwarn("Failed to flush DEK HW cache, status: %d", status);
            return get_new_tls_dek(key, key_size_bytes);
        }

        m_tls_dek_get_cache.swap(m_tls_dek_put_cache);
    }

    std::unique_ptr<dpcp::tls_dek> out_dek(std::move(m_tls_dek_get_cache.front()));
    m_tls_dek_get_cache.pop_front();

    struct dpcp::dek_attr dek_attr;
    memset(&dek_attr, 0, sizeof(dek_attr));
    dek_attr.key_blob = const_cast<void *>(key);
    dek_attr.key_blob_size = key_size_bytes;
    dek_attr.key_size = key_size_bytes;
    dek_attr.pd_id = adapter->get_pd();
    status = out_dek->modify(dek_attr);
    if (unlikely(status != dpcp::DPCP_OK)) {
        hwqtx_logwarn("Failed to modify DEK, status: %d", status);
        out_dek.reset(nullptr);
    }

    return out_dek;
}

void hw_queue_tx::put_tls_dek(std::unique_ptr<dpcp::tls_dek> &&tls_dek_obj)
{
    if (!tls_dek_obj) {
        return;
    }
    // We don't allow unlimited DEK cache to avoid system DEK starvation.
    if (likely(m_p_ring->tls_sync_dek_supported()) &&
        m_tls_dek_put_cache.size() < safe_mce_sys().utls_high_wmark_dek_cache_size) {
        m_tls_dek_put_cache.emplace_back(std::forward<std::unique_ptr<dpcp::tls_dek>>(tls_dek_obj));
    }
}

xlio_tis *hw_queue_tx::tls_context_setup_tx(const xlio_tls_info *info)
{
    std::unique_ptr<xlio_tis> tis;
    if (m_tls_tis_cache.empty()) {
        tis = create_tis(DPCP_TIS_FLAGS | dpcp::TIS_ATTR_TLS);
        if (unlikely(!tis)) {
            return nullptr;
        }
    } else {
        tis.reset(m_tls_tis_cache.back());
        m_tls_tis_cache.pop_back();
    }

    auto dek_obj = get_tls_dek(info->key, info->key_len);
    if (unlikely(!dek_obj)) {
        m_tls_tis_cache.push_back(tis.release());
        return nullptr;
    }

    tis->assign_dek(std::move(dek_obj));
    uint32_t tisn = tis->get_tisn();

    tls_post_static_params_wqe(tis.get(), info, tisn, tis->get_dek_id(), 0, false, true);
    tls_post_progress_params_wqe(tis.get(), tisn, 0, false, true);
    /* The 1st post after TLS configuration must be with fence. */
    m_b_fence_needed = true;

    assert(!tis->m_released);

    return tis.release();
}

void hw_queue_tx::tls_context_resync_tx(const xlio_tls_info *info, xlio_tis *tis, bool skip_static)
{
    uint32_t tisn = tis->get_tisn();

    if (!skip_static) {
        tls_post_static_params_wqe(tis, info, tisn, tis->get_dek_id(), 0, true, true);
    }
    tls_post_progress_params_wqe(tis, tisn, 0, skip_static, true);
    m_b_fence_needed = true;
}

int hw_queue_tx::tls_context_setup_rx(xlio_tir *tir, const xlio_tls_info *info,
                                      uint32_t next_record_tcp_sn, xlio_comp_cb_t callback,
                                      void *callback_arg)
{
    uint32_t tirn;
    dpcp::tls_dek *_dek;
    dpcp::status status;
    dpcp::adapter *adapter = m_p_ib_ctx_handler->get_dpcp_adapter();
    struct dpcp::dek_attr dek_attr;

    memset(&dek_attr, 0, sizeof(dek_attr));
    dek_attr.key_blob = (void *)info->key;
    dek_attr.key_blob_size = info->key_len;
    dek_attr.key_size = info->key_len;
    dek_attr.pd_id = adapter->get_pd();
    status = adapter->create_tls_dek(dek_attr, _dek);
    if (unlikely(status != dpcp::DPCP_OK)) {
        hwqtx_logerr("Failed to create DEK, status: %d", status);
        return -1;
    }
    tir->assign_dek(_dek);
    tir->assign_callback(callback, callback_arg);
    tirn = tir->get_tirn();

    tls_post_static_params_wqe(NULL, info, tirn, _dek->get_key_id(), 0, false, false);
    tls_post_progress_params_wqe(tir, tirn, next_record_tcp_sn, false, false);

    assert(!tir->m_released);

    return 0;
}

void hw_queue_tx::tls_resync_rx(xlio_tir *tir, const xlio_tls_info *info, uint32_t hw_resync_tcp_sn)
{
    tls_post_static_params_wqe(tir, info, tir->get_tirn(), tir->get_dek_id(), hw_resync_tcp_sn,
                               false, false);
}

void hw_queue_tx::tls_get_progress_params_rx(xlio_tir *tir, void *buf, uint32_t lkey)
{
    /* Address must be aligned by 64. */
    assert((uintptr_t)buf == ((uintptr_t)buf >> 6U << 6U));

    tls_get_progress_params_wqe(tir, tir->get_tirn(), buf, lkey);
}

inline void hw_queue_tx::tls_fill_static_params_wqe(struct mlx5_wqe_tls_static_params_seg *params,
                                                    const struct xlio_tls_info *info,
                                                    uint32_t key_id, uint32_t resync_tcp_sn)
{
    unsigned char *initial_rn, *iv;
    uint8_t tls_version;
    uint8_t *ctx;

    ctx = params->ctx;

    iv = DEVX_ADDR_OF(tls_static_params, ctx, gcm_iv);
    initial_rn = DEVX_ADDR_OF(tls_static_params, ctx, initial_record_number);

    memcpy(iv, info->salt, TLS_AES_GCM_SALT_LEN);
    memcpy(initial_rn, info->rec_seq, TLS_AES_GCM_REC_SEQ_LEN);
    if (info->tls_version == TLS_1_3_VERSION) {
        iv = DEVX_ADDR_OF(tls_static_params, ctx, implicit_iv);
        memcpy(iv, info->iv, TLS_AES_GCM_IV_LEN);
    }

    tls_version = (info->tls_version == TLS_1_2_VERSION) ? MLX5E_STATIC_PARAMS_CONTEXT_TLS_1_2
                                                         : MLX5E_STATIC_PARAMS_CONTEXT_TLS_1_3;

    DEVX_SET(tls_static_params, ctx, tls_version, tls_version);
    DEVX_SET(tls_static_params, ctx, const_1, 1);
    DEVX_SET(tls_static_params, ctx, const_2, 2);
    DEVX_SET(tls_static_params, ctx, encryption_standard, MLX5E_ENCRYPTION_STANDARD_TLS);
    DEVX_SET(tls_static_params, ctx, resync_tcp_sn, resync_tcp_sn);
    DEVX_SET(tls_static_params, ctx, dek_index, key_id);
}

inline void hw_queue_tx::tls_post_static_params_wqe(xlio_ti *ti, const struct xlio_tls_info *info,
                                                    uint32_t tis_tir_number, uint32_t key_id,
                                                    uint32_t resync_tcp_sn, bool fence, bool is_tx)
{
    struct mlx5_set_tls_static_params_wqe *wqe =
        reinterpret_cast<struct mlx5_set_tls_static_params_wqe *>(m_sq_wqe_hot);
    struct xlio_mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl.ctrl;
    xlio_mlx5_wqe_umr_ctrl_seg *ucseg = &wqe->uctrl;
    struct mlx5_mkey_seg *mkcseg = &wqe->mkc;
    struct mlx5_wqe_tls_static_params_seg *tspseg = &wqe->params;
    uint8_t opmod = is_tx ? MLX5_OPC_MOD_TLS_TIS_STATIC_PARAMS : MLX5_OPC_MOD_TLS_TIR_STATIC_PARAMS;

#define STATIC_PARAMS_DS_CNT DIV_ROUND_UP(sizeof(*wqe), MLX5_SEND_WQE_DS)

    /*
     * SQ wrap around handling information
     *
     * UMR WQE has the size of 3 WQEBBs.
     * The following are segments sizes the WQE contains.
     *
     * UMR WQE segments sizes:
     * sizeof(wqe->ctrl) = 16[B]
     * sizeof(wqe->uctrl) = 48[B]
     * sizeof(wqe->mkc) = 64[B]
     * sizeof(wqe->params) = 64[B]
     *
     * UMR WQEBBs to segments mapping:
     * WQEBB1: [wqe->ctrl(16[B]), wqe->uctrl(48[B])] -> 64[B]
     * WQEBB2: [wqe->mkc(64[B])]                     -> 64[B]
     * WQEBB3: [wqe->params(64[B])]                  -> 64[B]
     *
     * There are 3 cases:
     *     1. There is enough room in the SQ for 3 WQEBBs:
     *        3 WQEBBs posted from m_sq_wqe_hot current location.
     *     2. There is enough room in the SQ for 2 WQEBBs:
     *        2 WQEBBs posted from m_sq_wqe_hot current location till m_sq_wqes_end.
     *        1 WQEBB posted from m_sq_wqes beginning.
     *     3. There is enough room in the SQ for 1 WQEBB:
     *        1 WQEBB posted from m_sq_wqe_hot current location till m_sq_wqes_end.
     *        2 WQEBBs posted from m_sq_wqes beginning.
     * The case of 0 WQEBBs room left in the SQ shouldn't happen, m_sq_wqe_hot wrap around handling
     * done when setting next m_sq_wqe_hot.
     *
     * In all the 3 cases, no need to change cseg and ucseg pointers, since they fit to
     * one WQEBB and will be posted before m_sq_wqes_end.
     */

    memset(m_sq_wqe_hot, 0, sizeof(*m_sq_wqe_hot));
    cseg->opmod_idx_opcode =
        htobe32(((m_sq_wqe_counter & 0xffff) << 8) | MLX5_OPCODE_UMR | (opmod << 24));
    cseg->qpn_ds = htobe32((m_mlx5_qp.qpn << MLX5_WQE_CTRL_QPN_SHIFT) | STATIC_PARAMS_DS_CNT);
    cseg->fm_ce_se = fence ? MLX5_FENCE_MODE_INITIATOR_SMALL : 0;
    cseg->tis_tir_num = htobe32(tis_tir_number << 8);

    ucseg->flags = MLX5_UMR_INLINE;
    ucseg->bsf_octowords = htobe16(DEVX_ST_SZ_BYTES(tls_static_params) / 16);

    int sq_wqebbs_room_left =
        (static_cast<int>(m_sq_wqes_end - reinterpret_cast<uint8_t *>(cseg)) / MLX5_SEND_WQE_BB);

    /* Case 1:
     * In this case we don't need to change
     * the pointers of the different segments, because there is enough room in the SQ.
     * Thus, no need to do special handling.
     */

    if (unlikely(sq_wqebbs_room_left == 2)) { // Case 2: Change tspseg pointer:
        tspseg = reinterpret_cast<struct mlx5_wqe_tls_static_params_seg *>(m_sq_wqes);
    } else if (unlikely(sq_wqebbs_room_left == 1)) { // Case 3: Change mkcseg and tspseg pointers:
        mkcseg = reinterpret_cast<struct mlx5_mkey_seg *>(m_sq_wqes);
        tspseg = reinterpret_cast<struct mlx5_wqe_tls_static_params_seg *>(
            reinterpret_cast<uint8_t *>(m_sq_wqes) + sizeof(*mkcseg));
    }

    memset(mkcseg, 0, sizeof(*mkcseg));
    memset(tspseg, 0, sizeof(*tspseg));

    tls_fill_static_params_wqe(tspseg, info, key_id, resync_tcp_sn);
    store_current_wqe_prop(nullptr, SQ_CREDITS_UMR, ti);

    ring_doorbell(TLS_SET_STATIC_PARAMS_WQEBBS, true);
    dbg_dump_wqe((uint32_t *)m_sq_wqe_hot, sizeof(mlx5_set_tls_static_params_wqe));

    update_next_wqe_hot();
}

inline void hw_queue_tx::tls_fill_progress_params_wqe(
    struct mlx5_wqe_tls_progress_params_seg *params, uint32_t tis_tir_number,
    uint32_t next_record_tcp_sn)
{
    uint8_t *ctx = params->ctx;

    params->tis_tir_num = htobe32(tis_tir_number);

    DEVX_SET(tls_progress_params, ctx, next_record_tcp_sn, next_record_tcp_sn);
    DEVX_SET(tls_progress_params, ctx, record_tracker_state,
             MLX5E_TLS_PROGRESS_PARAMS_RECORD_TRACKER_STATE_START);
    DEVX_SET(tls_progress_params, ctx, auth_state, MLX5E_TLS_PROGRESS_PARAMS_AUTH_STATE_NO_OFFLOAD);
}

inline void hw_queue_tx::tls_post_progress_params_wqe(xlio_ti *ti, uint32_t tis_tir_number,
                                                      uint32_t next_record_tcp_sn, bool fence,
                                                      bool is_tx)
{
    struct mlx5_set_tls_progress_params_wqe *wqe =
        reinterpret_cast<struct mlx5_set_tls_progress_params_wqe *>(m_sq_wqe_hot);
    struct xlio_mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl.ctrl;
    uint8_t opmod =
        is_tx ? MLX5_OPC_MOD_TLS_TIS_PROGRESS_PARAMS : MLX5_OPC_MOD_TLS_TIR_PROGRESS_PARAMS;

    memset(wqe, 0, std::max<size_t>(WQEBB, sizeof(mlx5_set_tls_progress_params_wqe)));

#define PROGRESS_PARAMS_DS_CNT DIV_ROUND_UP(sizeof(*wqe), MLX5_SEND_WQE_DS)

    cseg->opmod_idx_opcode =
        htobe32(((m_sq_wqe_counter & 0xffff) << 8) | XLIO_MLX5_OPCODE_SET_PSV | (opmod << 24));
    cseg->qpn_ds = htobe32((m_mlx5_qp.qpn << MLX5_WQE_CTRL_QPN_SHIFT) | PROGRESS_PARAMS_DS_CNT);
    /* Request completion for TLS RX offload to create TLS rule ASAP. */
    cseg->fm_ce_se =
        (fence ? MLX5_FENCE_MODE_INITIATOR_SMALL : 0) | (is_tx ? 0 : MLX5_WQE_CTRL_CQ_UPDATE);

    tls_fill_progress_params_wqe(&wqe->params, tis_tir_number, next_record_tcp_sn);
    store_current_wqe_prop(nullptr, SQ_CREDITS_SET_PSV, ti);

    ring_doorbell(TLS_SET_PROGRESS_PARAMS_WQEBBS);
    dbg_dump_wqe((uint32_t *)m_sq_wqe_hot, sizeof(mlx5_set_tls_progress_params_wqe));

    update_next_wqe_hot();
}

inline void hw_queue_tx::tls_get_progress_params_wqe(xlio_ti *ti, uint32_t tirn, void *buf,
                                                     uint32_t lkey)
{
    struct mlx5_get_tls_progress_params_wqe *wqe =
        reinterpret_cast<struct mlx5_get_tls_progress_params_wqe *>(m_sq_wqe_hot);
    struct xlio_mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl.ctrl;
    struct xlio_mlx5_seg_get_psv *psv = &wqe->psv;
    uint8_t opmod = MLX5_OPC_MOD_TLS_TIR_PROGRESS_PARAMS;

    memset(wqe, 0, std::max<size_t>(WQEBB, sizeof(mlx5_get_tls_progress_params_wqe)));
Code Clarity

The removal of inline comments explaining the purpose of memset calls (e.g., lines 505-514) reduces code clarity. Consider re-adding comments to maintain readability and understanding of the code.

            m_dm_mgr.allocate_resources(m_p_ib_ctx_handler, m_p_ring->m_p_ring_stat.get());
    }
}

void hw_queue_tx::update_next_wqe_hot()
{
    // Preparing pointer to the next WQE after a doorbell
    m_sq_wqe_hot = &(*m_sq_wqes)[m_sq_wqe_counter & (m_tx_num_wr - 1)];
    m_sq_wqe_hot_index = m_sq_wqe_counter & (m_tx_num_wr - 1);
}

@galnoam galnoam merged commit bb6e200 into Mellanox:vNext May 11, 2025
1 check passed