Commit 7cee7e9

issue: 4043157 Set initial RTO to 1 second per RFC 6298
Change initial TCP retransmission timeout from 3 seconds to 1 second as
recommended by RFC 6298 Section 2, with additional safeguards for timer
granularity and RFC 1122 compliance.

RFC 6298 states:
"Until a round-trip time (RTT) measurement has been made... the sender SHOULD set RTO <- 1 second"

This applies to all new TCP connections until an RTT measurement is made.

Changes:

1. TCP RTO Calculation (src/core/lwip/tcp.c):
   - Added get_initial_rto() helper function
   - Uses round-up division
   - Prevents division truncation that could cause premature timeouts
   - Modified tcp_pcb_init() and tcp_pcb_recycle() to use helper

2. Timer Resolution Limits (src/core/util/sys_vars.cpp):
   - Added RFC 1122 validation for tcp_timer_resolution_msec
   - Enforces maximum of 500ms per RFC 1122 Section 4.2.3.2
   - Logs warning and clamps value if exceeded
   - Applied to both environment variable and config registry paths

3. Configuration Schema (xlio_config_schema.json):
   - Added "maximum": 500 constraint to timer_msec
   - Updated description to reference RFC 1122 requirement
   - Prevents invalid configurations at schema validation level

4. Documentation (README):
   - Added RFC 1122 reference to timer_msec documentation

Rationale:

Simple truncating division (1000 / slow_tmr_interval), the same form as the
previous 3000 / slow_tmr_interval, could result in:
- 0 ticks when interval > 1000ms (immediate timeout)
- 1 tick when interval = 1000ms (fires on next tick)
- Insufficient granularity with large timer intervals

The new implementation provides defense-in-depth:
- Schema validation prevents misconfiguration
- Runtime validation enforces RFC 1122 limits (delayed ACK ≤ 500ms)
- Round-up division ensures minimum 1 tick without truncation

This fixes incorrect SYN retransmission timing where the first retry could
occur prematurely or immediately, which caused TCP_USER_TIMEOUT test failures
and connection establishment issues.

Benefits:
- TCP/IP standards compliance (RFC 6298 and RFC 1122)
- Robust handling of timer granularity edge cases
- Faster connection establishment failure detection
- Prevents problematic timer configurations
- Matches Linux kernel TCP_TIMEOUT_INIT behavior

Signed-off-by: Tomer Cabouly <[email protected]>
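
The difference between truncating and round-up division described in the rationale can be sketched in isolation. The helper names `ms_to_ticks_trunc` and `ms_to_ticks_ceil` are hypothetical and are not part of this commit; only the `(1000 + interval - 1) / interval` form comes from the change itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncating conversion from milliseconds to ticks.
 * Any remainder is dropped, so the timeout can end up shorter than asked. */
static inline uint32_t ms_to_ticks_trunc(uint32_t timeout_ms, uint32_t tick_ms)
{
    assert(tick_ms > 0); /* a zero tick length would divide by zero */
    return timeout_ms / tick_ms;
}

/* Hypothetical helper: round-up conversion, the same arithmetic form that
 * get_initial_rto() uses. For timeout_ms > 0 the result is at least 1 tick. */
static inline uint32_t ms_to_ticks_ceil(uint32_t timeout_ms, uint32_t tick_ms)
{
    assert(tick_ms > 0);
    return (timeout_ms + tick_ms - 1) / tick_ms;
}
```

With a 300 ms tick, truncation gives 3 ticks (900 ms, short of one second) while rounding up gives 4 ticks (1200 ms). With a 1500 ms tick, truncation gives 0 ticks while rounding up gives 1.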
1 parent 6689188 commit 7cee7e9

4 files changed: +34 -5 lines changed

README

Lines changed: 1 addition & 0 deletions
@@ -693,6 +693,7 @@ Maps to **XLIO_TCP_TIMER_RESOLUTION_MSEC** environment variable.
 Control internal TCP timer resolution (fast timer) in milliseconds.
 Minimum value is the thread wakeup timer resolution configured in
 performance.threading.internal_handler.timer_msec.
+Maximum is 500ms per RFC 1122 Section 4.2.3.2 (delayed ACK timer must not exceed 500ms).
 Default value is 100
 
 network.protocols.tcp.timestamps

src/core/config/descriptor_providers/xlio_config_schema.json

Lines changed: 2 additions & 1 deletion
@@ -418,9 +418,10 @@
 "timer_msec": {
     "type": "integer",
     "minimum": 0,
+    "maximum": 500,
     "default": 100,
     "title": "TCP timer interval (msec)",
-    "description": "Maps to XLIO_TCP_TIMER_RESOLUTION_MSEC environment variable.\nControl internal TCP timer resolution (fast timer) in milliseconds.\nMinimum value is the thread wakeup timer resolution configured in\nperformance.threading.internal_handler.timer_msec."
+    "description": "Maps to XLIO_TCP_TIMER_RESOLUTION_MSEC environment variable.\nControl internal TCP timer resolution (fast timer) in milliseconds.\nMinimum value is the thread wakeup timer resolution configured in\nperformance.threading.internal_handler.timer_msec.\nMaximum is 500ms per RFC 1122 Section 4.2.3.2 (delayed ACK timer must not exceed 500ms)."
 },
 "mss": {
     "type": "integer",

src/core/lwip/tcp.c

Lines changed: 11 additions & 4 deletions
@@ -905,6 +905,11 @@ err_t tcp_recv_null(void *arg, struct tcp_pcb *pcb, struct pbuf *p, err_t err)
     return ERR_OK;
 }
 
+static inline u32_t get_initial_rto(void)
+{
+    return (1000 + slow_tmr_interval - 1) / slow_tmr_interval;
+}
+
 void tcp_pcb_init(struct tcp_pcb *pcb, u8_t prio, void *container)
 {
     u32_t iss;
@@ -927,9 +932,10 @@ void tcp_pcb_init(struct tcp_pcb *pcb, u8_t prio, void *container)
     pcb->mss = pcb->advtsd_mss;
     pcb->user_timeout_ms = 0;
     pcb->ticks_since_data_sent = -1;
-    pcb->rto = 3000 / slow_tmr_interval;
+    // Set initial RTO to 1 second as per RFC 6298
+    pcb->rto = get_initial_rto();
     pcb->sa = 0;
-    pcb->sv = 3000 / slow_tmr_interval;
+    pcb->sv = get_initial_rto();
     pcb->rtime = -1;
 #if TCP_CC_ALGO_MOD
     switch (lwip_cc_algo_module) {
@@ -985,9 +991,10 @@ void tcp_pcb_recycle(struct tcp_pcb *pcb)
     pcb->flags = 0;
     pcb->user_timeout_ms = 0;
     pcb->ticks_since_data_sent = -1;
-    pcb->rto = 3000 / slow_tmr_interval;
+    // Set initial RTO to 1 second as per RFC 6298
+    pcb->rto = get_initial_rto();
     pcb->sa = 0;
-    pcb->sv = 3000 / slow_tmr_interval;
+    pcb->sv = get_initial_rto();
     pcb->nrtx = 0;
     pcb->dupacks = 0;
     pcb->rtime = -1;
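
For context on what replaces the initial value once an RTT measurement arrives, the RFC 6298 Section 2 rules can be sketched as follows. This is a minimal illustration in plain milliseconds under stated assumptions, not the lwIP code touched by this commit (which keeps its state in `pcb->sa` / `pcb->sv` in tick units). The struct `rto_state` and both function names are hypothetical, and integer division makes the smoothing approximate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state holder; all values in milliseconds. */
struct rto_state {
    int has_sample;      /* 0 until the first RTT measurement */
    uint32_t srtt_ms;    /* smoothed round-trip time */
    uint32_t rttvar_ms;  /* round-trip time variation */
    uint32_t rto_ms;     /* current retransmission timeout */
};

/* RFC 6298 (2.1): before any RTT measurement, RTO is 1 second. */
static inline void rto_init(struct rto_state *s)
{
    assert(s != 0);
    s->has_sample = 0;
    s->srtt_ms = 0;
    s->rttvar_ms = 0;
    s->rto_ms = 1000;
}

/* Feed one RTT measurement r_ms; g_ms is the clock granularity. */
static inline void rto_on_rtt_sample(struct rto_state *s, uint32_t r_ms, uint32_t g_ms)
{
    assert(s != 0);
    if (!s->has_sample) {
        /* (2.2) first measurement: SRTT <- R, RTTVAR <- R/2 */
        s->srtt_ms = r_ms;
        s->rttvar_ms = r_ms / 2;
        s->has_sample = 1;
    } else {
        /* (2.3) RTTVAR <- 3/4 RTTVAR + 1/4 |SRTT - R|, then
         *       SRTT   <- 7/8 SRTT + 1/8 R (RTTVAR uses the old SRTT) */
        uint32_t diff = s->srtt_ms > r_ms ? s->srtt_ms - r_ms : r_ms - s->srtt_ms;
        s->rttvar_ms = (3 * s->rttvar_ms + diff) / 4;
        s->srtt_ms = (7 * s->srtt_ms + r_ms) / 8;
    }
    /* RTO <- SRTT + max(G, 4 * RTTVAR) */
    uint32_t k_var = 4 * s->rttvar_ms;
    s->rto_ms = s->srtt_ms + (k_var > g_ms ? k_var : g_ms);
    /* (2.4) an RTO below 1 second SHOULD be rounded up to 1 second */
    if (s->rto_ms < 1000) {
        s->rto_ms = 1000;
    }
}
```

A first sample of 400 ms with 100 ms granularity gives SRTT = 400, RTTVAR = 200 and RTO = 1200 ms. A first sample of 100 ms would compute 300 ms and is rounded up to 1000 ms by rule (2.4).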

src/core/util/sys_vars.cpp

Lines changed: 20 additions & 0 deletions
@@ -1621,6 +1621,16 @@ void mce_sys_var::get_env_params()
         tcp_timer_resolution_msec = timer_resolution_msec;
     }
 
+    // RFC 1122 Section 4.2.3.2: Delayed ACK timer must not exceed 500ms
+    // This limits TCP timer resolution to ensure protocol compliance and proper RTO calculations
+    if (tcp_timer_resolution_msec > 500) {
+        vlog_printf(VLOG_WARNING,
+                    "TCP timer resolution [%s=%d] exceeds RFC 1122 maximum of 500ms. "
+                    "Clamping to 500ms to ensure protocol compliance.\n",
+                    SYS_VAR_TCP_TIMER_RESOLUTION_MSEC, tcp_timer_resolution_msec);
+        tcp_timer_resolution_msec = 500;
+    }
+
     if ((env_ptr = getenv(SYS_VAR_INTERNAL_THREAD_CPUSET))) {
         snprintf(internal_thread_cpuset, FILENAME_MAX, "%s", env_ptr);
     }
@@ -2744,6 +2754,16 @@ void mce_sys_var::configure_memory_limits(const config_registry &registry)
         tcp_timer_resolution_msec = timer_resolution_msec;
     }
 
+    // RFC 1122 Section 4.2.3.2: Delayed ACK timer must not exceed 500ms
+    // This limits TCP timer resolution to ensure protocol compliance and proper RTO calculations
+    if (tcp_timer_resolution_msec > 500) {
+        vlog_printf(VLOG_WARNING,
+                    "TCP timer resolution [%s=%d] exceeds RFC 1122 maximum of 500ms. "
+                    "Clamping to 500ms to ensure protocol compliance.\n",
+                    SYS_VAR_TCP_TIMER_RESOLUTION_MSEC, tcp_timer_resolution_msec);
+        tcp_timer_resolution_msec = 500;
+    }
+
     if (registry.value_exists("performance.threading.cpuset")) {
         snprintf(internal_thread_cpuset, FILENAME_MAX, "%s",
                  registry.get_value<std::string>("performance.threading.cpuset").c_str());
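
The same clamp-and-warn block appears in both configuration paths. One possible way to share it is sketched below in C. This is an illustration only: `clamp_tcp_timer_resolution` and `TCP_TIMER_RESOLUTION_MAX_MSEC` are hypothetical names that do not exist in this commit, and `fprintf(stderr, ...)` stands in for the project's `vlog_printf`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical constant: upper bound taken from RFC 1122 Section 4.2.3.2. */
#define TCP_TIMER_RESOLUTION_MAX_MSEC 500

/* Hypothetical helper: return requested_msec limited to the maximum,
 * printing a warning when the requested value had to be reduced.
 * var_name is the setting name shown in the warning. */
static inline int clamp_tcp_timer_resolution(int requested_msec, const char *var_name)
{
    assert(var_name != NULL);
    if (requested_msec > TCP_TIMER_RESOLUTION_MAX_MSEC) {
        fprintf(stderr,
                "TCP timer resolution [%s=%d] exceeds maximum of %dms. "
                "Clamping to %dms.\n",
                var_name, requested_msec,
                TCP_TIMER_RESOLUTION_MAX_MSEC, TCP_TIMER_RESOLUTION_MAX_MSEC);
        return TCP_TIMER_RESOLUTION_MAX_MSEC;
    }
    return requested_msec;
}
```

Both call sites could then reduce to a single assignment such as `tcp_timer_resolution_msec = clamp_tcp_timer_resolution(tcp_timer_resolution_msec, SYS_VAR_TCP_TIMER_RESOLUTION_MSEC);`, assuming the types line up with the actual member.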
