[codex] Harden packed message handling #2
Draft
brothercorvo wants to merge 22 commits into torlando-tech:feat/t-deck from
3s is way too short for slow LoRa links, bumped it to 15s.
Static pool arrays (~20 pools) were in the BSS segment, consuming ~15-25KB of internal RAM. Convert them to pointers allocated via heap_caps_aligned_alloc in PSRAM at startup, following the same pattern as Identity::init_known_destinations_pool().
Results: boot heap increased from ~116KB to ~161KB, steady-state max_block improved from 7.6KB to 65KB, skipped announces eliminated.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Subtract ATT_OVERHEAD (3 bytes) from MTU values so fragment sizes match what the BLE stack can actually transmit
- Add LONE (0x00) fragment type for single-packet messages, matching Columba's BLE protocol implementation
- Increase handshake timeout from 10s to 30s to match Columba
- Track consecutive keepalive failures and disconnect after 3
- Add zombie detection for connected peers idle >45s
- Add advertising refresh interval constant (60s)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
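The MTU adjustment and the two liveness rules in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: `att_payload_size`, `PeerHealth`, and its members are hypothetical names, and only the constants (3 bytes, 3 failures, 45s) come from the commit message.

```cpp
#include <cstdint>

constexpr uint16_t ATT_OVERHEAD = 3;  // bytes, per the commit message

// Usable payload per BLE write once the ATT header is subtracted from the MTU.
inline uint16_t att_payload_size(uint16_t mtu) {
    return mtu > ATT_OVERHEAD ? static_cast<uint16_t>(mtu - ATT_OVERHEAD) : 0;
}

// Hypothetical per-peer health record (names are assumptions).
struct PeerHealth {
    static constexpr int kMaxKeepaliveFailures = 3;   // disconnect after 3 in a row
    static constexpr uint32_t kZombieIdleMs = 45000;  // idle for more than 45s

    int consecutive_failures = 0;
    uint32_t last_activity_ms = 0;

    // A successful keepalive resets the streak; a failure extends it.
    void on_keepalive_result(bool ok) {
        consecutive_failures = ok ? 0 : consecutive_failures + 1;
    }

    void on_activity(uint32_t now_ms) { last_activity_ms = now_ms; }

    // Unsigned subtraction keeps the idle check correct across tick wraparound.
    bool should_disconnect(uint32_t now_ms) const {
        const bool too_many_failures = consecutive_failures >= kMaxKeepaliveFailures;
        const bool zombie = (now_ms - last_activity_ms) > kZombieIdleMs;
        return too_many_failures || zombie;
    }
};
```

With the default BLE MTU of 23, `att_payload_size(23)` leaves 20 bytes for fragment data.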
…moryMonitor
- Replace Bytes _app_data in IdentityEntry with a fixed uint8_t[128] buffer to prevent known destinations from consuming BytesPool tiny slots (was exhausting the pool after 3hrs on busy networks)
- Move BytesPool storage from BSS to PSRAM, reduce TINY_SLOTS to 1024 now that destinations don't consume pool slots
- Defer MemoryMonitor logging from timer callback to main loop poll() to avoid FreeRTOS timer task stack overflow (3120 byte limit)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
save_known_destinations() was called on every announce, writing all 670+ entries to SPIFFS and blocking the main loop for 20+ seconds. Add a dirty flag so saves only happen during the periodic persist_data() timer (~60s).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
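The dirty-flag pattern described above can be sketched like this. `DestinationStore` and its members are hypothetical stand-ins; the real save path writes to SPIFFS, which is replaced here by a counter.

```cpp
// Hypothetical store: announces only mark state dirty, the timer does the I/O.
struct DestinationStore {
    bool dirty = false;
    int flash_writes = 0;  // stands in for the expensive SPIFFS write

    // Called on every announce: cheap, no I/O.
    void on_announce() { dirty = true; }

    // Called from the periodic persist timer: writes at most once per interval.
    void persist_if_dirty() {
        if (!dirty) return;
        ++flash_writes;  // the real code would serialize all entries here
        dirty = false;
    }
};
```

Any number of announces between two timer ticks collapses into a single write, and a tick with no changes costs nothing.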
Link packets were flooding all interfaces (TCP, LoRa, WiFi) because the interface routing check was commented out. This caused massive latency on audio streams: each packet hit LoRa SPI (55ms) even when the link was established over WiFi/AutoInterface.
Fix: Use packet.destination_link().attached_interface() instead of the non-existent packet.destination().attached_interface(). Also add the missing Link::attached_interface() const getter implementation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Validate conn_handle against MAX_CONN_HANDLES before array access in setPeerHandle and promoteToIdentityKeyed to prevent out-of-bounds writes. Add writeCharacteristic() virtual method to IBLEPlatform for targeted GATT characteristic writes (needed for identity handshake).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
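A bounds check of this kind might look like the sketch below. `PeerTable`, `set_peer_handle`, and `peer_at` are hypothetical, and the value 8 for `MAX_CONN_HANDLES` is an assumption; only the idea of validating the handle before indexing comes from the commit message.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t MAX_CONN_HANDLES = 8;  // assumed value for illustration

class PeerTable {
public:
    // Rejects handles outside the table instead of writing past the array.
    bool set_peer_handle(uint16_t conn_handle, uint32_t peer_id) {
        if (conn_handle >= MAX_CONN_HANDLES) return false;
        slots_[conn_handle] = peer_id;
        return true;
    }

    // Returns 0 for out-of-range handles (0 is treated as "no peer" here).
    uint32_t peer_at(uint16_t conn_handle) const {
        return conn_handle < MAX_CONN_HANDLES ? slots_[conn_handle] : 0;
    }

private:
    std::array<uint32_t, MAX_CONN_HANDLES> slots_{};
};
```

Returning `false` lets the caller log or drop the connection event rather than silently corrupting adjacent memory.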
- Atomic save: write to /known_dst.tmp then rename to /known_dst.bin to prevent corruption if a crash occurs mid-write
- Fast persist: save within 5s of dirty flag (don't wait for 60s interval) via new should_persist_data() called from main loop
- Delete corrupt files: if magic bytes are invalid on load, remove the file so a fresh one can be written
- Recover from temp file: if .bin is missing but .tmp exists, rename it (crash happened between write and rename)
- Promote load/save logs to INFO for visibility
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
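The write-then-rename and temp-file recovery steps can be sketched with the C standard library. This is a host-side illustration under assumptions: `atomic_save`, `recover_from_temp`, and `file_exists` are hypothetical helpers, and it relies on `rename` replacing an existing target as POSIX does; the commit message does not say how SPIFFS behaves in that case.

```cpp
#include <cstdio>
#include <string>

inline bool file_exists(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f) return false;
    std::fclose(f);
    return true;
}

// Write everything to the temp file first, then rename it into place.
// A crash mid-write leaves the previous final file untouched.
inline bool atomic_save(const std::string& final_path,
                        const std::string& tmp_path,
                        const std::string& data) {
    std::FILE* f = std::fopen(tmp_path.c_str(), "wb");
    if (!f) return false;
    bool ok = std::fwrite(data.data(), 1, data.size(), f) == data.size();
    ok = (std::fclose(f) == 0) && ok;
    if (!ok) {
        std::remove(tmp_path.c_str());  // do not leave a partial temp file
        return false;
    }
    return std::rename(tmp_path.c_str(), final_path.c_str()) == 0;
}

// On load: if the final file is missing but a temp file exists, the crash
// happened between write and rename, so finish the rename.
inline bool recover_from_temp(const std::string& final_path,
                              const std::string& tmp_path) {
    if (file_exists(final_path) || !file_exists(tmp_path)) return false;
    return std::rename(tmp_path.c_str(), final_path.c_str()) == 0;
}
```

A loader would call `recover_from_temp` before opening the final file, then validate the magic bytes as the commit describes.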
- Add _persist_yield_callback function pointer, called every 5 entries during save_known_destinations() to feed platform watchdog during slow SPIFFS flash I/O (71+ entries can take 30-50s)
- Increase should_persist_data() dirty threshold from 5s to 60s to reduce SPIFFS fragmentation from frequent writes. Exit handler and crash recovery paths still force immediate persist.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
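A yield hook of this shape could be sketched as below. `persist_yield_callback` and `save_entries` are hypothetical names modeled on the commit text; the per-entry write is elided, and only the "every 5 entries" cadence is taken from the source.

```cpp
#include <cstddef>

using YieldCallback = void (*)();

// Hypothetical global, mirroring the _persist_yield_callback pointer above.
static YieldCallback persist_yield_callback = nullptr;

// Saves `count` entries, yielding to the platform after every 5th entry so a
// watchdog can be fed during slow flash I/O. Returns the number written.
inline std::size_t save_entries(std::size_t count) {
    std::size_t written = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // ... the real code would serialize and write entry i here ...
        ++written;
        if (persist_yield_callback != nullptr && written % 5 == 0) {
            persist_yield_callback();
        }
    }
    return written;
}
```

Keeping the hook as a plain function pointer means the library code needs no platform headers; the firmware installs whatever watchdog-feeding routine it uses.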
Adds a `persist` flag to KnownDestinationSlot so only destinations with actual message exchange (contacts) are written to SPIFFS. Network announces stay in the PSRAM pool for routing but don't survive reboots. This reduces persistence time from 40-50s (150+ entries) to <1s (handful of contacts), eliminating the main-loop blocking that caused device unresponsiveness.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…fication
Python generates LXMF test vectors (basic, empty, fields, large, unicode, stamp) that C++ unpacks with signature validation. C++ generates vectors that Python unpacks and verifies (signatures, hashes, fields, content). Full pipeline orchestrated by run_interop.sh.
Also fixes native17 build: remove #ifdef ARDUINO guard from Bytes std::vector constructor (needed for MsgPack bin_t interop), add src_filter to exclude UI/BLE sources, and init Transport/Identity pools.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nsfer limit
Three bugs prevented LXMF propagation node message sync from working:
1. Link::request() BIN-wrapped pre-serialized msgpack data via Bytes::to_msgpack(), causing Python peers to see raw bytes instead of nested structures. Fixed by manually building the packed [timestamp, path_hash, data] array with raw embedded msgpack.
2. Resource responses from Link::request() were not routed to the request callback. The RESOURCE_ADV handler used a generic concluded callback that never called response_resource_concluded(). Fixed handle_resource_concluded() to detect response Resources by extracting request_id from packed data and matching against pending requests.
3. per_transfer_limit=0 sent as uint8 caused Python server to reject all messages (0 KB limit). Fixed to send msgpack nil for "no limit".
Also adds parse_response_array() for type-agnostic response parsing, refactors sync into process_sync() state machine, adds NVS persistence for propagation node selection and stamp costs, and cleans up Resource.cpp debug artifacts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
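The third fix depends on the wire-level difference between an integer zero and msgpack nil. The sketch below is a hand-rolled encoder for illustration only: `pack_transfer_limit` is a hypothetical function, not the project's MsgPack API, and a null pointer is used here to mean "no limit". The byte values (0xc0 for nil, 0xcc/0xcd/0xce for uint8/16/32) are from the MessagePack format.

```cpp
#include <cstdint>
#include <vector>

// Appends the limit to `out`: nil when there is no limit, otherwise the
// smallest unsigned msgpack encoding of the value.
inline void pack_transfer_limit(std::vector<uint8_t>& out, const uint32_t* limit_kb) {
    if (limit_kb == nullptr) {
        out.push_back(0xc0);  // nil: the peer reads this as "no limit"
        return;
    }
    const uint32_t v = *limit_kb;
    if (v <= 0x7f) {
        out.push_back(static_cast<uint8_t>(v));  // positive fixint
    } else if (v <= 0xff) {
        out.push_back(0xcc);                     // uint 8
        out.push_back(static_cast<uint8_t>(v));
    } else if (v <= 0xffff) {
        out.push_back(0xcd);                     // uint 16, big-endian
        out.push_back(static_cast<uint8_t>(v >> 8));
        out.push_back(static_cast<uint8_t>(v & 0xff));
    } else {
        out.push_back(0xce);                     // uint 32, big-endian
        for (int shift = 24; shift >= 0; shift -= 8) {
            out.push_back(static_cast<uint8_t>((v >> shift) & 0xff));
        }
    }
}
```

An explicit zero still encodes as an integer, which a peer will read as a 0 KB limit; only the nil byte conveys the absence of a limit.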
- Pack field keys as integers (not BIN) to match Python LXMF wire format
- Use general ENCRYPTED_PACKET_MDU (391 bytes) for explicitly OPPORTUNISTIC messages instead of LoRa-specific threshold (159 bytes)
- Fix unpack_from_bytes to deserialize field keys as integers
- Fix validate_signature to repack field keys as integers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
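To see why the key type matters on the wire, compare the two msgpack encodings of the same one-byte key. Both helpers below are hypothetical and hand-rolled for illustration; the format bytes (positive fixint, 0xcc for uint 8, 0xc4 for bin 8) are from the MessagePack format.

```cpp
#include <cstdint>
#include <vector>

// Integer key: a single byte for 0..127, otherwise uint 8.
inline std::vector<uint8_t> pack_key_as_int(uint8_t key) {
    if (key <= 0x7f) return {key};
    return {0xcc, key};
}

// BIN key: bin 8 marker, length 1, then the byte itself.
inline std::vector<uint8_t> pack_key_as_bin(uint8_t key) {
    return {0xc4, 0x01, key};
}
```

A map keyed by the integer 1 and a map keyed by the one-byte binary string `\x01` are different maps to a msgpack decoder, so a lookup by integer key on the receiving side would miss the BIN-keyed field, and a signature computed over one packing would not validate against the other.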
Bytes({key_int}) matched Bytes(size_t capacity) instead of creating a
1-byte buffer, storing empty keys that fields_get() could never match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
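The overload pitfall in this commit can be reproduced with a small stand-in class. `Buf` below is hypothetical and only models the two relevant constructors; it is not the project's `Bytes` class, and `key_wrong`/`key_right` are illustrative helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in with a capacity constructor and a pointer+length constructor.
class Buf {
public:
    Buf(std::size_t capacity) { data_.reserve(capacity); }  // reserves, stores nothing
    Buf(const uint8_t* p, std::size_t n) : data_(p, p + n) {}
    std::size_t size() const { return data_.size(); }

private:
    std::vector<uint8_t> data_;
};

// With no initializer_list constructor, {key} converts to size_t and
// selects Buf(size_t): the result is an empty buffer with `key` capacity.
inline Buf key_wrong(uint8_t key) {
    Buf b({key});
    return b;
}

// Passing pointer and length explicitly yields the intended 1-byte buffer.
inline Buf key_right(uint8_t key) {
    Buf b(&key, 1);
    return b;
}
```

Here `key_wrong(7)` compiles without complaint but holds zero bytes, which matches the symptom described above: stored keys were empty, so no lookup could match them.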
What changed
Why
Packed message handling still had edge cases that could break message loading and follow-on processing when expected packed payload state was missing or incomplete.
Impact
This makes the embedded messaging stack more defensive when handling packed LXMF payloads and reduces the chance of state corruption or crashes in those paths.
Validation
`pio` is not installed.