Linux LE L2CAP transport for V5X printers (fixes #23) #24

Closed
d-roman-halliday wants to merge 5 commits into Dejniel:master from d-roman-halliday:feature/linux-l2cap-le-transport

@d-roman-halliday
Contributor

Fixes #23.

On Linux + BlueZ, BleakBluetoothConnector.connect() (and anything else that goes through org.bluez.Device1.Connect()) hangs and then times out for V5X-family printers whose advertising packet sets the BR/EDR flag in addition to LE. BlueZ believes the dual-mode flag and pages on Classic first; the printer doesn't actually speak Classic, so we get Page Timeout. BlueZ never retries on LE, and the DBus call hangs until the client-side timeout. Reproducible with plain bluetoothctl connect — the cause is entirely in bluetoothd's transport selection, not in TiMini-Print, Bleak, or Python.
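To make the "dual-mode flag" concrete, here is a minimal sketch of decoding the Flags AD field of an advertisement. The bit positions follow the Bluetooth Core Specification Supplement; the helper names and the sample value 0x1A are illustrative assumptions and are not taken from the btmon capture in #23.

```python
# Bit positions of the Flags AD type (0x01) in an advertising packet.
AD_FLAG_BITS = {
    0: "LE Limited Discoverable Mode",
    1: "LE General Discoverable Mode",
    2: "BR/EDR Not Supported",
    3: "Simultaneous LE and BR/EDR (Controller)",
    4: "Simultaneous LE and BR/EDR (Host)",
}


def decode_ad_flags(flags: int) -> list[str]:
    """Return the names of all flag bits set in the Flags byte."""
    return [name for bit, name in AD_FLAG_BITS.items() if flags & (1 << bit)]


def claims_dual_mode(flags: int) -> bool:
    """True when the advertisement sets a simultaneous LE + BR/EDR bit
    and does not set 'BR/EDR Not Supported'."""
    br_edr_not_supported = bool(flags & 0x04)
    simultaneous = bool(flags & 0x18)
    return simultaneous and not br_edr_not_supported


if __name__ == "__main__":
    # 0x1A is a hypothetical value for an affected printer.
    print(decode_ad_flags(0x1A), claims_dual_mode(0x1A))
    # 0x06 is the usual value for an LE-only peripheral.
    print(decode_ad_flags(0x06), claims_dual_mode(0x06))
```

A peripheral that advertises 0x06 rules out BR/EDR explicitly, which is the case where a host has no reason to attempt a Classic page first.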

This PR adds a direct LE L2CAP transport that bypasses Device1.Connect() the way gatttool does. The high-level transport plumbing (split bulk writes, runtime controllers, endpoint resolution, chunked GATT writes, notification dispatch) is reused — only the lowest layer changes.

What's in the PR

  • linux_l2cap_client.py: LinuxLeL2capClient mirrors the subset of BleakClient that _BleakTransportSession consumes: services, mtu_size, write_gatt_char, start_notify, stop_notify, disconnect. Backed by socket.AF_BLUETOOTH + BTPROTO_L2CAP on CID 4 (ATT), with sockaddr_l2 built via ctypes because Python's stdlib socket module doesn't expose the LE-public tuple form on all builds. Implements ATT MTU exchange, primary-service / characteristic / descriptor discovery, Write Cmd / Write Req, CCCD subscribe, and a background reader thread that hops notification callbacks onto the asyncio loop via call_soon_threadsafe so existing asyncio.Event-based runtime state stays consistent.
  • linux_l2cap_adapter.py: _LinuxL2capLeSocket subclasses _BleakSocket and overrides only _connect_async to construct a LinuxLeL2capClient instead of a BleakClient. Everything else in the existing transport pipeline works unchanged.
  • adapters/__init__.py — backend selection respects TIMINI_BLE_BACKEND={auto,bleak,l2cap}. On Linux, auto (the default) prefers L2CAP when socket.AF_BLUETOOTH is available; falls back to Bleak everywhere else.
  • V5X runtime tolerance for missing 0xAA (printing/runtime/v5x.py) — on some V5X firmwares (observed: MXW01 v1.9.3.1.2) the 0xAA notification is an end-of-print status, not a between-commands start-ready signal. The runtime now logs and continues if 0xAA doesn't arrive in time, rather than aborting the print.
  • Quiescence-based disconnect in LinuxLeL2capClient.disconnect — close once the printer has been silent for _DISCONNECT_IDLE_SECONDS (1.5 s by default, hard-capped at 30 s). Without this, the runtime tears the L2CAP socket down before the printer finishes rendering buffered data and the tail of the print gets cut. Both knobs are env-tunable: TIMINI_BLE_DISCONNECT_IDLE_S, TIMINI_BLE_DISCONNECT_MAX_S.
  • Env override for bulk write rate (TIMINI_BLE_BULK_DELAY_MS, in bleak_adapter_transport.py) — overrides the V5X profile's bulk_write_delay_ms. The current 10 ms default works for clones whose firmware emits the flow-control pause patterns the profile lists; some V5X variants never send them, so the bytes past the on-device buffer get dropped on long jobs. Per-host env override lets each user tune for their printer without changing the upstream profile.
  • 21 new unit tests covering the pure helpers (BD-address packing, sockaddr_l2 layout, UUID conversions, property-bit decoding, characteristic-collection lookups, value-handle resolution, env-var parsing). All 351 tests pass.
  • .gitignore updated for .idea/ and .venv/.
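As a rough illustration of the BD-address packing and sockaddr_l2 layout mentioned in the first bullet, the sketch below mirrors the Linux kernel's struct sockaddr_l2 with ctypes. The class and helper names (SockaddrL2, pack_bdaddr, make_sockaddr_l2) are hypothetical and not the names used in linux_l2cap_client.py; the sketch assumes a little-endian host, since l2_psm and l2_cid are little-endian fields.

```python
import ctypes

AF_BLUETOOTH = 31        # Linux address family number for Bluetooth
ATT_CID = 4              # fixed L2CAP channel carrying ATT on LE links
BDADDR_LE_PUBLIC = 0x01  # LE public address type

BdAddr = ctypes.c_uint8 * 6


class SockaddrL2(ctypes.Structure):
    """Mirror of the kernel's struct sockaddr_l2 (little-endian host assumed)."""
    _fields_ = [
        ("l2_family", ctypes.c_uint16),
        ("l2_psm", ctypes.c_uint16),
        ("l2_bdaddr", BdAddr),
        ("l2_cid", ctypes.c_uint16),
        ("l2_bdaddr_type", ctypes.c_uint8),
    ]


def pack_bdaddr(address: str) -> bytes:
    """Convert 'AA:BB:CC:DD:EE:FF' to the reversed 6-byte form the kernel expects."""
    parts = address.split(":")
    if len(parts) != 6:
        raise ValueError(f"not a BD address: {address!r}")
    return bytes(int(part, 16) for part in parts)[::-1]


def make_sockaddr_l2(address: str, cid: int = ATT_CID,
                     addr_type: int = BDADDR_LE_PUBLIC) -> SockaddrL2:
    """Build the address structure to pass to connect() for an LE ATT channel."""
    addr = SockaddrL2()
    addr.l2_family = AF_BLUETOOTH
    addr.l2_psm = 0  # PSM is unused when connecting to a fixed CID
    addr.l2_bdaddr = BdAddr(*pack_bdaddr(address))
    addr.l2_cid = cid
    addr.l2_bdaddr_type = addr_type
    return addr
```

The structure is 13 bytes of fields plus one byte of tail padding, which is why layout checks are worth keeping as pure unit tests that need no Bluetooth hardware.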

Validation

Built and tested end-to-end against a real MXW01 (Ubuntu 25.10, kernel 6.17, BlueZ 5.83, Python 3.12 rebuilt with libbluetooth-dev).

Through unmodified timiniprint_command_line.py:

| Job | Result |
| --- | --- |
| Short text (--text "Test ...") | ✅ prints |
| Small JPEG (~100 raster rows) | ✅ prints |
| Long text (~308 rows) | ⚠️ prints up to ~60–75% and truncates cleanly mid-row, with no host-visible error. Printer ACKs the finalize packet, so it received the full payload; the truncation is a firmware per-job ceiling on this MXW01 (filed as a separate issue for tracking pagination). |
| Tall image (~436 rows) | ⚠️ same firmware ceiling (separate issue). |

The transport itself is correct; long-content pagination is a host-side rendering concern outside the scope of this PR.

Reproducing the original bug

Minimal pre-fix reproducer, on Linux + BlueZ with a V5X-family printer whose advertising sets the dual-mode flag:

# Bleak 3.x in a venv
from bleak import BleakClient, BleakScanner
import asyncio, time

ADDR = "XX:XX:XX:XX:XX:XX"  # V5X printer

async def main():
    dev = await BleakScanner.find_device_by_address(ADDR, timeout=15)
    if not dev:
        print("not advertising"); return
    t0 = time.monotonic()
    try:
        await BleakClient(dev).connect(timeout=45)
        print(f"connected in {time.monotonic()-t0:.1f}s")
    except Exception as exc:
        print(f"failed after {time.monotonic()-t0:.1f}s: {exc!r}")

asyncio.run(main())

Expected on an affected host: failed after 45.0s: TimeoutError().

With the PR applied and TIMINI_BLE_BACKEND=l2cap (or auto on Linux), the same printer prints normally through the standard CLI:

TIMINI_BLE_BACKEND=l2cap python3 timiniprint_command_line.py --bluetooth MXW01 --text "hello"

Notes

  • Non-Linux platforms are unaffected — _get_ble_adapter() returns the Bleak adapter as before.
  • TIMINI_BLE_BACKEND=bleak forces the legacy path on Linux for users whose printer worked fine before.
  • The new client uses ctypes only for bind() / connect() of sockaddr_l2; everything else uses Python's socket API.
  • Quiescence-based disconnect avoids the need for hard-coded sleeps after writes.
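The backend-selection rules in these notes can be sketched as a small pure function. This is an approximation under stated assumptions: choose_ble_backend is a hypothetical name (the PR's entry point is _get_ble_adapter()), and treating an unknown TIMINI_BLE_BACKEND value as auto, or falling back to Bleak when l2cap is requested but unavailable, are guesses rather than confirmed behaviour.

```python
import os
import socket
import sys


def choose_ble_backend(env=None, platform=None, has_af_bluetooth=None) -> str:
    """Return 'l2cap' or 'bleak' following the TIMINI_BLE_BACKEND rules."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if has_af_bluetooth is None:
        has_af_bluetooth = hasattr(socket, "AF_BLUETOOTH")

    requested = env.get("TIMINI_BLE_BACKEND", "auto").strip().lower()
    if requested not in {"auto", "bleak", "l2cap"}:
        requested = "auto"  # assumption: unknown values behave like auto

    if requested == "bleak":
        return "bleak"  # forces the legacy path, also on Linux

    # auto and l2cap both need Linux plus AF_BLUETOOTH support in Python.
    l2cap_usable = platform.startswith("linux") and has_af_bluetooth
    return "l2cap" if l2cap_usable else "bleak"
```

Passing the environment, platform, and socket capability as parameters keeps the decision testable on any machine, including ones without Bluetooth support.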

🤖 Generated with Claude Code

On Linux + BlueZ, BleakBluetoothConnector → org.bluez.Device1.Connect()
hangs for V5X-family printers whose advertising packet sets the dual-mode
flag (Simultaneous LE and BR/EDR). BlueZ pages on BR/EDR first, gets
Page Timeout, and never retries on LE. The printer and the V5X protocol
are fine — only the transport selection is broken. Reproducible with
plain `bluetoothctl connect`, so it is not a Bleak or Python concern.

This commit adds a direct LE L2CAP transport that bypasses
bluetoothd's Device1.Connect() entirely, the same way gatttool does:

- timiniprint/transport/bluetooth/adapters/linux_l2cap_client.py:
  LinuxLeL2capClient mirrors the subset of BleakClient that
  _BleakTransportSession consumes (services, mtu_size, write_gatt_char,
  start_notify, stop_notify, disconnect). Backed by socket.AF_BLUETOOTH
  + BTPROTO_L2CAP on CID 4 (ATT), with sockaddr_l2 built via ctypes
  because Python's stdlib socket module doesn't expose the LE-public
  tuple form on all builds. Implements ATT MTU exchange, primary service
  discovery (Read By Group Type), characteristic discovery (Read By
  Type), descriptor discovery (Find Information), Write Cmd / Write Req,
  CCCD subscribe, and a background reader thread that hops notification
  callbacks onto the asyncio loop via call_soon_threadsafe so existing
  asyncio.Event-based runtime state stays consistent.

- timiniprint/transport/bluetooth/adapters/linux_l2cap_adapter.py:
  _LinuxL2capLeSocket subclasses _BleakSocket and overrides
  _connect_async to construct a LinuxLeL2capClient instead of a
  BleakClient. All existing transport-session plumbing (split bulk
  writes, runtime controllers, endpoint resolution, chunked GATT writes,
  notification dispatch) is reused unchanged.

- timiniprint/transport/bluetooth/adapters/__init__.py: backend selection
  respects TIMINI_BLE_BACKEND env var (auto / bleak / l2cap). On Linux,
  auto prefers L2CAP when socket.AF_BLUETOOTH is available; falls back
  to Bleak otherwise and on non-Linux.

- timiniprint/printing/runtime/v5x.py: tolerate missing 0xAA start-ready
  notification. On some V5X firmwares (observed: MXW01 v1.9.3.1.2) the
  0xAA notification arrives as an end-of-print status rather than a
  between-commands start-ready signal; treating the wait as advisory
  (log and continue) unblocks printing on those firmwares without
  changing semantics for others.

- tests/test_linux_l2cap_client.py: unit tests for the pure helpers
  (BD address packing, sockaddr_l2 layout, UUID conversions, property
  bit decoding, characteristic collection lookups, value-handle
  resolution). 16 new tests; all 346 tests pass.
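For readers unfamiliar with raw ATT, the sketch below encodes a few of the PDUs named above (MTU exchange, Read By Group Type for primary services, Write Cmd / Write Req, CCCD subscribe) and parses a Handle Value Notification. Opcodes and UUIDs are the standard ATT/GATT values; the function names are illustrative and are not the ones in linux_l2cap_client.py.

```python
import struct

# Standard ATT opcodes.
ATT_OP_MTU_REQ = 0x02
ATT_OP_READ_BY_GROUP_REQ = 0x10
ATT_OP_WRITE_REQ = 0x12
ATT_OP_WRITE_CMD = 0x52
ATT_OP_NOTIFY = 0x1B

UUID_PRIMARY_SERVICE = 0x2800  # GATT Primary Service declaration


def mtu_request(client_rx_mtu: int) -> bytes:
    """Exchange MTU Request: opcode + client receive MTU (little-endian)."""
    return struct.pack("<BH", ATT_OP_MTU_REQ, client_rx_mtu)


def read_by_group_type_request(start: int = 0x0001, end: int = 0xFFFF) -> bytes:
    """Primary-service discovery over a handle range."""
    return struct.pack("<BHHH", ATT_OP_READ_BY_GROUP_REQ, start, end,
                       UUID_PRIMARY_SERVICE)


def write_pdu(handle: int, value: bytes, with_response: bool) -> bytes:
    """Write Request (acknowledged) or Write Command (unacknowledged)."""
    opcode = ATT_OP_WRITE_REQ if with_response else ATT_OP_WRITE_CMD
    return struct.pack("<BH", opcode, handle) + value


def enable_notifications(cccd_handle: int) -> bytes:
    """Subscribe by writing 0x0001 to the characteristic's CCCD (0x2902)."""
    return write_pdu(cccd_handle, b"\x01\x00", with_response=True)


def parse_notification(pdu: bytes):
    """Return (handle, value) for a Handle Value Notification, else None."""
    if len(pdu) < 3 or pdu[0] != ATT_OP_NOTIFY:
        return None
    (handle,) = struct.unpack_from("<H", pdu, 1)
    return handle, pdu[3:]
```

In the design described above, a background reader thread would call something like parse_notification on each received PDU and hand the result to the asyncio loop with call_soon_threadsafe.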

Validation: text and image print jobs complete end-to-end on a real
MXW01 (Ubuntu, BlueZ 5.83, kernel 6.17) when invoked through
timiniprint_command_line.py with TIMINI_BLE_BACKEND=l2cap or default
auto. The Bleak path is unchanged on non-Linux platforms.

See issue Dejniel#23 for the btmon HCI capture confirming the root cause.

V5X-family firmwares (observed: MXW01 v1.9.3.1.2) buffer the print job
and render asynchronously from that buffer. The runtime CLI tears the
L2CAP socket down immediately after the last write returns, which on
BLE aborts any data still in the printer's buffer. The stop-gap shim
that printed successfully earlier already had a multi-second drain
window before close; the L2CAP backend needs the same.

Without this, small jobs that fit in the kernel's BLE send queue
end-to-end (e.g. the original "L2CAP backend works" smoke test, payload
~1100 bytes) finished in time, but jobs that took longer to drain
(e.g. the 5328-byte EMX_040256.jpg) had their tail discarded — the
verbose trace showed every expected notification and no errors, yet
nothing printed.

A 2-second hold is more than enough for the small jobs we tested.
Long-running jobs that exceed the printer's internal buffer are a
separate issue (no flow-control notifications arrive on MXW01, so
sustained 18 KB/s saturates the buffer); that fix lands in a
follow-up commit.

Two follow-up fixes on top of the disconnect-grace commit:

- TIMINI_BLE_BULK_DELAY_MS overrides the V5X profile's
  bulk_write_delay_ms at runtime. The V5X profile defaults to 10 ms
  per 180-byte chunk (≈18 KB/s) which works for clones that emit the
  flow-control pause notifications the profile defines. Some V5X
  firmwares (observed: MXW01 v1.9.3.1.2) never send those patterns,
  so bytes past the on-device buffer get dropped on long jobs.
  Per-host override lets each user tune the delay to their printer's
  render rate without changing the upstream profile defaults.

- LinuxLeL2capClient now waits for the printer to go quiet before
  closing the L2CAP socket (default: close once it has been silent
  for 1.5 s, hard cap 30 s). Tunable via TIMINI_BLE_DISCONNECT_IDLE_S
  and TIMINI_BLE_DISCONNECT_MAX_S. Replaces the earlier fixed 2 s
  grace, which was enough for small jobs but not for ones where
  rendering kept going for several seconds after the last byte.
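The quiescence rule in the second item can be sketched as a polling loop with an idle threshold and a hard cap. This is a simplified stand-in, not the code in LinuxLeL2capClient.disconnect: wait_until_quiet and its parameters are hypothetical, and last_activity is assumed to be a callable returning the monotonic timestamp of the most recent PDU received from the printer.

```python
import time


def wait_until_quiet(last_activity, idle_s=1.5, max_s=30.0,
                     clock=time.monotonic, sleep=time.sleep, poll_s=0.05):
    """Block until the peer has been silent for idle_s seconds, or until
    max_s seconds have passed since the wait began.

    Returns "idle" when the silence threshold was reached and "cap" when
    the hard cap expired first.
    """
    started = clock()
    while True:
        now = clock()
        if now - last_activity() >= idle_s:
            return "idle"
        if now - started >= max_s:
            return "cap"
        sleep(poll_s)
```

The caller would close the L2CAP socket once this returns. Injecting clock and sleep keeps the timing logic testable without real delays.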

Includes unit tests for the env-var parsing.
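A minimal sketch of tolerant env-var parsing for knobs such as TIMINI_BLE_BULK_DELAY_MS, together with the arithmetic behind the 18 KB/s figure. The helper names are hypothetical, and falling back to the default on empty, malformed, non-finite, or negative values is an assumption about reasonable behaviour, not a description of the PR's parser.

```python
import math
import os


def env_float(name, default, env=None, minimum=0.0):
    """Read a float from the environment, keeping the default when the
    variable is unset, empty, malformed, non-finite, or below minimum."""
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    if not raw:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    if not math.isfinite(value) or value < minimum:
        return default
    return value


def bulk_throughput_bytes_per_s(chunk_bytes, delay_ms):
    """Upper bound on the send rate when one chunk goes out per delay interval."""
    return chunk_bytes * 1000.0 / delay_ms


if __name__ == "__main__":
    delay_ms = env_float("TIMINI_BLE_BULK_DELAY_MS", 10.0)
    # 180-byte chunks every 10 ms give 18000 bytes/s, the ~18 KB/s cited above.
    print(bulk_throughput_bytes_per_s(180, delay_ms))
```

Raising the delay lowers the rate proportionally, which is the lever a user has when the printer sends no flow-control notifications.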

Note: empirical testing on MXW01 shows the firmware has a per-job
row-count ceiling (~200–300 rows of 384-px-wide raster) regardless
of how we pace the transport. Long Lorem ipsum prints truncate
cleanly mid-row even when 100% of bytes are received and the
finalize is acked. Per-job splitting is host-side rendering work
that lives outside this transport change.

Common local-development clutter that doesn't belong in the repo. Lines
sorted alphabetically alongside the existing entries.

Two related fixes aimed at issue Dejniel#25 (V5X per-job row ceiling on
MXW01 v1.9.3.1.2 and likely similar firmwares).

1. **Pagination in PrintJobBuilder.**
   New env var `TIMINI_PRINT_MAX_JOB_ROWS` (default 0 = no split). When
   set, any rendered page raster taller than the limit is split
   vertically into sub-rasters and built as separate V5X sessions
   (`A7 / A2 / A9 / bulk / AD`). Each sub-session becomes a
   `ProtocolStep.SEND` carrying the full session bytes so the existing
   step-driven send loop paces them across one connection. Uses
   `RasterBuffer.slice_rows` which already supports the split.

2. **Tolerant runtime ACK waits in `printing/runtime/v5x.py`.**
   `_wait_for_start_ready` already logged-and-continued on missing
   `0xAA`; `after_split_command` did not. With pagination on, the
   second sub-job's `0xA7` ACK is preemptively consumed by a
   between-segments `0xA6` idle re-identification — so seg2's ACK
   never arrives and the wait raised an empty `asyncio.TimeoutError`
   that bubbled out as "BLE write failed:" with no detail. Now caught
   and logged the same way; the handshake state is still cleaned up
   so the session continues correctly.
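The row-splitting step in item 1 amounts to cutting a page of N rows into consecutive ranges of at most the configured limit. The sketch below computes only those ranges; split_row_ranges is a hypothetical helper, and how each range is passed to `RasterBuffer.slice_rows` is left out because that signature is not shown here.

```python
def split_row_ranges(total_rows: int, max_rows: int) -> list[tuple[int, int]]:
    """Return half-open (start, stop) row ranges covering total_rows.

    max_rows <= 0 means "no split", matching the documented default of 0
    for TIMINI_PRINT_MAX_JOB_ROWS.
    """
    if max_rows <= 0 or total_rows <= max_rows:
        return [(0, total_rows)]
    return [
        (start, min(start + max_rows, total_rows))
        for start in range(0, total_rows, max_rows)
    ]


if __name__ == "__main__":
    # The ~436-row tall image from the validation table, with a 200-row limit.
    print(split_row_ranges(436, 200))
```

Each range would then become its own V5X session, so the last range is usually shorter than the limit.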

Tests:
- `tests/test_builder_pagination.py` — 10 new tests covering the
  env-var parsing and `_split_raster_for_max_rows`.
- Updated `tests/test_bleak_transport_session.py
  test_v5x_timeout_clears_pending_handshake_state` to reflect the
  new "log + continue" behaviour rather than the old "raise". Same
  test still asserts the handshake state gets cleared.
- All 361 tests pass.
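The "log + continue" behaviour that the updated test asserts can be approximated as follows. This is a sketch of the pattern only: wait_advisory is a hypothetical name, and a single asyncio.Event stands in for whatever handshake state the V5X runtime actually tracks.

```python
import asyncio
import logging

log = logging.getLogger("v5x_sketch")


async def wait_advisory(event: asyncio.Event, timeout_s: float, label: str) -> bool:
    """Wait for a notification, but treat a timeout as advisory.

    Returns True when the event fired and False on timeout. The pending
    state is cleared in both cases so the next command starts clean.
    """
    try:
        await asyncio.wait_for(event.wait(), timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        log.warning("%s not received within %.2fs; continuing", label, timeout_s)
        return False
    finally:
        event.clear()
```

Returning a boolean instead of raising lets the caller decide whether a missing reply matters for a given firmware.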

Validation: with `TIMINI_PRINT_MAX_JOB_ROWS=200` and
`TIMINI_BLE_BACKEND=l2cap` the long Lorem ipsum now drives the runtime
through both sub-jobs without the empty `TimeoutError` failure mode
that previously aborted the second segment. The printer reports clean
status (`0xAA` payload first byte `0x00` rather than `0xfc`).

Residual firmware constraint: the physical print is still truncated
at the same row count regardless of pagination, suggesting the MXW01
v1.9.3.1.2 has a *per-power-cycle* row budget rather than a per-job
one. That is hardware behaviour we cannot work around from the host;
documented in Dejniel#25 for further investigation.

@Dejniel
Owner

Dejniel commented May 13, 2026

@d-roman-halliday thanks for the PR and for the very detailed hardware work. The btmon trace and the L2CAP proof were especially useful. They confirmed that the #23 root cause is below TiMini-Print's classic/BLE fallback layer: for these devices we already choose BLE, but BlueZ Device1.Connect() can still choose BR/EDR internally when the advertising flags claim dual-mode support.

I do not want to merge this PR as-is, but not because the work is bad. The issue is that it combines two different layers of changes:

  • a Linux/BlueZ LE bearer workaround
  • several V5X/MXW01 runtime and firmware-behaviour workarounds

The first one is the actual #23 fix. I pushed a smaller version of that to master: Linux BLE now tries an isolated direct ATT/L2CAP path first and falls back to the existing Bleak path if that fails. Classic/SPP and non-Linux BLE stay unchanged. I kept this intentionally narrow because this is a transport/backend problem, not a V5X protocol change.

The V5X changes in this PR need separate handling. In the source app, the V5X flow treats AA as the ready gate before the A2 / A9 stage, A7 as the serial/info reply, and A9 as start-print status. The current repo follows that model. Your MXW01 1.9.3.1.2 observations are valuable, but turning those waits into globally advisory V5X behaviour would change the whole family based on one firmware variant. I would rather model that explicitly if we need it: as a firmware/profile/runtime capability, or as a separate documented workaround, not as a broad V5X relaxation.

Same for the other parts:

So my plan is:

I really appreciate the analysis and the hardware validation here. I just want to keep the repo architecture explicit instead of turning V5X into a set of timing/workaround hooks.

Dejniel pushed a commit that referenced this pull request May 13, 2026
@Dejniel
Owner

Dejniel commented May 13, 2026

@d-roman-halliday I added one more V5X change on top of the Linux BLE fix.

The V5X print flow now waits for matched BLE notifications instead of depending on timing:

  • A7 command ACK
  • AA start-ready
  • A9 start-print status
  • optional B1 connect info

So if your firmware was replying a little differently or slower, this should avoid sending the next command too early. This is still kept in the V5X runtime controller, not in the transport layer.
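One way to picture "waiting for matched notifications instead of timing" is a small gate that resolves a future when the expected reply opcode arrives. This is an illustrative sketch, not the code on master: NotificationGate and send_and_wait are hypothetical names, and the caller is assumed to have already extracted the opcode from the notification payload.

```python
import asyncio


class NotificationGate:
    """Resolve per-opcode futures as notifications arrive."""

    def __init__(self) -> None:
        self._waiters: dict[int, asyncio.Future] = {}

    def expect(self, opcode: int) -> asyncio.Future:
        """Register interest in one reply; call this before sending."""
        future = asyncio.get_running_loop().create_future()
        self._waiters[opcode] = future
        return future

    def feed(self, opcode: int, payload: bytes) -> None:
        """Deliver a notification; unmatched opcodes are ignored."""
        future = self._waiters.pop(opcode, None)
        if future is not None and not future.done():
            future.set_result(payload)


async def send_and_wait(gate, send, command: bytes, reply_opcode: int,
                        timeout_s: float) -> bytes:
    """Send a command and block until its matching reply arrives."""
    future = gate.expect(reply_opcode)  # register first to avoid a race
    await send(command)
    return await asyncio.wait_for(future, timeout_s)
```

Registering the waiter before the write matters because a fast printer can reply before the send coroutine returns.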

Please test current master again on your MXW01 and let me know if this changes the behavior.

@Dejniel
Owner

Dejniel commented May 14, 2026

I pushed b8d38a5 to master with another V5X BLE transport pass.

It keeps the useful chunking/pacing direction from this PR, but in the existing architecture: explicit BLE bulk-write profile, V5X bulk pacing at 30ms, MTU - 8-style payload margin, and conservative 20-byte fallback until the real MTU/characteristic limit is known.
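A rough sketch of the chunk-size rule described here, under explicit assumptions: the margin of 8 and the fallback of 20 come from this comment, while the function names and the choice to never go below the fallback are guesses, not the logic in b8d38a5.

```python
def bulk_chunk_size(mtu, margin=8, fallback=20):
    """Pick a write payload size from the negotiated MTU.

    Uses the conservative fallback while the MTU is unknown (None), and
    never returns less than the fallback (an assumption of this sketch).
    """
    if mtu is None:
        return fallback
    return max(fallback, mtu - margin)


def split_chunks(data: bytes, size: int) -> list[bytes]:
    """Split a payload into consecutive chunks of at most size bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]
```

For example, an MTU of 23 (the ATT default) still yields 20-byte chunks, since 20 bytes is the largest value that fits in a default-MTU write.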

This should cover the transport timing/chunking side without merging the PR as-is.

@Dejniel
Owner

Dejniel commented May 20, 2026

Closing this as superseded by the current master Linux ATT path and the smaller wiring fix cherry-picked from #26.

The remaining V5X/MXW01 behavior is being kept separate from the Linux/BlueZ connect fix.

@Dejniel Dejniel closed this May 20, 2026

Development

Successfully merging this pull request may close these issues.

Linux/BlueZ: connect hangs on V5X printers — bluetoothd picks BR/EDR over LE due to dual-mode adv flag
