@martinling martinling commented Oct 20, 2025

This PR attempts to implement the buffer scheme described in #1363 (comment).

The M0 SGPIO code continues to use the existing 32KB sample buffer, which has special placement.

An additional 32KB USB buffer is placed in ram_local1, above the .text section.

The GPDMA controller is used to transfer samples between the sample buffer and USB buffer.

The initial commits in this PR set up the new buffer scheme, whilst using memcpy running synchronously on the M4 core as a placeholder for the DMA operations. This works correctly up to around 8Msps.

The final commit switches to the DMA implementation.
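
As a rough mental model of the scheme described above (not the firmware itself, which is C), the sketch below pushes TX data through two 32KB ring buffers, copying between them in fixed-size chunks the way the memcpy placeholder would. The `RingBuffer` class, the function names, and the 8KB chunk size are assumptions made for illustration only.

```python
BUFFER_SIZE = 32 * 1024  # both buffers are 32KB in the PR description
CHUNK_SIZE = 8 * 1024    # assumed transfer granularity, illustration only


class RingBuffer:
    """Byte ring buffer that tracks total bytes written and read."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.size = size
        self.written = 0  # total bytes ever written
        self.read = 0     # total bytes ever read

    def used(self):
        return self.written - self.read

    def free(self):
        return self.size - self.used()

    def write(self, chunk):
        assert len(chunk) <= self.free(), "overrun"
        for b in chunk:
            self.data[self.written % self.size] = b
            self.written += 1

    def read_bytes(self, n):
        assert n <= self.used(), "underrun"
        out = bytearray()
        for _ in range(n):
            out.append(self.data[self.read % self.size])
            self.read += 1
        return bytes(out)


def transfer_chunk(src, dst, chunk_size=CHUNK_SIZE):
    """Move one chunk from src to dst if src has the data and dst has room.

    Stands in for the synchronous memcpy placeholder (or one DMA transfer).
    Returns True if a chunk was moved.
    """
    if src.used() >= chunk_size and dst.free() >= chunk_size:
        dst.write(src.read_bytes(chunk_size))
        return True
    return False


def simulate_tx(payload):
    """Push payload through USB buffer -> sample buffer -> 'radio'."""
    assert len(payload) % CHUNK_SIZE == 0, "model only handles whole chunks"
    usb_buffer = RingBuffer(BUFFER_SIZE)
    sample_buffer = RingBuffer(BUFFER_SIZE)
    transmitted = bytearray()
    offset = 0
    while len(transmitted) < len(payload):
        # Host fills the USB buffer one chunk at a time.
        if offset < len(payload) and usb_buffer.free() >= CHUNK_SIZE:
            usb_buffer.write(payload[offset:offset + CHUNK_SIZE])
            offset += CHUNK_SIZE
        # Copy step between the two buffers.
        transfer_chunk(usb_buffer, sample_buffer)
        # The consumer side drains the sample buffer.
        if sample_buffer.used() >= CHUNK_SIZE:
            transmitted += sample_buffer.read_bytes(CHUNK_SIZE)
    return bytes(transmitted)
```

In this model the copy step runs in lockstep with the producer and consumer. The real design differs in that the DMA transfer proceeds without occupying the M4 core.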

@martinling martinling force-pushed the extra-buffer branch 5 times, most recently from d863307 to 8c54161 Compare October 21, 2025 14:22
@martinling martinling force-pushed the extra-buffer branch 2 times, most recently from be432d4 to 912d993 Compare November 11, 2025 17:07
@martinling martinling marked this pull request as ready for review November 11, 2025 17:08
@mossmann mossmann self-requested a review November 11, 2025 17:13
@martinling martinling linked an issue Nov 11, 2025 that may be closed by this pull request
@mossmann
Member

On my Linux test host, this improves maximum shortfall-free RX sample rate from 19.2 Msps to 21.8 Msps and improves TX from 20.1 Msps to 21.8 Msps.


@mossmann mossmann left a comment

I haven't spotted the bug, but TX with hackrf_transfer -n is coming up short: transmissions are truncated by a few milliseconds, which looks like about 9000 samples.

@mossmann
Member

Transmission duration does not change linearly with the number of samples.

Testing with hackrf_transfer -a 1 -x 31 -c 127 -f 2500000000 -n 32768 -s 2000000, varying the -n value:

32768 samples: 8.88 ms (expected 16.38 ms)
32769 samples: 12.29 ms (expected 16.38 ms)
33000 samples: 12.41 ms (expected 16.50 ms)
40000 samples: 12.93 ms (expected 20.00 ms)
40960 samples: 13.03 ms (expected 20.48 ms)
40961 samples: 16.85 ms (expected 20.48 ms)
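
For reference, the expected figures above are just the sample count divided by the sample rate. The short sketch below reproduces them and converts each measured shortfall back into samples, which makes it easy to see that the shortfall is not constant from row to row. The function names are illustrative; the numbers are copied from the list above.

```python
SAMPLE_RATE = 2_000_000  # from -s 2000000 in the command above

# (sample count, measured duration in ms) as reported above
measurements = [
    (32768, 8.88),
    (32769, 12.29),
    (33000, 12.41),
    (40000, 12.93),
    (40960, 13.03),
    (40961, 16.85),
]


def expected_ms(num_samples, sample_rate=SAMPLE_RATE):
    """Ideal transmission duration in milliseconds."""
    return num_samples / sample_rate * 1000.0


def shortfall_samples(num_samples, measured_ms, sample_rate=SAMPLE_RATE):
    """Approximate number of samples missing from the transmission."""
    missing_ms = expected_ms(num_samples, sample_rate) - measured_ms
    return round(missing_ms / 1000.0 * sample_rate)


for n, measured in measurements:
    print(f"{n} samples: expected {expected_ms(n):.2f} ms, "
          f"measured {measured:.2f} ms, "
          f"about {shortfall_samples(n, measured)} samples short")
```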

@martinling
Member Author

I believe this is due to the current behaviour of hackrf_enable_tx_flush.

When that function is used, the host will send 32KB of zeroes after the end of the provided TX data, before making the control request to switch transceiver mode back to idle.

The theory behind that approach was that, since there's only 32KB of space to store samples, once you've sent that many zeroes and the device has accepted them, the earlier samples you care about must have been transmitted.

With this PR, we double the buffer space, so the host would now need to send 64KB of zeroes to achieve the same. If you change DEVICE_BUFFER_SIZE in libhackrf/src/hackrf.c to 65536, I think that will stop the behaviour you're seeing.
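
The flush argument can be written down as a worst-case bound. The sketch below assumes the device buffering behaves as a single FIFO and that each sample occupies 2 bytes (one 8-bit I and one 8-bit Q value); the function name is hypothetical and only the 32KB and 64KB figures come from the discussion above.

```python
BYTES_PER_SAMPLE = 2  # assumed: one 8-bit I and one 8-bit Q value per sample


def max_unsent_data_bytes(device_capacity, flush_bytes):
    """Worst-case bytes of real TX data still buffered after the flush.

    If any real data is still buffered once the device has accepted
    `flush_bytes` of zeroes, all of those zeroes are queued behind it.
    The buffer holds at most `device_capacity` bytes, so at most
    device_capacity - flush_bytes bytes of real data can remain.
    """
    return max(0, device_capacity - flush_bytes)


OLD_CAPACITY = 32 * 1024  # sample buffer only
NEW_CAPACITY = 64 * 1024  # sample buffer plus USB buffer
FLUSH_SIZE = 32 * 1024    # the currently hardcoded DEVICE_BUFFER_SIZE

for capacity in (OLD_CAPACITY, NEW_CAPACITY):
    unsent = max_unsent_data_bytes(capacity, FLUSH_SIZE)
    print(f"capacity {capacity} bytes: up to {unsent} bytes "
          f"({unsent // BYTES_PER_SAMPLE} samples) may be cut off")
```

This is only an upper bound. How much is actually lost in a given run depends on how full the buffers are when the mode change request arrives.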

Unfortunately, we hardcoded that figure rather than adding a vendor request to query it from the device. My recollection is that we discussed that decision verbally at the time, but since I'd already made some attempts to increase the buffer size without success, we thought that 32KB figure wasn't ever going to change and it wasn't worth adding this query.

Our options are:

  • Bump the DEVICE_BUFFER_SIZE in libhackrf. This is no good, because users will get short TX if running older software with newer firmware, just as you're seeing now.

  • Add a vendor request to query the buffer size, defaulting to 32KB if the request is not supported. Same problem, because older software won't make the request. We would have needed to add this at the same time we added hackrf_enable_tx_flush.

  • Make the firmware emulate the previous behaviour, as follows: Once the host has sent the mode change request to leave TX mode, stop making transfers into the USB buffer, but move all existing data from the USB buffer into the sample buffer before actually shutting down TX.

I'll implement the last of these.
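
A toy model of that third option, to show why it restores the old guarantee. This is a sketch, not the firmware: the `TxPipeline` class, the 8KB chunk size, and the assumption that whatever is left in the sample buffer is discarded at shutdown are all illustrative.

```python
from collections import deque

CHUNK = 8 * 1024          # assumed transfer size, illustration only
SAMPLE_BUFFER_CHUNKS = 4  # 32KB sample buffer / 8KB chunks
USB_BUFFER_CHUNKS = 4     # 32KB USB buffer / 8KB chunks


class TxPipeline:
    """USB buffer feeds the sample buffer, which feeds the radio."""

    def __init__(self):
        self.usb_buffer = deque()
        self.sample_buffer = deque()
        self.transmitted = []
        self.accepting_usb = True

    def usb_receive(self, chunk):
        """Host -> USB buffer. Returns False if the chunk was not accepted."""
        if not self.accepting_usb or len(self.usb_buffer) >= USB_BUFFER_CHUNKS:
            return False
        self.usb_buffer.append(chunk)
        return True

    def transmit_one(self):
        """Sample buffer -> radio."""
        if self.sample_buffer:
            self.transmitted.append(self.sample_buffer.popleft())

    def move_one(self):
        """USB buffer -> sample buffer. Returns True if a chunk moved."""
        if self.usb_buffer and len(self.sample_buffer) < SAMPLE_BUFFER_CHUNKS:
            self.sample_buffer.append(self.usb_buffer.popleft())
            return True
        return False

    def stop_tx(self):
        """Handle the request to leave TX mode."""
        self.accepting_usb = False   # no new transfers into the USB buffer
        while self.usb_buffer:       # drain what the host already sent
            if not self.move_one():
                self.transmit_one()  # wait for the radio to free space
        # Assumed: whatever remains in the sample buffer is discarded.
        dropped = list(self.sample_buffer)
        self.sample_buffer.clear()
        return dropped


def send(pipeline, chunk):
    """Host-side helper: retry until the device accepts the chunk."""
    while not pipeline.usb_receive(chunk):
        if not pipeline.move_one():
            pipeline.transmit_one()


pipeline = TxPipeline()
data = [bytes([i + 1]) * CHUNK for i in range(4)]  # 32KB of real samples
zeroes = [bytes(CHUNK)] * 4                        # 32KB flush from the host
for chunk in data + zeroes:
    send(pipeline, chunk)
dropped = pipeline.stop_tx()
```

In this run both buffers are full when `stop_tx` is called, and only the host's zero padding ends up in `dropped`; all four data chunks reach `transmitted`.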

@martinling
Member Author

Ensuring that all data in the USB buffer is moved to the sample buffer before stopping TX has fixed the non-linearity you were seeing, but I'm now seeing timing coming up short by what looks like a consistent 4.096ms, i.e. 8KB of samples or one DMA transfer. I'll see about tracking down the remaining bug.

@martinling martinling force-pushed the extra-buffer branch 3 times, most recently from ffc5ad4 to 33bf9ad Compare November 17, 2025 16:56
@martinling
Member Author

Looking at ramped signals on a scope shows that the transmission includes the first 32KB, which is preloaded into the sample buffer, but is missing the next 8KB, which is the first block to be transferred by DMA. All subsequent samples are transmitted correctly. This is true from the memcpy version onwards, regardless of whether DMA_TRANSFER_SIZE is 16KB or 8KB.
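
The same check can be sketched in software. The example below models the ramp as a counter per byte position and locates the jump left by a dropped block; the helper names and the 16-bit counter period are assumptions, not the actual test signal. Note that the ramp period must not divide the missing block length: with a 256-step ramp, a dropped 8KB block would leave no visible discontinuity.

```python
PRELOAD = 32 * 1024  # first 32KB, preloaded into the sample buffer
MISSING = 8 * 1024   # the block reported missing
PERIOD = 1 << 16     # assumed ramp period, chosen so it does not divide MISSING


def make_ramp(length, period=PERIOD):
    """Counter that increases by 1 per position, wrapping at `period`."""
    return [i % period for i in range(length)]


def find_discontinuities(captured, period=PERIOD):
    """Indices where the value is not the previous value plus one."""
    return [
        i for i in range(1, len(captured))
        if (captured[i] - captured[i - 1]) % period != 1
    ]


def gap_length(captured, index, period=PERIOD):
    """Number of positions skipped immediately before `index`."""
    return (captured[index] - captured[index - 1]) % period - 1


ramp = make_ramp(64 * 1024)
# Simulate the reported behaviour: drop the 8KB that follows the preload.
captured = ramp[:PRELOAD] + ramp[PRELOAD + MISSING:]

for i in find_discontinuities(captured):
    print(f"discontinuity at offset {i}: {gap_length(captured, i)} missing")
```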

@martinling
Member Author

Fixed that - still some issues with small sample counts.

Development

Successfully merging this pull request may close these issues.

hackrf_transfer -c produces possible buffer underruns

2 participants