@@ -6,7 +6,7 @@ DMA transfers.
6
6
The DMA peripheral is used to perform memory transfers in parallel to the work
7
7
of the processor (the execution of the main program). A DMA transfer is more or
8
8
less equivalent to spawning a thread (see [`thread::spawn`]) to do a `memcpy`.
9
- We'll use the fork-join model to illustrate the requirements of a memory safe
9
+ We'll use the fork-join model to illustrate the requirements of a memory-safe
10
10
API.
11
11
12
12
[`thread::spawn`]: https://doc.rust-lang.org/std/thread/fn.spawn.html
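
The fork-join analogy can be sketched on a hosted target as follows. This is only an illustration under stated assumptions: `fork_join_copy` is a hypothetical name, the spawned thread stands in for the DMA engine, and `join` stands in for waiting on the transfer.

```rust
use std::thread;

/// Copies `src` on a separate thread and joins before returning the result.
/// The spawned thread plays the role of the DMA engine (hypothetical model);
/// `join` plays the role of waiting for the transfer to complete.
pub fn fork_join_copy(src: Vec<u8>) -> Vec<u8> {
    let mut dst = vec![0u8; src.len()];

    // Fork: the copy now runs in parallel to the caller.
    let handle = thread::spawn(move || {
        dst.copy_from_slice(&src); // the "memcpy"
        dst
    });

    // The caller is free to do unrelated work here.

    // Join: only after this point is it safe to look at the destination.
    handle.join().expect("copy thread panicked")
}
```

Note that the closure takes ownership of both buffers, so the caller cannot touch them while the copy is in flight. The rest of the text is about getting the same guarantee for DMA.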
@@ -28,11 +28,11 @@ Assume that the `Dma1Channel1` is statically configured to work with serial port
28
28
{{#include ../ci/dma/src/lib.rs:82:83}}
29
29
```
30
30
31
- Let's say we want to extend `Serial1` API to (a) asynchronously send out a
31
+ Let's say we want to extend the `Serial1` API to (a) asynchronously send out a
32
32
buffer and (b) asynchronously fill a buffer.
33
33
34
- We'll start with a memory unsafe API and we'll iterate on it until it's
35
- completely memory safe. On each step we'll show you how the API can be broken to
34
+ We'll start with a memory-unsafe API and we'll iterate on it until it's
35
+ completely memory-safe. At each step we'll show you how the API can be broken to
36
36
make you aware of the issues that need to be addressed when dealing with
37
37
asynchronous memory operations.
38
38
@@ -47,7 +47,7 @@ keep things simple let's ignore all error handling.
47
47
{{#include ../ci/dma/examples/one.rs:7:47}}
48
48
```
49
49
50
- > **NOTE:** `Transfer` could expose a futures or generator based API instead of
50
+ > **NOTE:** `Transfer` could expose a futures- or generator-based API instead of
51
51
> the API shown above. That's an API design question that has little bearing on
52
52
> the memory safety of the overall API so we won't delve into it in this text.
53
53
@@ -95,11 +95,13 @@ variables `x` and `y` changing their value at random times. The DMA transfer
95
95
could also overwrite the state (e.g. link register) pushed onto the stack by the
96
96
prologue of function `bar`.
97
97
98
- Note that if we had not use `mem::forget`, but `mem::drop`, it would have been
99
- possible to make `Transfer`'s destructor stop the DMA transfer and then the
100
- program would have been safe. But one can *not* rely on destructors running to
101
- enforce memory safety because `mem::forget` and memory leaks (see RC cycles) are
102
- safe in Rust.
98
+ Note that if we had used `mem::drop` instead of `mem::forget`, it would have
99
+ been possible to make `Transfer`'s destructor stop the DMA transfer and then the
100
+ program would have been safe. But one *cannot* rely on destructors running to
101
+ enforce memory safety because `mem::forget` and memory leaks (see `Rc` cycles)
102
+ are safe in Rust. (Refer to [`mem::forget` safety].)
103
+
104
+ [`mem::forget` safety]: https://doc.rust-lang.org/std/mem/fn.forget.html#safety
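
The difference between `mem::drop` and `mem::forget` can be observed with a small hosted sketch. `Guard` is a hypothetical stand-in for `Transfer`, and the `stopped` flag is an assumption that models "the destructor stopped the DMA transfer".

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical stand-in for `Transfer`: its destructor records that it ran,
/// which models "the destructor stopped the DMA transfer".
pub struct Guard<'a> {
    stopped: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.stopped.set(true);
    }
}

/// `mem::drop` runs the destructor, so the flag ends up set.
pub fn destructor_ran_after_drop() -> bool {
    let stopped = Cell::new(false);
    mem::drop(Guard { stopped: &stopped });
    stopped.get()
}

/// `mem::forget` is safe code and skips the destructor, so the flag stays clear.
pub fn destructor_ran_after_forget() -> bool {
    let stopped = Cell::new(false);
    mem::forget(Guard { stopped: &stopped });
    stopped.get()
}
```

Since the second function contains no `unsafe`, an API whose soundness depends on the destructor running can be broken from safe code.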
103
105
104
106
We can fix this particular problem by changing the lifetime of the buffer from
105
107
`'a` to `'static` in both APIs.
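
As a rough sketch of what the `'static` bound buys us, consider the following hosted mock. The `Transfer` and `read_exact` shown here are simplified, hypothetical versions (no hardware access), and `leaked_buffer` uses `Box::leak` purely as a convenient way to obtain a `&'static mut [u8]` on a hosted target; on a microcontroller the buffer would more likely come from a `static` variable.

```rust
/// Hypothetical, simplified transfer handle that holds a `'static` borrow.
pub struct Transfer {
    buffer: &'static mut [u8],
}

/// Accepts only buffers that live for the rest of the program, so a
/// stack-allocated array can no longer be passed in.
pub fn read_exact(buffer: &'static mut [u8]) -> Transfer {
    // A real implementation would configure and start the DMA here.
    Transfer { buffer }
}

impl Transfer {
    /// Hands the buffer back. A real implementation would first wait for
    /// the transfer to complete.
    pub fn wait(self) -> &'static mut [u8] {
        self.buffer
    }
}

/// Assumption for this sketch: leak a heap allocation to get a `'static` buffer.
pub fn leaked_buffer(len: usize) -> &'static mut [u8] {
    Box::leak(vec![0u8; len].into_boxed_slice())
}
```

With this signature, `mem::forget`-ing the `Transfer` can no longer leave the DMA writing into a dead stack frame, because the buffer never lived on the stack in the first place.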
@@ -164,7 +166,7 @@ result in a data race: both the processor and the DMA would end up modifying
164
166
`buf` at the same time. Similarly, the compiler can move the zeroing operation to
165
167
after `read_exact`, which would also result in a data race.
166
168
167
- To prevent these problematic reorderings we can use a [`compiler_fence`]
169
+ To prevent these problematic reorderings we can use a [`compiler_fence`].
168
170
169
171
[`compiler_fence`]: https://doc.rust-lang.org/core/sync/atomic/fn.compiler_fence.html
170
172
@@ -188,29 +190,29 @@ orderings in the comments.
188
190
{{#include ../ci/dma/examples/four.rs:68:87}}
189
191
```
190
192
191
- The zeroing operation can *not* be moved *after* `read_exact` due to the
192
- `Release` fence. Similarly, the `reverse` operation can *not* be moved *before*
193
+ The zeroing operation *cannot* be moved *after* `read_exact` due to the
194
+ `Release` fence. Similarly, the `reverse` operation *cannot* be moved *before*
193
195
`wait` due to the `Acquire` fence. The memory operations *between* both fences
194
196
*can* be freely reordered across the fences but none of those operations
195
197
involves `buf` so such reorderings do *not* result in undefined behavior.
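
To make the placement of the two fences concrete, here is a minimal hosted sketch. The fences are written inline for illustration, and `mock_start_read` and `mock_wait` are hypothetical stand-ins for starting the DMA and waiting for it; the mock fills the buffer synchronously, so this only shows where the fences go, not a real asynchronous transfer.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for starting a DMA read: fills `buf` with 1, 2, 3, ...
fn mock_start_read(buf: &mut [u8]) {
    for (i, byte) in buf.iter_mut().enumerate() {
        *byte = (i as u8).wrapping_add(1);
    }
}

/// Hypothetical stand-in for busy-waiting until the transfer is done.
fn mock_wait() {}

pub fn zero_read_reverse(buf: &mut [u8]) {
    // Zero the buffer before handing it to the "DMA".
    buf.iter_mut().for_each(|byte| *byte = 0);

    // Release: the zeroing above may not be moved below this point.
    compiler_fence(Ordering::Release);
    mock_start_read(buf);

    // Operations that don't involve `buf` could go here.

    mock_wait();
    // Acquire: the `reverse` below may not be moved above this point.
    compiler_fence(Ordering::Acquire);

    buf.reverse();
}
```

`compiler_fence` only constrains the compiler; it emits no instruction, which is why the next section asks whether a hardware barrier is also needed.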
196
198
197
199
Note that `compiler_fence` is a bit stronger than what's required. For example,
198
200
the fences will prevent the operations on `x` from being merged even though we
199
201
know that `buf` doesn't overlap with `x` (due to Rust aliasing rules). However,
200
- there exist no intrinsic that's more fine grained than `compiler_fence`.
202
+ there exists no intrinsic that's more fine-grained than `compiler_fence`.
201
203
202
204
### Don't we need a memory barrier?
203
205
204
206
That depends on the target architecture. In the case of Cortex M0 to M4F cores,
205
207
[AN321] says:
206
208
207
- [AN321]: https://static.docs.arm.com/dai0321/a/DAI0321A_programming_guide_memory_barriers_for_m_profile.pdf
209
+ [AN321]: https://documentation-service.arm.com/static/5efefb97dbdee951c1cd5aaf
208
210
209
211
> 3.2 Typical usages
210
212
>
211
213
> (..)
212
214
>
213
- > The use of DMB is rarely needed in Cortex-M processors because they do not
215
+ > The use of `DMB` is rarely needed in Cortex-M processors because they do not
214
216
> reorder memory transactions. However, it is needed if the software is to be
215
217
> reused on other ARM processors, especially multi-master systems. For example:
216
218
>
@@ -223,25 +225,26 @@ That depends on the target architecture. In the case of Cortex M0 to M4F cores,
223
225
>
224
226
> (..)
225
227
>
226
- > Omitting the DMB or DSB instruction in the examples in Figure 41 on page 47
227
- > and Figure 42 would not cause any error because the Cortex-M processors:
228
+ > Omitting the `DMB` or `DSB` instruction in the examples in Figure 41 on page
229
+ > 47 and Figure 42 would not cause any error because the Cortex-M processors:
228
230
>
229
231
> - do not re-order memory transfers
230
232
> - do not permit two write transfers to be overlapped.
231
233
232
- Where Figure 41 shows a DMB (memory barrier) instruction being used before
234
+ Where Figure 41 shows a `DMB` (memory barrier) instruction being used before
233
235
starting a DMA transaction.
234
236
235
- In the case of Cortex-M7 cores you'll need memory barriers (DMB/DSB) if you are
236
- using the data cache (DCache), unless you manually invalidate the buffer used by
237
- the DMA. Even with the data cache disabled, memory barriers might still be
238
- required to avoid reordering in the store buffer.
237
+ In the case of Cortex-M7 cores you'll need memory barriers (`DMB`/`DSB`) if you
238
+ are using the data cache (DCache), unless you manually invalidate the buffer
239
+ used by the DMA. Even with the data cache disabled, memory barriers might still
240
+ be required to avoid reordering in the store buffer.
239
241
240
242
If your target is a multi-core system then it's very likely that you'll need
241
243
memory barriers.
242
244
243
245
If you do need the memory barrier then you need to use [`atomic::fence`] instead
244
- of `compiler_fence`. That should generate a DMB instruction on Cortex-M devices.
246
+ of `compiler_fence`. That should generate a `DMB` instruction on Cortex-M
247
+ devices.
245
248
246
249
[`atomic::fence`]: https://doc.rust-lang.org/core/sync/atomic/fn.fence.html
247
250
@@ -282,7 +285,7 @@ pointer used in `read_exact` will become invalidated. You'll end up with a
282
285
situation similar to the [`unsound`](#dealing-with-memforget) example.
283
286
284
287
To avoid this problem we require that the buffer used with our API retains its
285
- memory location even when it's moved. The [`Pin`] newtype provides such
288
+ memory location even when it's moved. The [`Pin`] newtype provides such a
286
289
guarantee. We can update our API to require that all buffers are "pinned"
287
290
first.
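
The property we rely on can be checked with a small hosted sketch: moving a `Pin<Box<[u8]>>` handle moves only the pointer, not the heap allocation behind it. `Holder` is a hypothetical stand-in for a type like `Transfer` that takes ownership of the buffer.

```rust
use std::pin::Pin;

/// Hypothetical stand-in for a type that takes ownership of the buffer.
pub struct Holder {
    pub buffer: Pin<Box<[u8]>>,
}

/// Returns the address of the buffer's first byte before and after the
/// pinned handle is moved into `Holder`.
pub fn address_before_and_after_move() -> (usize, usize) {
    // `[u8]` is `Unpin`, so the safe `Pin::new` constructor can be used.
    let buffer: Pin<Box<[u8]>> = Pin::new(vec![0u8; 16].into_boxed_slice());
    let before = buffer.as_ptr() as usize;

    // Moving the handle moves the pointer, not the heap allocation.
    let holder = Holder { buffer };
    let after = holder.buffer.as_ptr() as usize;

    (before, after)
}
```

A plain array such as `[u8; 16]` stored directly in `Holder` would give no such guarantee, since moving the struct may copy the bytes to a new location.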
288
291
@@ -347,7 +350,7 @@ over. For example, dropping a `Transfer<Box<[u8]>>` value will cause the buffer
347
350
to be deallocated. This can result in undefined behavior if the transfer is
348
351
still in progress as the DMA would end up writing to deallocated memory.
349
352
350
- In such scenario one option is to make `Transfer.drop` stop the DMA transfer.
353
+ In such a scenario one option is to make `Transfer.drop` stop the DMA transfer.
351
354
The other option is to make `Transfer.drop` wait for the transfer to finish.
352
355
We'll pick the former option as it's cheaper.
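
The "stop on drop" option can be sketched with a mock channel. Everything here is hypothetical and hosted: `MockChannel` models the channel's enable bit with a `Cell<bool>`, and this `Transfer` is a simplified stand-in, not the type built up in the text.

```rust
use std::cell::Cell;

/// Hypothetical mock of a DMA channel; `running` models the enable bit.
pub struct MockChannel {
    running: Cell<bool>,
}

impl MockChannel {
    pub fn new() -> Self {
        MockChannel { running: Cell::new(false) }
    }

    pub fn start(&self) {
        self.running.set(true);
    }

    pub fn stop(&self) {
        self.running.set(false);
    }

    pub fn is_running(&self) -> bool {
        self.running.get()
    }
}

/// Simplified transfer handle that owns the buffer while the channel runs.
pub struct Transfer<'c> {
    channel: &'c MockChannel,
    _buffer: Box<[u8]>,
}

impl<'c> Transfer<'c> {
    pub fn start(channel: &'c MockChannel, buffer: Box<[u8]>) -> Self {
        channel.start();
        Transfer { channel, _buffer: buffer }
    }
}

impl Drop for Transfer<'_> {
    fn drop(&mut self) {
        // Runs before the fields are dropped, so the channel is stopped
        // before `_buffer` is deallocated.
        self.channel.stop();
    }
}
```

As discussed earlier, this destructor is a convenience rather than a safety mechanism on its own: it does not run if the value is passed to `mem::forget`, in which case the buffer is leaked rather than freed.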
353
356
@@ -365,8 +368,8 @@ Now the DMA transfer will be stopped before the buffer is deallocated.
365
368
366
369
## Summary
367
370
368
- To sum it up, we need to consider all the following points to achieve memory
369
- safe DMA transfers:
371
+ To sum it up, we need to consider all the following points to achieve
372
+ memory-safe DMA transfers:
370
373
371
374
- Use immovable buffers plus indirection: `Pin<B>`. Alternatively, you can use
372
375
the `StableDeref` trait.
@@ -381,8 +384,8 @@ safe DMA transfers:
381
384
382
385
---
383
386
384
- This text leaves out up several details required to build a production grade
385
- DMA abstraction, like configuring the DMA channels (e.g. streams, circular vs
387
+ This text leaves out several details required to build a production-grade DMA
388
+ abstraction, like configuring the DMA channels (e.g. streams, circular vs
386
389
one-shot mode, etc.), alignment of buffers, error handling, how to make the
387
390
abstraction device-agnostic, etc. All those aspects are left as an exercise for
388
391
the reader / community (`:P`).