Chapter 3.4.4 shows the code for creating a first CNN. For the nn.Flatten call before the last layer, the code comment (point 10 in the book) says: "Converts from (B, C, W, H) ->(B, D) so we can use a Linear layer".
Shouldn't it actually be (B, filters, C, W, H) -> (B, filters*D)?
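
One way to check this is to print the tensor shapes on either side of nn.Flatten. The sketch below is not the book's code: the batch size, the 1x28x28 input, the 16 filters, and the 3x3 kernel are all assumed values chosen for illustration. It shows that the output of nn.Conv2d is still a 4-D tensor in which the filter count takes the place of the input channel count, rather than appearing as an extra axis.

```python
import torch
import torch.nn as nn

# All sizes below are assumptions for illustration, not values from the book.
B, C, H, W = 4, 1, 28, 28   # batch size, input channels, height, width
filters = 16                # number of convolutional filters (output channels)
K = 3                       # kernel size

x = torch.randn(B, C, H, W)

# padding=K // 2 keeps the spatial size unchanged for an odd kernel size.
conv = nn.Conv2d(C, filters, K, padding=K // 2)
flatten = nn.Flatten()      # flattens every dimension after the batch dimension

h = conv(x)
flat = flatten(h)

print(x.shape)     # torch.Size([4, 1, 28, 28])
print(h.shape)     # torch.Size([4, 16, 28, 28])
print(flat.shape)  # torch.Size([4, 12544]), since 16 * 28 * 28 = 12544
```

Under these assumptions, the tensor entering nn.Flatten has shape (B, filters, H, W), so the "C" in the comment can be read as the channel count at that point in the network (here, the number of filters), and D = filters * H * W.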