
Conversation

@197g (Member) commented Aug 18, 2025

Closes: #2177
Closes: #1890

I've also added enabling some limits in the decoders by default to the tracking list for a 1.0 version; we've only avoided this out of compatibility concerns. It should be fine in a thought-out and documented major version bump.
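
For context, opting into limits today looks roughly like this. A minimal sketch against the image crate's `Limits` fields and `ImageReader::limits` method as in recent releases; the 16,384-pixel and 512 MiB caps are illustrative placeholders, not recommendations:

```rust
use image::{ImageReader, Limits};

fn main() -> image::ImageResult<()> {
    // Start from the crate's defaults and tighten them explicitly.
    let mut limits = Limits::default();
    limits.max_image_width = Some(16_384);
    limits.max_image_height = Some(16_384);
    limits.max_alloc = Some(512 * 1024 * 1024); // 512 MiB allocation cap

    // "input.png" is a placeholder path.
    let mut reader = ImageReader::open("input.png")?.with_guessed_format()?;
    reader.limits(limits);
    let img = reader.decode()?;
    println!("{}x{}", img.width(), img.height());
    Ok(())
}
```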

@197g merged commit add9be2 into main on Aug 18, 2025 (32 checks passed).
@197g deleted the documentation branch on August 18, 2025 at 11:05.
@fintelia (Contributor)

One additional nuance is that while {Png, Jpeg, ...}Decoder::new doesn't set default limits, ImageReader::new does. Using the same defaults everywhere would certainly make things simpler.
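
To illustrate the asymmetry: with a format-specific decoder, limits have to be applied by hand. A sketch, assuming the `ImageDecoder` trait's `set_limits`/`total_bytes`/`read_image` signatures from recent image versions; the helper name is hypothetical:

```rust
use std::fs::File;
use std::io::BufReader;

use image::codecs::png::PngDecoder;
use image::{ImageDecoder, Limits};

// Decoding through a format-specific decoder: PngDecoder::new sets no
// limits on its own, so they must be applied explicitly.
fn decode_png_with_limits(path: &str) -> image::ImageResult<Vec<u8>> {
    let file = BufReader::new(File::open(path)?);
    let mut decoder = PngDecoder::new(file)?;
    // Without this call, no limits apply at all.
    decoder.set_limits(Limits::default())?;
    let mut pixels = vec![0u8; decoder.total_bytes() as usize];
    decoder.read_image(&mut pixels)?;
    Ok(pixels)
}
```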

Though longer term I'd also like to reduce the number of places where decoders do potentially large allocations at all. Memory limits are a much bigger deal when the decoder might make a massive allocation as a scratch buffer or to decode pixel data into. In theory, the PNG decoder should only have to care about limits in the exif_metadata and icc_profile methods, or in the extremely unlikely case that the image is millions of pixels wide. Everything else can be stored in at most a few hundred KB. There's no reason the decoder should bother with memory limit tracking when decoding some metadata chunk that the spec defines to be at most, say, 12 bytes.
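
For the spec-bounded case, a length check against the spec's own maximum suffices and never needs to consult user-configured limits. A minimal sketch in plain std Rust; the 12-byte cap and helper name are hypothetical:

```rust
use std::io::{self, Read};

// A chunk whose length the spec bounds can be read into a fixed stack
// buffer after a plain length check, with no memory-limit bookkeeping.
fn read_small_chunk<R: Read>(r: &mut R, declared_len: u32) -> io::Result<[u8; 12]> {
    const SPEC_MAX: u32 = 12;
    if declared_len > SPEC_MAX {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "chunk exceeds spec-defined maximum",
        ));
    }
    let mut buf = [0u8; 12];
    r.read_exact(&mut buf[..declared_len as usize])?;
    Ok(buf)
}
```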

@Shnatsel (Member)

After doing a bunch of work on supporting memory limits in image back in the day, I came to the conclusion that they should be entirely opt-in.

  1. They only hinder user-facing applications like image viewers. It's easy enough to opt in for server programs that do need them (a sketch of the opposite direction, opting out, follows this list).
  2. It's a hidden sharp edge for beginners. A naive program appears to work fine until it hits an unusually large image, at which point it breaks.
  3. There is no such thing as a "correct" default limit. Any default limits we set are purely guesswork and will likely need to be changed based on the application's requirements anyway.
  4. Even on the server, everything is increasingly containerized and memory limits are enforced externally through cgroups, etc., making client-side memory limits even less common and useful.
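
Regarding point 1, the escape hatch under the current defaults is an explicit opt-out. A sketch of what a viewer-style application might do, assuming `ImageReader`'s `no_limits` method; the function name is hypothetical:

```rust
use image::{DynamicImage, ImageReader};

// ImageReader::new applies default limits, so a viewer that wants to
// open unusually large images lifts them explicitly.
fn open_unbounded(path: &str) -> image::ImageResult<DynamicImage> {
    let mut reader = ImageReader::open(path)?.with_guessed_format()?;
    reader.no_limits();
    reader.decode()
}
```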

@fintelia (Contributor)

To me, the two main use cases for memory limits are:

  • Handling intentionally malformed images, where we don't want a 20-byte file to cause an OOM trying to allocate 20 TB of memory. Even if the crash would be handled at a higher level, it is still nicer to return an error than to have to fully restart.
  • Being able to fuzz decoders, because otherwise the fuzzer just finds all the places that do unbounded allocations (sketch below).
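
A sketch of the fuzzing case, assuming a cargo-fuzz target using libfuzzer-sys and the image crate's `Limits` type; the 64 MiB cap is an arbitrary choice:

```rust
#![no_main]
use std::io::Cursor;

use image::{ImageFormat, ImageReader, Limits};
use libfuzzer_sys::fuzz_target;

// With a tight allocation cap the fuzzer spends its time on real
// decoding bugs instead of rediscovering unbounded allocations.
fuzz_target!(|data: &[u8]| {
    let mut limits = Limits::default();
    limits.max_alloc = Some(64 * 1024 * 1024); // keep allocations fuzz-sized

    let mut reader = ImageReader::with_format(Cursor::new(data), ImageFormat::Png);
    reader.limits(limits);
    let _ = reader.decode();
});
```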

@197g (Member, Author) commented Aug 25, 2025

My point of view comes from web servers. Those face the same clarity issues regarding default limits for body sizes (or, in HTTP/3, stream sizes in both directions). The interface is also similar: libraries themselves mostly operate internally on files and streams abstractly, with some fixed (OS-allocated) buffer size, but they also offer operations to materialize such streams into unbounded representations when asked. It has been demonstrated again and again that having no limit at all is a big security risk (and thus also a pitfall for users), while having some adjustable limit is always just a small matter, mostly a one-time configuration. In particular, a library can simply defer to the application to provide such limits for all the libraries it uses.
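
The pattern described above, materializing a stream only up to an application-provided cap, is a few lines in plain std Rust. A sketch; the function name is hypothetical:

```rust
use std::io::{self, Read};

// Materialize a stream into memory, but only up to an adjustable,
// application-provided cap.
fn read_body_capped<R: Read>(stream: R, cap: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Read at most cap + 1 bytes so "exactly cap" and "over cap" are
    // distinguishable.
    stream.take(cap.saturating_add(1)).read_to_end(&mut buf)?;
    if buf.len() as u64 > cap {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "body exceeds configured limit",
        ));
    }
    Ok(buf)
}
```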

I don't want my image viewer to take up all of host or GPU memory, even if that would in theory let it show the image, because doing so will crash the window manager, which renders the point moot. Stability trumps completeness even in the purely local use cases, in my opinion. Generally, users do not segment individual applications into separate Linux PID slices, and GPUs generally don't provide great control here either.

Also, we can choose the limits to avoid impact on most users without compromising their main purpose. A cap such as 160 Mpx still leaves DoS potential highly constrained, and images above it do not usually occur outside professional applications, which can just set up their own limits. Those are the costs of doing things with guarantees.
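
Such a pixel-count cap is a cheap dimension check performed before any pixel buffer is allocated. A sketch; the 160 Mpx constant mirrors the figure above:

```rust
// Reject oversized dimensions before allocating anything.
const MAX_PIXELS: u64 = 160_000_000;

fn check_dimensions(width: u32, height: u32) -> Result<(), String> {
    // Widen to u64 so the multiplication cannot overflow.
    let pixels = u64::from(width) * u64::from(height);
    if pixels > MAX_PIXELS {
        return Err(format!(
            "image has {pixels} pixels, exceeding the {MAX_PIXELS} cap"
        ));
    }
    Ok(())
}
```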

Linked issues that merging this pull request may close: "Add list/description of crate features to readme"; "Document that not calling decoder.set_limits() means no limits, not default limits".