layer: clarify attributes for implied directories #970
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -61,6 +61,20 @@ Where supported, MUST include file attributes for Additions and Modifications in | |
|
||
[Sparse files](https://en.wikipedia.org/wiki/Sparse_file) SHOULD NOT be used because they lack consistent support across tar implementations. | ||
|
||
#### Implied Directories | ||
|
||
As the tar format describes directory hierarchies using a flat data structure, it is possible to have so-called "implied directories", where not all parent directories implied by an entry's path in the archive have their own entry. | ||
|
||
When applying a layer, implementations MUST create any parent directories implied by an entry's path, even if they are otherwise absent from the archive. Attributes of the created parent directories MUST be set as follows: | ||
I think this first MUST is uncontroversial, but the second MUST is potentially sticky and perhaps needs to be SHOULD? To state that another way, are we confident that all existing implementations are currently complying with this second MUST? (I'm reasonably confident they are complying with the first one, because it's kind of unavoidable.)
Friendly ping @neersighted 🙇
I agree that the attribute list needs to be a SHOULD, if only because requiring empty
||
|
||
* `mtime` is set to the Unix epoch (`0`) | ||
* `uid` is set to `0`
* `gid` is set to `0`
* `mode` is set to `0755` | ||
* `xattrs` are empty | ||
This cannot be MUST because some xattrs are auto-set by the kernel or need to be set in order for the system to work properly (SELinux labels for instance). I agree with @tianon that this should be a SHOULD.
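For illustration only, the defaults proposed in the attribute list above could be expressed with Python's standard `tarfile` module. The function name `default_implied_dir` is hypothetical, and the sketch only builds a header; as the comment above notes, a real filesystem may still attach xattrs such as SELinux labels when the directory is created.

```python
import tarfile


def default_implied_dir(path: str) -> tarfile.TarInfo:
    """Build a directory header carrying the defaults proposed in this PR.

    Hypothetical helper for illustration; not part of the specification.
    """
    info = tarfile.TarInfo(name=path)
    info.type = tarfile.DIRTYPE
    info.mtime = 0      # Unix epoch
    info.uid = 0
    info.gid = 0
    info.mode = 0o755
    # Assumption: xattrs are carried as PAX records, so an empty dict
    # means "no xattrs requested" in the archive header.
    info.pax_headers = {}
    return info
```

Whether an implementation must or merely should apply these values is exactly what the review comments are debating, so treat the values as the PR's proposal rather than settled behavior.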
||
|
||
Layer authors SHOULD ensure directory entries are fully present for all directory hierarchies in their layers, as previous versions of this specification did not specify this behavior and results may be implementation defined. | ||
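As a rough aid for layer authors, the following sketch lists the directories a layer tarball implies without declaring. It assumes an uncompressed tar held in memory; `implied_directories` and `build_layer` are hypothetical names introduced here, not part of any OCI tooling.

```python
import io
import posixpath
import tarfile


def _norm(name: str) -> str:
    # Normalize so that "./etc/" and "etc" compare equal.
    return posixpath.normpath(name).lstrip("/")


def implied_directories(layer_bytes: bytes) -> list[str]:
    """Return parent directories implied by entry paths that have no entry of their own."""
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:") as tar:
        members = tar.getmembers()

    declared = {_norm(m.name) for m in members if m.isdir()}
    implied = set()
    for m in members:
        parent = posixpath.dirname(_norm(m.name))
        while parent:
            if parent not in declared:
                implied.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(implied)


def build_layer(entries: list[tuple[str, bool]]) -> bytes:
    """Hypothetical test helper: build an in-memory tar from (path, is_dir) pairs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, is_dir in entries:
            info = tarfile.TarInfo(path)
            info.type = tarfile.DIRTYPE if is_dir else tarfile.REGTYPE
            tar.addfile(info)  # zero-length entry, header only
    return buf.getvalue()
```

A layer for which `implied_directories` returns an empty list has entries for every directory in its hierarchies, which is what the SHOULD above asks of layer authors.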
|
||
#### Hardlinks | ||
|
||
* Hardlinks are a [POSIX concept](https://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html) for having one or more directory entries for the same file on the same device. | ||
|
Note it's more subtle than this, as there's actually two different cases:
It happens to be today that popular runtimes like docker/podman unpack layers literally to disk and then just stitch them together with overlayfs at runtime - and because of this implementation they must create such missing parent directories.
But that's just one implementation model, and not the only one. We're debating this in containers/composefs-rs#132, but basically an entirely different way to do this is to have an object store, and instead of creating an overlayfs stack, one uses reflinks (or composefs) to pre-compute the final merged filesystem tree in a single directory. (Or pre-compute multiple squashed ones; it's flexible.)
I would actually like the spec to make it possible for a conforming implementation to error on these "implied directories" in the case where they're actually missing completely in the final image.
But for the case where they are missing, for implementations that want to accept such images (for compatibility, which makes sense), it does make sense to me to specify here a recommendation for what implementations should do.
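To make the stricter option concrete, here is a minimal sketch of a check that rejects an image only when a parent directory is missing from every layer, that is, missing from the final merged tree. It assumes each layer has already been reduced to `(path, is_dir)` pairs and ignores whiteouts for brevity; `check_merged_tree` and `ImpliedDirectoryError` are hypothetical names.

```python
import posixpath


class ImpliedDirectoryError(ValueError):
    """Raised when a path's parent directory is not defined by any layer."""


def _norm(path: str) -> str:
    # Normalize so that "./etc/" and "/etc" both become "etc".
    return posixpath.normpath(path).lstrip("/")


def check_merged_tree(layers: list[list[tuple[str, bool]]]) -> None:
    """Raise if any entry has a parent directory that no layer declares.

    A directory declared in any layer counts as present in the merged image.
    Whiteouts are not modeled in this sketch.
    """
    declared = {_norm(p) for layer in layers for p, is_dir in layer if is_dir}
    for layer in layers:
        for path, _ in layer:
            parent = posixpath.dirname(_norm(path))
            while parent:
                if parent not in declared:
                    raise ImpliedDirectoryError(
                        f"{path}: parent {parent!r} has no entry in any layer"
                    )
                parent = posixpath.dirname(parent)
```

An implementation that prefers compatibility would replace the `raise` with creation of the directory using whatever defaults the specification ends up recommending.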
Can you elaborate on why you'd like that to be an error? As long as the "implied" directories end up with reasonable metadata, they seem to me like an interesting way to enable more layer reuse.
If you're pre-constructing the whole tree, surely you can pre-compute the implied directories too? (I'm certain I'm missing something here 🙏❤️)
I can see the value to a runtime erroring if it prioritizes security, or preventing undefined behavior, over compatibility. Without erroring, a layer could accidentally change permissions set in a parent layer. For example, a parent layer could configure a directory without world read access, limited to a specific group. And some implementations would change the permissions to root:root/0755 if the directory was not defined in the new layer.
It's not safe to look at the previous layers when unpacking a new layer since layers are not tightly attached to a single image. Implementations that change behavior based on the existing image layers could be vulnerable to an attack in which a victim is led to unpack a malicious image first, one that defaults the directory permissions to a more permissive value, and then pulls the target image with the layer already cached with the malicious parent directory permissions.
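The two risks described above can be simulated with a toy model in which a filesystem tree is a dictionary from path to `(uid, gid, mode)`. This is a sketch under that assumption, not how any real runtime stores layers; `apply_layer` and its `keep_existing` switch are hypothetical.

```python
import posixpath

# uid, gid, mode proposed in this PR for implied directories
DEFAULT_ATTRS = (0, 0, 0o755)


def apply_layer(tree: dict, layer: dict, keep_existing: bool) -> dict:
    """Apply `layer` (path -> (uid, gid, mode)) on top of `tree`; return the new tree.

    keep_existing=False: implied parents always get DEFAULT_ATTRS.
    keep_existing=True:  implied parents keep whatever is already in `tree`.
    """
    result = dict(tree)
    for path, attrs in layer.items():
        parent = posixpath.dirname(path)
        while parent:
            # Only parents without their own entry in this layer are implied.
            if parent not in layer:
                if not (keep_existing and parent in result):
                    result[parent] = DEFAULT_ATTRS
            parent = posixpath.dirname(parent)
        result[path] = attrs
    return result
```

With `keep_existing=False`, a child layer containing only `srv/secrets/key.pem` resets a restricted `srv/secrets` from the parent layer to root:root/0755. With `keep_existing=True`, the outcome depends on whatever was unpacked earlier, which is the dependency the caching attack exploits.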