Bug Description
Ambient capabilities are not applied as expected: with the same config.json, the resulting ambient set differs from run to run.
The root cause lies in the implementation of drop_privileges for ambient capabilities.
https://github.com/youki-dev/youki/blob/main/crates/libcontainer/src/capabilities.rs#L155
if let Some(ambient) = cs.ambient() {
    // check specifically for ambient, as those might not always be available
    if let Err(e) = syscall.set_capability(CapSet::Ambient, &to_set(ambient)) {
        tracing::warn!("failed to set ambient capabilities: {}", e);
    }
}
In youki
syscall.set_capability, which is called in the code above, iterates over a HashSet and applies the capabilities one by one. A HashSet does not guarantee a fixed iteration order.
If applying one capability fails partway through, the function returns immediately.
As a result, the for loop does not run to completion when an error happens, and any capability that comes after the failing one in the iteration order is never applied. Which capabilities are skipped therefore varies between runs.
In runc
Errors are handled inside the loop, so the for loop always runs to completion.
for _, a := range ambs {
	err := capability.SetAmbient(true, a)
	if err != nil {
		logrus.Warnf("can't raise ambient capability %s: %v", capToStr(a), err)
	}
}
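The difference between the two loops can be sketched in a few lines of Rust. This is only an illustration, not youki's actual code: `raise_ambient`, `raise_fail_fast` and `raise_best_effort` are hypothetical names, and the failure for CAP_SYSLOG is hard-coded to mimic the "operation not permitted" error from the reproduction below.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for raising a single ambient capability.
// CAP_SYSLOG is made to fail, as it does in the reproduction.
fn raise_ambient(cap: &str) -> Result<(), String> {
    if cap == "CAP_SYSLOG" {
        Err(format!("{cap}: operation not permitted"))
    } else {
        Ok(())
    }
}

// Models the current behaviour: the first error ends the loop, so the
// result depends on where the failing capability sits in the HashSet's
// iteration order.
fn raise_fail_fast<'a>(caps: &HashSet<&'a str>) -> Vec<&'a str> {
    let mut raised = Vec::new();
    for &cap in caps {
        if raise_ambient(cap).is_err() {
            break;
        }
        raised.push(cap);
    }
    raised.sort();
    raised
}

// Models the runc-style behaviour: the error is logged inside the loop
// and the remaining capabilities are still attempted.
fn raise_best_effort<'a>(caps: &HashSet<&'a str>) -> Vec<&'a str> {
    let mut raised = Vec::new();
    for &cap in caps {
        match raise_ambient(cap) {
            Ok(()) => raised.push(cap),
            Err(e) => eprintln!("failed to raise ambient capability: {e}"),
        }
    }
    raised.sort();
    raised
}

fn main() {
    let caps = HashSet::from([
        "CAP_NET_BIND_SERVICE",
        "CAP_AUDIT_WRITE",
        "CAP_KILL",
        "CAP_SYSLOG",
    ]);
    // May contain anywhere from zero to three capabilities, depending on iteration order.
    println!("fail fast:   {:?}", raise_fail_fast(&caps));
    // Always contains the three capabilities that can be raised.
    println!("best effort: {:?}", raise_best_effort(&caps));
}
```

With the best-effort loop the result no longer depends on iteration order, which is the property the runc loop has.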
Steps to Reproduce
- prepare config.json
The ambient set contains a capability (CAP_SYSLOG) that is not included in the permitted and inheritable sets, so applying it fails.
"args": [
    "sh", "-c", "grep '^CapAmb' /proc/self/status"
],
...
"capabilities": {
    "bounding": [
        "CAP_NET_BIND_SERVICE",
        "CAP_AUDIT_WRITE",
        "CAP_KILL"
    ],
    "effective": [
        "CAP_NET_BIND_SERVICE",
        "CAP_AUDIT_WRITE",
        "CAP_KILL"
    ],
    "inheritable": [
        "CAP_NET_BIND_SERVICE",
        "CAP_AUDIT_WRITE",
        "CAP_KILL"
    ],
    "permitted": [
        "CAP_NET_BIND_SERVICE",
        "CAP_AUDIT_WRITE",
        "CAP_KILL"
    ],
    "ambient": [
        "CAP_NET_BIND_SERVICE",
        "CAP_AUDIT_WRITE",
        "CAP_KILL",
        "CAP_SYSLOG"    ← operation not permitted
    ]
},
- run youki multiple times
$ youki run -b tutorial/ container
CapAmb: 0000000020000420
$ youki run -b tutorial/ container
CapAmb: 0000000020000420
$ youki run -b tutorial/ container
CapAmb: 0000000000000000
$ youki run -b tutorial/ container
CapAmb: 0000000020000020
$ youki run -b tutorial/ container
CapAmb: 0000000020000420
Tracing the prctl calls with the following command makes the behavior even clearer:
strace -f -e trace=prctl youki run -b tutorial/ container
Expectation
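In the above case, CAP_SYSLOG cannot be applied, but the other three capabilities should be applied on every run. Since
CAP_NET_BIND_SERVICE: 0x0000000000000400
CAP_AUDIT_WRITE: 0x0000000020000000
CAP_KILL: 0x0000000000000020
every run should print their combined mask:
$ youki run -b tutorial/ container
CapAmb: 0000000020000420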
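To check the arithmetic, here is a small Rust sketch that builds the expected mask and decodes the values observed in the runs above. The helper names (`bit`, `ambient_mask`, `decode`) are made up for this example; the capability numbers are the ones from linux/capability.h, and each capability occupies bit `1 << number` in CapAmb.

```rust
// (name, capability number) pairs, numbers as defined in linux/capability.h.
const CAPS: [(&str, u32); 4] = [
    ("CAP_KILL", 5),
    ("CAP_NET_BIND_SERVICE", 10),
    ("CAP_AUDIT_WRITE", 29),
    ("CAP_SYSLOG", 34),
];

// Bit for a single capability name, or None if it is not in the table.
fn bit(name: &str) -> Option<u64> {
    CAPS.iter()
        .find(|entry| entry.0 == name)
        .map(|entry| 1u64 << entry.1)
}

// OR together the bits of all known capability names.
fn ambient_mask(names: &[&str]) -> u64 {
    names.iter().filter_map(|n| bit(n)).fold(0, |m, b| m | b)
}

// List the capabilities from the table that are set in a CapAmb value.
fn decode(mask: u64) -> Vec<&'static str> {
    CAPS.iter()
        .filter(|entry| (mask & (1u64 << entry.1)) != 0)
        .map(|entry| entry.0)
        .collect()
}

fn main() {
    let expected = ambient_mask(&["CAP_NET_BIND_SERVICE", "CAP_AUDIT_WRITE", "CAP_KILL"]);
    println!("{:016x}", expected); // prints 0000000020000420

    // Values observed in the runs above.
    for observed in [0x2000_0420u64, 0x0000_0000, 0x2000_0020] {
        println!("{:016x} -> {:?}", observed, decode(observed));
    }
}
```

Decoding 0000000020000020 this way shows that CAP_NET_BIND_SERVICE was skipped in that run, and 0000000000000000 means nothing was applied at all.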
System and Setup Info
No response
Additional Context
related: #3210