sys/linux/test: add landlock_fs_disconnected #6164

l0kod · 2025-07-07T13:35:03Z

Test access through disconnected directory.

This test should trigger a warning without this patch: https://lore.kernel.org/all/[email protected]/

An ongoing kernel patch series will be applied to change handling of disconnected directories:
https://lore.kernel.org/all/[email protected]/

l0kod · 2025-07-07T13:36:20Z

This test triggers the WARN_ON_ONCE() when called with syz-execprog but not with syz-manager (which has been running for a while). Any idea why?

a-nogikh · 2025-07-07T14:40:05Z

Different parameters, probably? Do you set exactly the same -namespace, -procs, -enable (features)?

l0kod · 2025-07-07T16:22:36Z

Different parameters, probably? Do you set exactly the same -namespace, -procs, -enable (features)?

I guess you mean sandbox and procs? And yes, these options are the same: default.

I removed syscall filtering and don't pass any -enable features.

a-nogikh · 2025-07-07T17:36:15Z

Yes, I meant -sandbox and not -namespace, sorry for the confusion.

If these are similar to what syz-manager used, then it's weird indeed. Does the nature of the WARN_ON_ONCE() suggest what may be the case?

One more way to spot the difference(s) would be to run both tools with the -debug flag and then compare the debug output for the executed seed program. Do the calls prior to the kernel crash have the same errno values?
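
A rough sketch of that comparison, assuming both tools print per-call result lines in the shape that appears later in this thread (`#0 [956ms] <- renameat2=0xffffffffffffffff errno=18 cover=9938`). The helper names (`parseResults`, `diffErrnos`) are hypothetical, and the "execprog" sample line is made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// callResult is one completed call as it appears in the debug output.
type callResult struct {
	Index int
	Name  string
	Errno int
}

// resultRe matches lines such as
// "#0 [956ms] <- renameat2=0xffffffffffffffff errno=18 cover=9938",
// with or without a "proc N: got output: " prefix.
// The format is inferred from a single log excerpt and may not cover every case.
var resultRe = regexp.MustCompile(`#(\d+) \[\d+ms\] <- ([\w$]+)=0x[0-9a-f]+ errno=(\d+)`)

// parseResults extracts every call result found in a debug log.
func parseResults(log string) []callResult {
	var out []callResult
	for _, line := range strings.Split(log, "\n") {
		m := resultRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		idx, _ := strconv.Atoi(m[1])
		errno, _ := strconv.Atoi(m[3])
		out = append(out, callResult{Index: idx, Name: m[2], Errno: errno})
	}
	return out
}

// diffErrnos compares two runs position by position and describes each mismatch.
func diffErrnos(a, b []callResult) []string {
	var diffs []string
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i].Name != b[i].Name || a[i].Errno != b[i].Errno {
			diffs = append(diffs, fmt.Sprintf("call %d: %s errno=%d vs %s errno=%d",
				i, a[i].Name, a[i].Errno, b[i].Name, b[i].Errno))
		}
	}
	if len(a) != len(b) {
		diffs = append(diffs, fmt.Sprintf("call count differs: %d vs %d", len(a), len(b)))
	}
	return diffs
}

func main() {
	// Hypothetical syz-execprog line (not taken from a real run).
	execprog := "#0 [951ms] <- renameat2=0x0 errno=0 cover=100\n"
	// Line in the shape printed by syz-manager in this thread.
	manager := "proc 4: got output: #0 [956ms] <- renameat2=0xffffffffffffffff errno=18 cover=9938\n"
	for _, d := range diffErrnos(parseResults(execprog), parseResults(manager)) {
		fmt.Println(d)
	}
}
```

Matching by position keeps the sketch short; it would need adjusting if the two logs interleave several programs.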

Also, did you use the run-tests mode of syz-manager, or did you run it as a fuzzer?

l0kod · 2025-07-08T07:47:03Z

I didn't know about this run-tests mode, that helped, thanks.

I now see that syz-manager skips this test because of unsupported call renameat. I switched to renameat2 and it works!

However, this highlights an inconsistency between syz-manager and syz-execprog. Could that be a default syscall filter? Shouldn't syz-manager replace renameat with renameat2 instead of ignoring all tests with renameat?

l0kod · 2025-07-08T07:56:29Z

This test triggers a WARN_ON_ONCE() in current kernel code, but I'm working on a fix that should be merged in a few weeks. In fact, I wrote this test after we found the issue, in the hope that this will improve fuzzing coverage.

Once this test is merged into syzkaller, will syzbot report this WARN_ON_ONCE() issue? If so, I guess we should wait for the fix to be merged first to avoid creating an artificial syzbot report, right?

a-nogikh · 2025-07-08T09:01:13Z

Happy to hear that you have found the problem!

Is renameat deprecated on all arches? In our descriptions, the syscall number is missing only for riscv64:

syzkaller/sys/linux/sys.txt.const

Line 781 in 4f67c4a

    
           __NR_renameat = 386:302, amd64:264, arm:329, arm64:38, mips64le:5254, ppc64le:293, riscv64:???, s390x:295

Apparently, it still worked when syz-execprog used the syscall ID from there?
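
For illustration, a small sketch that reads one such const line and lists the arches marked `???`. It assumes the `name = arch:value, ...` layout shown above; `missingArches` is a hypothetical helper, not part of syzkaller:

```go
package main

import (
	"fmt"
	"strings"
)

// missingArches parses a line like
// "__NR_renameat = 386:302, amd64:264, riscv64:???, s390x:295"
// and returns the constant name plus the arches whose value is "???".
// Entries without an "arch:" prefix are ignored in this sketch.
func missingArches(line string) (string, []string) {
	lhs, rhs, ok := strings.Cut(line, "=")
	if !ok {
		return "", nil
	}
	var missing []string
	for _, part := range strings.Split(rhs, ",") {
		arch, val, ok := strings.Cut(strings.TrimSpace(part), ":")
		if ok && val == "???" {
			missing = append(missing, arch)
		}
	}
	return strings.TrimSpace(lhs), missing
}

func main() {
	line := "__NR_renameat = 386:302, amd64:264, arm:329, arm64:38, mips64le:5254, ppc64le:293, riscv64:???, s390x:295"
	name, missing := missingArches(line)
	fmt.Println(name, "has no number on:", missing)
}
```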

Overall, I think it's probably okay that syz-manager is more strict about what it finds in sys/*/test than syz-execprog is about what it takes as input. syz-execprog was built specifically to tolerate the input that might have been interleaved with console output (that used to be the case up until last year), so even if we start considering the actually enabled syscalls (which we probably should do, indeed), we would just skip those particular calls and go on.

syzkaller/tools/syz-execprog/execprog.go

Line 213 in 4f67c4a

    
           func (ctx *Context) machineChecked(features flatrpc.Feature, syscalls map[*prog.Syscall]bool) queue.Source {

In the fuzzing mode, syz-manager cuts all calls that are not in the list of the enabled ones. So, without the renameat call, your test was just not executed the way you expected it to.

syzkaller/pkg/manager/seeds.go

Lines 304 to 315 in 4f67c4a

    
    func FilterCandidates(candidates []fuzzer.Candidate, syscalls map[*prog.Syscall]bool,
    	dropMinimize bool) FilteredCandidates {
    	var ret FilteredCandidates
    	for _, item := range candidates {
    		if !item.Prog.OnlyContains(syscalls) {
    			ret.ModifiedHashes = append(ret.ModifiedHashes, hash.String(item.Prog.Serialize()))
    			// We cut out the disabled syscalls and retriage/minimize what remains from the prog.
    			// The original prog will be deleted from the corpus.
    			if dropMinimize {
    				item.Flags &= ^fuzzer.ProgMinimized
    			}
    			item.Prog.FilterInplace(syscalls)
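
As a simplified model of that behavior (plain strings instead of `*prog.Syscall`; `onlyContains` and `filterCalls` here are hypothetical stand-ins, not the real `prog` methods), cutting the disabled call leaves a shorter program that no longer does the rename:

```go
package main

import "fmt"

// onlyContains reports whether every call of the program is enabled.
func onlyContains(calls []string, enabled map[string]bool) bool {
	for _, c := range calls {
		if !enabled[c] {
			return false
		}
	}
	return true
}

// filterCalls returns the program with all disabled calls cut out,
// preserving the order of the remaining calls.
func filterCalls(calls []string, enabled map[string]bool) []string {
	kept := make([]string, 0, len(calls))
	for _, c := range calls {
		if enabled[c] {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	// Illustrative call list; not the actual landlock_fs_disconnected test.
	prog := []string{"mkdirat", "mount", "renameat", "openat"}
	// Assumed enabled set in which renameat is unavailable.
	enabled := map[string]bool{"mkdirat": true, "mount": true, "openat": true, "renameat2": true}

	if !onlyContains(prog, enabled) {
		fmt.Println("program modified, remaining calls:", filterCalls(prog, enabled))
	}
}
```

The remaining calls still run, which is why the seed appears to be accepted even though the interesting step is gone.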

a-nogikh · 2025-07-08T09:07:46Z

When this test will be merged in syzkaller, will syzbot report this WARN_ON_ONCE() issue? If that's the case, I guess we should wait for the fix to be merged first to avoid creating artificial syzbot report, right?

If syz-manager can now trigger the issue locally, it should definitely be able to trigger it on syzbot as well.

If you now also observe the crash during corpus triage on each syzkaller restart, waiting a bit until the fix has reached the kernel is probably a good idea.

l0kod · 2025-07-08T10:33:14Z

Apparently, it still worked when syz-execprog used the syscall ID from there?

Yes 🤷

l0kod · 2025-07-22T07:54:07Z

I updated the tests to improve coverage with the latest patch series, and it works as expected when I run it with syz-execprog.

However, running it with syz-manager in test mode failed with this error:

[...]
#0 [951ms] -> renameat2(0x3, 0x200000000540, 0x3, 0x200000000580, 0x0)
proc 4: got output: #0 [956ms] <- renameat2=0xffffffffffffffff errno=18 cover=9938
proc 4: got execute reply
handle completion: completed=19 output_size=6291456
proc 4: got output: umount(./0/file0/file2/file3)
umount(./0/file0/file2/file3)
proc 4: got output: umount(./0/file0/file2/file3) failed (errno 22)
proc 4: got output: loop exited with status 1
got data on response pipe in wrong state 2
proc 4: restarting subprocess, current state 2 attempts 0
proc 4: subprocess exit status 9
landlock_fs_disconnected none/cover C/repeat C/thr/cover: OK
landlock_fs_disconnected none/cover C/repeat C/thr/cover C: OK
landlock_fs_disconnected none/cover C/repeat C/thr/cover C/repeat C: FAIL: should repeat 3 times, but repeated 1, prog calls [0xc004065590 0xc0040655e0 0xc004065630 0xc004065680 0xc0040656d0 0xc004065720 0xc004065770 0xc0040657c0 0xc004065810 0xc004065860 0xc0040658b0 0xc004065900 0xc004065950 0xc0040659a0 0xc0040659f0 0xc004065a40 0xc004065a90 0xc004065ae0 0xc004065b30], info calls -1

The umount is not part of the tests and cannot succeed because the related directory was moved by the test (see first renameat2() call). I'm not sure how to fix this syzkaller issue.

syz-manager doesn't take this test as a seed and I guess this is the reason. Could you please confirm?

Test access through disconnected directory.

This test should trigger a warning without this patch: https://lore.kernel.org/all/[email protected]/

An ongoing kernel patch series will be applied to change handling of disconnected directories: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Mickaël Salaün <[email protected]>

a-nogikh · 2025-07-22T12:29:30Z

I get the same error when I run $ ./bin/syz-manager -config my.cfg -mode run-tests -tests landlock_fs_disconnected.

The umount is not part of the tests and cannot succeed because the related directory was moved by the test (see first renameat2() call). I'm not sure how to fix this syzkaller issue.

The umount is coming from this common code in syz-executor:

syzkaller/executor/common_linux.h

Lines 4587 to 4592 in af30dea

    
    // One does not simply remove a directory.
    // There can be mounts, so we need to try to umount.
    // Moreover, a mount can be mounted several times, so we need to try to umount in a loop.
    // Moreover, after umount a dir can become non-empty again, so we need another loop.
    // Moreover, a mount can be re-mounted as read-only and then we will fail to make a dir empty.
    static void remove_dir(const char* dir)

Specifically, from this line, I think

syzkaller/executor/common_linux.h

Lines 4677 to 4678 in af30dea

    
    if (umount2(filename, umount_flags))
    	exitf("umount(%s) failed", filename);

Given that it's called right between executing the program and reporting the result, there's a chance that the umount failure is indeed breaking the test.

syzkaller/executor/executor_runner.h

Lines 794 to 801 in af30dea

    
    auto [err, output] = ExecuteBinaryImpl(msg, dir);
    if (!err.empty()) {
    	char tmp[64];
    	snprintf(tmp, sizeof(tmp), " (errno %d: %s)", errno, strerror(errno));
    	err += tmp;
    }
    remove_dir(dir);
    rpc::ExecResultRawT res;

Still, it's weird to see umount return errno 22 (EINVAL). Apparently, in this case, unlink's EBUSY did not actually mean that something was mounted at the path.

l0kod · 2025-07-22T12:46:52Z

Unsharing the mount namespace in the test avoids this error, but the test is still not used by syz-manager, or at least it doesn't show in the coverage. 🤔

a-nogikh · 2025-07-22T13:03:23Z

Unsharing the mount namespace in the test itself is unfortunately not the most sustainable approach here - remove_dir is also used during normal fuzzing (not after every program, though).

You have likely created such a situation with these mounts that our existing executor code is just unable to properly handle it :)

but the test is still not used by syz-manager, or at least it doesn't show in the coverage. 🤔

You could run syz-manager with -debug and check whether the program appeared in the logs and what were the results of its run.

Did you see new coverage when you ran the test with syz-execprog?

l0kod · 2025-07-22T18:51:22Z

You could run syz-manager with -debug and check whether the program appeared in the logs and what were the results of its run.

syz-manager -debug doesn't list the used tests, just the output.

Did you see new coverage when you ran the test with syz-execprog?

Yes, it works.

I tested with a WARN_ON_ONCE() to be sure about the coverage, and syz-manager triggers this code, but it is not shown in the coverage, so it looks like an issue with the coverage display.

a-nogikh · 2025-07-23T14:58:33Z

syz-manager -debug doesn't list the used tests, just the output.

Yes, it won't list the tests directly, but it does list everything it has executed (along with debug info), and your test must have been somewhere in that big output.

I tested with a WARN_ON_ONCE() to be sure about the coverage, and syz-manager triggers this code, but it is not shown in the coverage, so it looks like an issue with the coverage display.

Could you please point to the kernel code that must have become covered, but didn't?
I'll try to reproduce it on my workstation.

a-nogikh · 2025-07-23T15:45:05Z

Regarding the umount2 failure discussed above.

So, your test:

1. Mounts ./file0/file1 to ./file0/file6.
2. Does a self-bind mount of ./file0/file1/file2/file3 to itself.
3. Renames ./file0/file1/file2 to ./file0/file2.

We then fail both to delete ./file0/file2/file3 and to unmount it, apparently because there's no mount record for ./file0/file2/file3.

There's no mount record for ./file0/file1/file2/file3 either, though. /proc/self/mountinfo:

68 67 0:6 / /dev rw,relatime - devtmpfs devtmpfs rw,size=1076660k,nr_inodes=269165,mode=755
69 68 0:26 / /dev/pts rw,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=666
70 68 0:27 / /dev/shm rw,relatime - tmpfs tmpfs rw,mode=777
71 67 0:39 / /proc rw,relatime - proc syz-proc rw
72 67 0:25 / /sys rw,relatime - sysfs sysfs rw
73 72 0:8 / /sys/kernel/debug rw,relatime - debugfs debugfs rw
74 72 0:7 / /sys/kernel/security rw,relatime - securityfs securityfs rw
75 72 0:21 / /sys/kernel/config rw,relatime - configfs configfs rw
76 72 0:31 / /sys/fs/fuse/connections rw,relatime - fusectl fusectl rw
77 72 0:32 / /sys/fs/pstore rw,relatime - pstore none rw
78 72 0:33 / /sys/fs/bpf rw,relatime - bpf bpf rw
79 72 0:13 / /sys/kernel/tracing rw,relatime - tracefs tracefs rw
80 73 0:8 / /sys/kernel/debug rw,relatime - debugfs debugfs rw
81 71 0:30 / /proc/sys/fs/binfmt_misc rw,relatime - binfmt_misc binfmt_misc rw
82 67 0:34 / /syzcgroup/unified rw,relatime - cgroup2 none rw
83 67 0:36 / /syzcgroup/cpu rw,relatime - cgroup none rw,cpuacct,memory,hugetlb,clone_children
84 67 0:35 / /syzcgroup/net rw,relatime - cgroup none rw,blkio,devices,freezer,net_prio
47 68 0:40 / /dev/gadgetfs rw,relatime - gadgetfs gadgetfs rw
49 68 0:41 / /dev/binderfs rw,relatime - binder binder rw,max=1048576
50 67 0:42 / /0/file0 rw,relatime - tmpfs none rw
52 50 0:42 /file1 /0/file0/file6 rw,relatime - tmpfs none rw
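
A quick way to check a listing like this for a given path, sketched under the assumption that the fifth whitespace-separated field of each mountinfo line is the mount point (octal escapes such as `\040` are not decoded here, and `mountPoints` is a hypothetical helper):

```go
package main

import (
	"fmt"
	"strings"
)

// mountPoints collects the mount point (fifth field) of every
// /proc/self/mountinfo line into a set.
func mountPoints(mountinfo string) map[string]bool {
	points := make(map[string]bool)
	for _, line := range strings.Split(mountinfo, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 5 {
			continue // blank or malformed line
		}
		points[fields[4]] = true
	}
	return points
}

func main() {
	// Last two records from the listing above.
	info := "50 67 0:42 / /0/file0 rw,relatime - tmpfs none rw\n" +
		"52 50 0:42 /file1 /0/file0/file6 rw,relatime - tmpfs none rw\n"
	points := mountPoints(info)
	for _, p := range []string{"/0/file0/file6", "/0/file0/file1/file2/file3", "/0/file0/file2/file3"} {
		fmt.Printf("%s mounted: %v\n", p, points[p])
	}
}
```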

l0kod · 2025-07-24T08:21:36Z

Could you please point to the kernel code that must have become covered, but didn't?
I'll try to reproduce it on my workstation.

You'll find the kernel source I'm using, with the addition of the WARN_ON_ONCE(1) call here: https://github.com/l0kod/linux/commits/landlock-syzkaller-debug-disco/

a-nogikh · 2025-07-28T09:47:06Z

I have built Linux from https://github.com/l0kod/linux/commits/landlock-syzkaller-debug-disco/ (using this config) and syzkaller from https://github.com/l0kod/syzkaller/tree/disconnected at c355bf8 and I get tons of crashes at

WARNING: CPU: 0 PID: 8621 at security/landlock/fs.c:1016 is_access_to_paths_allowed+0x7c4/0x1820 security/landlock/fs.c:1016

Same for 19973b5.

l0kod force-pushed the disconnected branch from febb55f to 99b262c (July 8, 2025 07:47)

l0kod mentioned this pull request Jul 8, 2025

Fix errno for Landlock #6165

Merged

l0kod force-pushed the disconnected branch from 99b262c to 0577b0c (July 22, 2025 07:51)

l0kod force-pushed the disconnected branch from 0577b0c to c355bf8 (July 22, 2025 07:56)

l0kod force-pushed the disconnected branch from c355bf8 to 19973b5 (July 22, 2025 12:45)

l0kod added a commit to l0kod/linux that referenced this pull request Jul 24, 2025

DEBUG: Add WARN_ON_ONCE() for syzkaller

a115ef4

See google/syzkaller#6164

a-nogikh mentioned this pull request Jul 28, 2025

executor: remove_dir() unable to handle disconnected directories #6227

Open
