Skip to content

Conversation

JohnoKing
Copy link

@JohnoKing JohnoKing commented Jun 20, 2025

Changelog for this commit:

include/ast.h:

  • Added ast_close(), which endeavors to close fds while handling EINTR correctly, depending on platform specific behavior. The following three methods are implemented:

    1. Guarantee a close with posix_close(3p), which is the best available method. This function is still very new because it's a recent addition to POSIX.
    2. Add a version that only calls close once, then copes with EINTR by treating it as a non-error. Only Linux, FreeBSD and Cygwin use this right now. (Other OSes probably provide functionally equivalent leak-proof forms of close, but the documentation available for those is too vague on this matter. Perhaps I'll have to dive into the code for the open source OSes.)
    3. Use the tried and true while loop to repeatedly invoke close, which is currently the generic fallback. On Linux, avoiding this method is effectively a bugfix, in large part because Linux can reuse a file descriptor after it's first closed before a second close call occurs, creating the potential for dangerous race conditions.1 (This probably never occurs in ksh because it's single threaded, but it's better to avoid the while loop for correctness.)
  • Delete many <errno.h> include directives in favor of a single one included in ast.h (ast_close uses errno and is implemented as a macro in the ast.h header).

src/{subshell,xec}.c:

  • Close all of the previous subshell pwdfds after a fork to fix a file descriptor leak (first introduced in 93u+ 2012-07-27). Reproducer:
$ cat /tmp/reproducer
(cd /bin; (cd /usr; (cd /tmp; (cd /
    echo 'ls /proc/${.sh.pid}/fd; :' >/tmp/a
    echo '#/bin/sh' >/tmp/b
    cat /tmp/a >>/tmp/b
    chmod +x /tmp/{a,b}
    /tmp/a  # needs SHOPT_SPAWN=0 for older ksh
    /tmp/b  # needs SHOPT_SPAWN=0 for older ksh
    (ulimit -t unlimited; source /tmp/a)
))))
exit 0 # Avoid optimizing away subshells

$ ksh /tmp/reproducer
0  1  10  11  12  13  14  15  2
0  1  10  11  12  13  14  15  2
0  1  10  11  12  13  14  15  2

bltins/cd_pwd.c, sh/io.c:

  • sh_close(): Set errno to zero to avoid passing on an inherited errno. The close(2) call might fail and subsequently set errno, which makes the previous behavior bogus.
  • b_cd(): If the sh_close() calls in cd fail with anything other than EINTR, the errno from fchdir(2) is thrown away and replaced with the errno from close. This can cause two regression test failures in builtins.sh. Prevent this by saving the errno and restoring it at the relevant sh_close calls.2 Error output:
test builtins(C.UTF-8) begins at 2025-06-19+18:20:50
	builtins.sh[996]: FAIL: can cd into a directory without x permission bit (absolute path arg) (expected status 1 and match of ': cd: /tmp/ksh93.shtests.253547.19870/builtins.C.UTF-8/no_x_dir: \[?Permission denied]?$', got status 1 and '/home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/tests/builtins.sh[994]: cd: /tmp/ksh93.shtests.253547.19870/builtins.C.UTF-8/no_x_dir: Success')
	builtins.sh[1001]: FAIL: can cd into a directory without x permission bit (relative path arg) (expected status 1 and match of ': cd: no_x_dir: \[?Permission denied]?$', got status 1 and '/home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/tests/builtins.sh[999]: cd: no_x_dir: No such file or directory')
test builtins(C.UTF-8) failed at 2025-06-19+18:20:54 with exit code 2 [ 294 tests 2 errors ]
Total errors: 2
  • sh_diropenat(): If openat(2) managed to obtain a file descriptor >10 on its first try and O_CLOEXEC is available, the fd wouldn't be registered with the shell because no sh_fcntl() call would be reached. Fix that by setting the file attributes in sh_diropenat() rather than sh_fcntl(). Consequently, we don't need to rely on sh_fcntl and thus we can use fcntl(2) directly.3

bltins/test.c:

  • test_stat(): If we need to stat e_dot (the current working directory) and sh.pwdfd is available, use fstat(2) because it will avoid unnecessarily reopening .. This change results in a small performance increase for cd(1) usage in subshells:
$ cat bench/v-cdloop.ksh
for ((i=0; i!=10000; i++)) do
   (cd /home; (cd /tmp; (cd /bin; (cd /usr; (cd /var)))))
done; :
$ shbench -b bench/v-cdloop.ksh -l 4 /tmp/ksh-{before,after}
------------------------------------------------------
name          /tmp/ksh-before      /tmp/ksh-after
------------------------------------------------------
v-cdloop.ksh  0.414 [0.412-0.416]  0.381 [0.341-0.398]
------------------------------------------------------

sh/fault.c:

  • sh_done(): Delete an unnecessary close call backported from ksh93v-. The operating system closes all file descriptors for us via exit(3), so a manual close call only serves to make cleanup slower.

comp/arc4random.c:

  • _ast_getentropy(): Fix a file descriptor leak when read(2) fails.

tm/tvtouch.c:

  • tvtouch(): Add an else branch to avoid closing the same file descriptor twice.

tests/subshell.sh:

  • Add a regression test capable of detecting the subshell pwdfd leak going back as far as ksh93u+ 2012-07-27, where the leak is incipient.

Footnotes

  1. For more information, please see the relevant LWN article et seq: https://lwn.net/Articles/576478/

  2. This bug was noticed after sh_close was changed to reset errno before calling close.

  3. cf. sh_open() in io.c

@JohnoKing
Copy link
Author

JohnoKing commented Jun 20, 2025

Well, this patch causes the CI to fail horrifically :\
Guess I'll be busy debugging it now...

@JohnoKing
Copy link
Author

JohnoKing commented Jun 20, 2025

The test failures were caused by the new sh_clear_subshell_pwdfd() function closing stdin, which was because sp->pwdfd got initialized to 0 (which is wrong, it should be initialized to -1 when there's no file descriptor). That issue has been fixed and the CI now passes.

Changelog for this commit:

include/ast.h:
- Added ast_close, which endeavors to close fds while handling
  EINTR correctly, depending on platform specific behavior.
  The following three methods are implemented:
    1) Guarantee a close with posix_close(3p), which is the best
       available method. This function is still *very* new because
       it's a recent addition to POSIX.
    2) Add a version that only calls close once, then copes with EINTR
       by treating it as a non-error. Only Linux, FreeBSD and Cygwin
       use this right now. (Other OSes probably provide functionally
       equivalent leak-proof forms of close, but the documentation
       available for those is too vague on this matter. Perhaps I'll
       have to dive into the code for the open source ones.)
    3) Use the tried and true while loop to repeatedly invoke close,
       which is currently the generic fallback. On Linux, avoiding this
       method is effectively a bugfix, in large part because Linux can
       reuse a file descriptor after it's first closed before a second
       close call occurs, creating the potential for dangerous race
       conditions. For more information, please see the relevant LWN
       article et seq:
       https://lwn.net/Articles/576478/
- Delete many <errno.h> include directives in favor of a single one
  included in ast.h (ast_close uses errno and is implemented as a macro
  in the ast.h header).

src/subshell.c:
- Close all of the previous subshell pwdfds after a fork
  to fix a file descriptor leak (first introduced in
  93u+ 2012-07-27). Reproducer:
     $ cat /tmp/reproducer
     (cd /bin; (cd /usr; (cd /tmp; (cd /
         echo 'ls /proc/${.sh.pid}/fd; :' >/tmp/a
         echo '#/bin/sh' >/tmp/b
         cat /tmp/a >>/tmp/b
         chmod +x /tmp/{a,b}
         /tmp/a  # SHOPT_SPAWN=0
         /tmp/b  $ SHOPT_SPAWN=0
         (ulimit -t unlimited; source /tmp/a)
     ))))
     exit 0 # Avoid optimizing away subshells
- Initialize sp->pwdfd to -1 in all cases (this avoids the potential for
  incorrectly closing stdin).
  - While we're at it, switch to using C99 style struct initialization.

bltins/cd_pwd.c, sh/io.c:
- sh_close(): Set errno to zero to avoid passing on an inherited errno.
  The close(2) call might fail and subsequently set errno, which makes
  the previous behavior bogus.
- b_cd(): If the sh_close() calls in cd fail with anything other
  than EINTR, the errno from fchdir(2) is thrown away and replaced
  with close's errno. This can cause two regression test
  failures in builtins.sh. Prevent this by saving the errno
  and restoring it at the relevant sh_close calls.
  Error output (snipped to fit it into the commit log):
  builtins.sh[996]: FAIL: can cd into a directory without x
                    permission bit (absolute path arg) (expected
                    status 1 and match of ': cd: /tmp/ksh93.shtests
                    377539.24978/builtins.C/no_x_dir: \[?Permission
                    denied]?$', got status 1 and '<SNIP>/builtins.sh[994]:
                    cd: /tmp/ksh93.shtests.377539.24978/builtins.
                    /no_x_dir: Success')
  builtins.sh[1001]: FAIL: can cd into a directory without x permission bit
                     (relative path arg) (expected status 1 and match of
                     ': cd: no_x_dir: \[?Permission denied]?$', got status 1
                     and '<SNIP>/tests/builtins.sh[999]: cd: no_x_dir: No
                     such file or directory')
  This bug was noticed after sh_close was changed to reset errno
  before calling close.
- sh_diropenat(): If openat() managed to obtain a file descriptor >10
  on its first try and O_CLOEXEC is available, the fd wouldn't be
  registered with the shell because no sh_fcntl() call would be reached.
  Fix that by setting the file attributes in sh_diropenat() rather than
  sh_fcntl(). Consequently, we don't need to rely on sh_fcntl and thus
  we can use fcntl(2) directly. (cf. sh_open() in io.c)

bltins/test.c:
- test_stat(): If we need to stat e_dot (the current working directory)
  and sh.pwdfd is available, use fstat(2) because it will avoid
  unnecessarily reopening '.'. This change results in a small
  performance increase for cd(1) usage in subshells:
     $ cat bench/v-cdloop.ksh
     for ((i=0; i!=10000; i++)) do
        (cd /home; (cd /tmp; (cd /bin; (cd /usr; (cd /var)))))
     done; :
     $ shbench -b bench/v-cdloop.ksh -l 4 /tmp/ksh-{before,after}
     ------------------------------------------------------
     name          /tmp/ksh-before      /tmp/ksh-after
     ------------------------------------------------------
     v-cdloop.ksh  0.414 [0.412-0.416]  0.381 [0.341-0.398]
     ------------------------------------------------------

sh/fault.c:
- sh_done(): Delete an unnecessary close call backported from ksh93v-.
  The operating system closes all fds for us via exit(3), so a manual
  close(2) call only serves to make cleanup slower.

comp/arc4random.c:
- _ast_getentropy(): Fix a file descriptor leak when read(2) fails.

tm/tvtouch.c:
- tvtouch(): Add an else branch to avoid closing the same file
  descriptor twice.

tests/subshell.sh:
- Add a regression test capable of detecting the subshell pwdfd leak
  going back as far as ksh93u+ 2012-07-27, where the leak is incipient.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant