-
Notifications
You must be signed in to change notification settings - Fork 36
Fix various problems in close(2) usage and pwdfd handling #882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
JohnoKing
wants to merge
2
commits into
ksh93:dev
Choose a base branch
from
JohnoKing:fix-fd-closing
base: dev
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Well, this patch causes the CI to fail horrifically :\ |
The test failures were caused by the new |
Changelog for this commit: include/ast.h: - Added ast_close, which endeavors to close fds while handling EINTR correctly, depending on platform specific behavior. The following three methods are implemented: 1) Guarantee a close with posix_close(3p), which is the best available method. This function is still *very* new because it's a recent addition to POSIX. 2) Add a version that only calls close once, then copes with EINTR by treating it as a non-error. Only Linux, FreeBSD and Cygwin use this right now. (Other OSes probably provide functionally equivalent leak-proof forms of close, but the documentation available for those is too vague on this matter. Perhaps I'll have to dive into the code for the open source ones.) 3) Use the tried and true while loop to repeatedly invoke close, which is currently the generic fallback. On Linux, avoiding this method is effectively a bugfix, in large part because Linux can reuse a file descriptor after it's first closed before a second close call occurs, creating the potential for dangerous race conditions. For more information, please see the relevant LWN article et seq: https://lwn.net/Articles/576478/ - Delete many <errno.h> include directives in favor of a single one included in ast.h (ast_close uses errno and is implemented as a macro in the ast.h header). src/subshell.c: - Close all of the previous subshell pwdfds after a fork to fix a file descriptor leak (first introduced in 93u+ 2012-07-27). Reproducer: $ cat /tmp/reproducer (cd /bin; (cd /usr; (cd /tmp; (cd / echo 'ls /proc/${.sh.pid}/fd; :' >/tmp/a echo '#/bin/sh' >/tmp/b cat /tmp/a >>/tmp/b chmod +x /tmp/{a,b} /tmp/a # SHOPT_SPAWN=0 /tmp/b $ SHOPT_SPAWN=0 (ulimit -t unlimited; source /tmp/a) )))) exit 0 # Avoid optimizing away subshells - Initialize sp->pwdfd to -1 in all cases (this avoids the potential for incorrectly closing stdin). - While we're at it, switch to using C99 style struct initialization. bltins/cd_pwd.c, sh/io.c: - sh_close(): Set errno to zero to avoid passing on an inherited errno. The close(2) call might fail and subsequently set errno, which makes the previous behavior bogus. - b_cd(): If the sh_close() calls in cd fail with anything other than EINTR, the errno from fchdir(2) is thrown away and replaced with close's errno. This can cause two regression test failures in builtins.sh. Prevent this by saving the errno and restoring it at the relevant sh_close calls. Error output (snipped to fit it into the commit log): builtins.sh[996]: FAIL: can cd into a directory without x permission bit (absolute path arg) (expected status 1 and match of ': cd: /tmp/ksh93.shtests 377539.24978/builtins.C/no_x_dir: \[?Permission denied]?$', got status 1 and '<SNIP>/builtins.sh[994]: cd: /tmp/ksh93.shtests.377539.24978/builtins. /no_x_dir: Success') builtins.sh[1001]: FAIL: can cd into a directory without x permission bit (relative path arg) (expected status 1 and match of ': cd: no_x_dir: \[?Permission denied]?$', got status 1 and '<SNIP>/tests/builtins.sh[999]: cd: no_x_dir: No such file or directory') This bug was noticed after sh_close was changed to reset errno before calling close. - sh_diropenat(): If openat() managed to obtain a file descriptor >10 on its first try and O_CLOEXEC is available, the fd wouldn't be registered with the shell because no sh_fcntl() call would be reached. Fix that by setting the file attributes in sh_diropenat() rather than sh_fcntl(). Consequently, we don't need to rely on sh_fcntl and thus we can use fcntl(2) directly. (cf. sh_open() in io.c) bltins/test.c: - test_stat(): If we need to stat e_dot (the current working directory) and sh.pwdfd is available, use fstat(2) because it will avoid unnecessarily reopening '.'. This change results in a small performance increase for cd(1) usage in subshells: $ cat bench/v-cdloop.ksh for ((i=0; i!=10000; i++)) do (cd /home; (cd /tmp; (cd /bin; (cd /usr; (cd /var))))) done; : $ shbench -b bench/v-cdloop.ksh -l 4 /tmp/ksh-{before,after} ------------------------------------------------------ name /tmp/ksh-before /tmp/ksh-after ------------------------------------------------------ v-cdloop.ksh 0.414 [0.412-0.416] 0.381 [0.341-0.398] ------------------------------------------------------ sh/fault.c: - sh_done(): Delete an unnecessary close call backported from ksh93v-. The operating system closes all fds for us via exit(3), so a manual close(2) call only serves to make cleanup slower. comp/arc4random.c: - _ast_getentropy(): Fix a file descriptor leak when read(2) fails. tm/tvtouch.c: - tvtouch(): Add an else branch to avoid closing the same file descriptor twice. tests/subshell.sh: - Add a regression test capable of detecting the subshell pwdfd leak going back as far as ksh93u+ 2012-07-27, where the leak is incipient.
fe0c4c1
to
3cd6613
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Changelog for this commit:
include/ast.h:
Added
ast_close()
, which endeavors to close fds while handlingEINTR
correctly, depending on platform specific behavior. The following three methods are implemented:posix_close(3p)
, which is the best available method. This function is still very new because it's a recent addition to POSIX.close
once, then copes withEINTR
by treating it as a non-error. Only Linux, FreeBSD and Cygwin use this right now. (Other OSes probably provide functionally equivalent leak-proof forms ofclose
, but the documentation available for those is too vague on this matter. Perhaps I'll have to dive into the code for the open source OSes.)close
, which is currently the generic fallback. On Linux, avoiding this method is effectively a bugfix, in large part because Linux can reuse a file descriptor after it's first closed before a secondclose
call occurs, creating the potential for dangerous race conditions.1 (This probably never occurs in ksh because it's single threaded, but it's better to avoid thewhile
loop for correctness.)Delete many
<errno.h>
include directives in favor of a single one included in ast.h (ast_close
useserrno
and is implemented as a macro in the ast.h header).src/{subshell,xec}.c:
bltins/cd_pwd.c, sh/io.c:
sh_close()
: Seterrno
to zero to avoid passing on an inheritederrno
. Theclose(2)
call might fail and subsequently seterrno
, which makes the previous behavior bogus.b_cd()
: If thesh_close()
calls incd
fail with anything other thanEINTR
, theerrno
fromfchdir(2)
is thrown away and replaced with theerrno
fromclose
. This can cause two regression test failures in builtins.sh. Prevent this by saving theerrno
and restoring it at the relevantsh_close
calls.2 Error output:sh_diropenat()
: Ifopenat(2)
managed to obtain a file descriptor >10 on its first try andO_CLOEXEC
is available, the fd wouldn't be registered with the shell because nosh_fcntl()
call would be reached. Fix that by setting the file attributes insh_diropenat()
rather thansh_fcntl()
. Consequently, we don't need to rely onsh_fcntl
and thus we can usefcntl(2)
directly.3bltins/test.c:
test_stat()
: If we need to state_dot
(the current working directory) andsh.pwdfd
is available, usefstat(2)
because it will avoid unnecessarily reopening.
. This change results in a small performance increase forcd(1)
usage in subshells:sh/fault.c:
sh_done()
: Delete an unnecessaryclose
call backported from ksh93v-. The operating system closes all file descriptors for us viaexit(3)
, so a manualclose
call only serves to make cleanup slower.comp/arc4random.c:
_ast_getentropy()
: Fix a file descriptor leak whenread(2)
fails.tm/tvtouch.c:
tvtouch()
: Add an else branch to avoid closing the same file descriptor twice.tests/subshell.sh:
Footnotes
For more information, please see the relevant LWN article et seq: https://lwn.net/Articles/576478/ ↩
This bug was noticed after
sh_close
was changed to reseterrno
before callingclose
. ↩cf.
sh_open()
in io.c ↩