-
Notifications
You must be signed in to change notification settings - Fork 456
util: Handle signal interrupts in uffd memory monitor #11635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If unhandled, a signal such as SIGPROF sent to the thread that is running ofi_uffd_handler will cause it to exit early. Signed-off-by: Jianxin Xiong <[email protected]>
55f10d1 to
3c22532
Compare
|
@swelch Could you or someone else from HPE review this PR? The authors that made most recent changes to this file were from HPE but they have moved or retired. |
| pthread_mutex_unlock(&mm_lock); | ||
| pthread_rwlock_unlock(&mm_list_rwlock); | ||
| if (errno != EAGAIN) | ||
| if (errno != EAGAIN && errno != EINTR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a review but thought it was something worth raising. This is from the man page of read:
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number.
It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes
are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a
terminal), or because read() was interrupted by a signal.
As far as I can tell, there a chance that the size/amount read from the syscall might not exactly equal if interrupted by a signal. I do not know much about libfabric to know how this might affect this case but could there be a chance that a signal sent here could allow for only a partial read? Would a partial read here potential affect future reads in the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can be an issue but according to the man page:
If a read() is interrupted by a signal before it reads any data, it shall return −1 with errno set to [EINTR].
If a read() is interrupted by a signal after it has successfully read some data, it shall return the number of bytes read.
Partial read would not set errno to EINTR so the thread would exit early as before.
Let me see if I can find a good way to handle this situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the userfaultfd man page https://www.man7.org/linux/man-pages/man2/userfaultfd.2.html, reading from the fd returns 1 or more msg structures based on available events and the buffer size. Passing a buffer smaller than the msg structure would return error. This implies that the partial reading scenario is not going to happen (otherwise application has no way to recover).
|
Intel CI failure is an unrelated known issue (Bad file descriptor). |
If unhandled, a signal such as SIGPROF sent to the thread that is running ofi_uffd_handler will cause it to exit early.
Fix #11627