-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
tac, tail, dd: detect closed stdin before Rust sanitizes it to /dev/null #9664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
GNU testsuite comparison: |
32ab405 to
e0bafa3
Compare
e0bafa3 to
bfdfe19
Compare
|
GNU testsuite comparison: |
|
I'm not sure we should do this, the shell shouldn't allow that in the first place, this is not compliant with posix. |
|
For tracking this would be the same approach of getting the PIPE signal as here: #9657 Maybe my understanding of the discussion is not complete, @anastygnome are you saying that this UUTILS implementation is not Posix compliant and that the GNU Coreutils is compliant in regards to this? Or are you saying that both the GNU Coreutils implementation and this implementation are non-compliant and that we should diverge from the GNU Coreutils implementation |
bfdfe19 to
d61c41f
Compare
d61c41f to
dfabb4b
Compare
CodSpeed Performance ReportMerging #9664 will improve performances by 2.17%Comparing Summary
Benchmarks breakdown
Footnotes
|
|
GNU testsuite comparison: |
impressive! |
Where does POSIX disallow This behavior has been supported the Bourne shell for ages. Even Solaris My personal opinion is that Rust really should have a way to disable the messing around it does with signal actions and file descriptors before |
dfabb4b to
2c10c5f
Compare
|
GNU testsuite comparison: |
So, to recap the issue Basically this point of the posix standard was ambiguous before 2017. Since 2017, the posix standard says that if any of the 3 standards file descriptiors are closed, we are not in a posix compliant environment. So the question is : Do we consider posix post 2017 as the new norm, or do we add this check as a continuing workaround ? I'll quote the posix standard later but it has been there on older issues. |
| #[cfg(unix)] | ||
| #[allow(clippy::missing_safety_doc)] | ||
| pub unsafe extern "C" fn capture_stdio_state() { | ||
| use nix::libc; | ||
| unsafe { | ||
| STDIN_WAS_CLOSED.store( | ||
| libc::fcntl(libc::STDIN_FILENO, libc::F_GETFD) == -1, | ||
| Ordering::Relaxed, | ||
| ); | ||
| STDOUT_WAS_CLOSED.store( | ||
| libc::fcntl(libc::STDOUT_FILENO, libc::F_GETFD) == -1, | ||
| Ordering::Relaxed, | ||
| ); | ||
| STDERR_WAS_CLOSED.store( | ||
| libc::fcntl(libc::STDERR_FILENO, libc::F_GETFD) == -1, | ||
| Ordering::Relaxed, | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For efficiency we should do something similar to the fastpath shown in this function. It is the function that reopens the FDs in the standard library.
how about this
| #[cfg(unix)] | |
| #[allow(clippy::missing_safety_doc)] | |
| pub unsafe extern "C" fn capture_stdio_state() { | |
| use nix::libc; | |
| unsafe { | |
| STDIN_WAS_CLOSED.store( | |
| libc::fcntl(libc::STDIN_FILENO, libc::F_GETFD) == -1, | |
| Ordering::Relaxed, | |
| ); | |
| STDOUT_WAS_CLOSED.store( | |
| libc::fcntl(libc::STDOUT_FILENO, libc::F_GETFD) == -1, | |
| Ordering::Relaxed, | |
| ); | |
| STDERR_WAS_CLOSED.store( | |
| libc::fcntl(libc::STDERR_FILENO, libc::F_GETFD) == -1, | |
| Ordering::Relaxed, | |
| ); | |
| } | |
| } | |
| /// # Safety | |
| /// | |
| /// Must be called very early in process startup, before the runtime may reopen | |
| /// closed standard file descriptors. Uses low-level libc APIs. | |
| /// This code is just checking the state of the three standard fds and has no side-effects. | |
| #[cfg(unix)] | |
| pub unsafe extern "C" fn capture_stdio_state() { | |
| use nix::libc; | |
| // Prefer poll() where it reliably reports POLLNVAL | |
| #[cfg(not(any( | |
| miri, | |
| target_os = "emscripten", | |
| target_os = "fuchsia", | |
| target_os = "vxworks", | |
| target_os = "macos", | |
| target_os = "ios", | |
| target_os = "watchos", | |
| target_os = "redox", | |
| target_os = "l4re", | |
| target_os = "horizon", | |
| )))] | |
| { | |
| unsafe { | |
| let mut pfds = [ | |
| libc::pollfd { | |
| fd: libc::STDIN_FILENO, | |
| events: 0, | |
| revents: 0, | |
| }, | |
| libc::pollfd { | |
| fd: libc::STDOUT_FILENO, | |
| events: 0, | |
| revents: 0, | |
| }, | |
| libc::pollfd { | |
| fd: libc::STDERR_FILENO, | |
| events: 0, | |
| revents: 0, | |
| }, | |
| ]; | |
| let ok = loop { | |
| match libc::poll(pfds.as_mut_ptr(), pfds.len() as _, 0) { | |
| -1 => match *libc::__errno_location() { | |
| libc::EINTR => continue, | |
| libc::EINVAL | libc::ENOMEM | libc::EAGAIN => break false, | |
| _ => libc::abort(), | |
| }, | |
| _ => break true, | |
| } | |
| }; | |
| if ok { | |
| STDIN_WAS_CLOSED.store(pfds[0].revents & libc::POLLNVAL != 0, Ordering::Relaxed); | |
| STDOUT_WAS_CLOSED.store(pfds[1].revents & libc::POLLNVAL != 0, Ordering::Relaxed); | |
| STDERR_WAS_CLOSED.store(pfds[2].revents & libc::POLLNVAL != 0, Ordering::Relaxed); | |
| return; | |
| } | |
| } | |
| } | |
| // Fallback: fcntl() | |
| unsafe { | |
| STDIN_WAS_CLOSED.store( | |
| libc::fcntl(libc::STDIN_FILENO, libc::F_GETFD) == -1, | |
| Ordering::Relaxed, | |
| ); | |
| STDOUT_WAS_CLOSED.store( | |
| libc::fcntl(libc::STDOUT_FILENO, libc::F_GETFD) == -1, | |
| Ordering::Relaxed, | |
| ); | |
| STDERR_WAS_CLOSED.store( | |
| libc::fcntl(libc::STDERR_FILENO, libc::F_GETFD) == -1, | |
| Ordering::Relaxed, | |
| ); | |
| } | |
| } |
|
Also should be done for cat ;) |
|
I'm open either way to the approach, was hoping to get a maintainers opinion on which approach they'd be interested in? |
|
@sylvestre , what do you think? |
|
@sylvestre any chance we could get your guidance on this general topic of reading signals before the rust runtime overrides them. I believe there's more than 20 separate issues in the queue related to this because if we use this approach we can read the original state of the file descriptors and read the different SIGPIPE signals correctly, and once we have a PR in that the maintainers are aligned on the approach for how to read these signals we can begin closing all of those issues: To get them connected I will start collecting the issues that are solvable by the init_data approach of reading things, either through this PR or through the similar SIGPIPE PR: #9657 |
|
I would prefer we match what GNU is doing now :) |
|
So the end behaviour with this signal handling is the only solution that I've seen so far that fully matches the end behaviour of the GNU utilities so far. We have a bunch of hacks previously with the stdin, stderr in specific utilities for specific edge cases but this one is the one that will match all of the behaviors |
|
To say that this approach is the solution that handles matching the GNU implementation in all of these use cases as an example, theres many more in the issues folder: Closed file descriptor handling:
SigPIPE signal handling: |
|
@ChrisDryden So, after checking in an env and spinning lots of uutils in parallel, I've seen performance degradation when doing 3 syscalls (this PR), vs my solution. Will post the results here shortly but we kinda want this, especially in cases where the system is already under load, but we are able to use |
| pub(crate) fn print_io_lines(&self) { | ||
| let mut stderr = std::io::stderr(); | ||
| self.write_io_lines(&mut stderr).unwrap(); | ||
| if self.write_io_lines(&mut stderr).is_err() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would have preferred to see this change into a different pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, splitting it now
Using the same approach that was identified for the pipeline signals, we can retrieve the state of stdin and stdout before the rust runtime overrides the fd to /dev/null
This means that we will be able to both preserve the pre-existing behaviour for passing in /dev/null and also allow us to raise the same error message and code when the closed fd is passed to the stdin or stdout as the GNU implementation.