Skip to content

Conversation

@ChrisDryden
Copy link
Collaborator

@ChrisDryden ChrisDryden commented Dec 15, 2025

Using the same approach that was identified for the pipeline signals, we can retrieve the state of stdin and stdout before the rust runtime overrides the fd to /dev/null

This means that we will be able to both preserve the pre-existing behaviour for passing in /dev/null and also allow us to raise the same error message and code when the closed fd is passed to the stdin or stdout as the GNU implementation.

@ChrisDryden ChrisDryden changed the title fix: detect closed stdin before Rust sanitizes it to /dev/null WIP fix: detect closed stdin before Rust sanitizes it to /dev/null Dec 15, 2025
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tac/tac-2-nonseekable is no longer failing!
Congrats! The gnu test tests/tail/follow-stdin is no longer failing!

@ChrisDryden ChrisDryden force-pushed the fix-stdin-closed-detection branch from 32ab405 to e0bafa3 Compare December 15, 2025 17:04
@ChrisDryden ChrisDryden changed the title WIP fix: detect closed stdin before Rust sanitizes it to /dev/null tac, tail: detect closed stdin before Rust sanitizes it to /dev/null Dec 15, 2025
@ChrisDryden ChrisDryden marked this pull request as ready for review December 15, 2025 17:05
@ChrisDryden ChrisDryden force-pushed the fix-stdin-closed-detection branch from e0bafa3 to bfdfe19 Compare December 15, 2025 17:10
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tac/tac-2-nonseekable is no longer failing!
Congrats! The gnu test tests/tail/follow-stdin is no longer failing!

@anastygnome
Copy link
Contributor

anastygnome commented Dec 15, 2025

I'm not sure we should do this, the shell shouldn't allow that in the first place, this is not compliant with posix.
@sylvestre should we import such technical debt in our tools?
See rust-lang/rust#108139 (comment)

@ChrisDryden
Copy link
Collaborator Author

For tracking this would be the same approach of getting the PIPE signal as here: #9657
also has discussions on: #5906 (comment)

Maybe my understanding of the discussion is not complete, @anastygnome are you saying that this UUTILS implementation is not Posix compliant and that the GNU Coreutils is compliant in regards to this? Or are you saying that both the GNU Coreutils implementation and this implementation are non-compliant and that we should diverge from the GNU Coreutils implementation

@ChrisDryden ChrisDryden force-pushed the fix-stdin-closed-detection branch from bfdfe19 to d61c41f Compare December 15, 2025 20:27
@ChrisDryden ChrisDryden changed the title tac, tail: detect closed stdin before Rust sanitizes it to /dev/null tac, tail, dd: detect closed stdin before Rust sanitizes it to /dev/null Dec 15, 2025
@ChrisDryden ChrisDryden force-pushed the fix-stdin-closed-detection branch from d61c41f to dfabb4b Compare December 15, 2025 20:35
@codspeed-hq
Copy link

codspeed-hq bot commented Dec 15, 2025

CodSpeed Performance Report

Merging #9664 will improve performances by 2.17%

Comparing ChrisDryden:fix-stdin-closed-detection (d61c41f) with main (06d843f)

Summary

⚡ 1 improvement
✅ 94 untouched
⏩ 38 skipped1

Benchmarks breakdown

Benchmark BASE HEAD Change
du_summarize_balanced_tree[(5, 4, 10)] 8.5 ms 8.3 ms +2.17%

Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/dd/stderr is no longer failing!
Congrats! The gnu test tests/tac/tac-2-nonseekable is no longer failing!
Congrats! The gnu test tests/tail/follow-stdin is no longer failing!

@sylvestre
Copy link
Contributor

GNU testsuite comparison:

Congrats! The gnu test tests/dd/stderr is no longer failing!
Congrats! The gnu test tests/tac/tac-2-nonseekable is no longer failing!
Congrats! The gnu test tests/tail/follow-stdin is no longer failing!

impressive!

@collinfunk
Copy link

I'm not sure we should do this, the shell shouldn't allow that in the first place, this is not compliant with posix.
@sylvestre should we import such technical debt in our tools?

Where does POSIX disallow sh from closing file descriptors?

This behavior has been supported the Bourne shell for ages. Even Solaris /bin/sh supports it:

$ gyes >&-
gyes: standard output: Bad file number
gyes: write error: Bad file number

My personal opinion is that Rust really should have a way to disable the messing around it does with signal actions and file descriptors before main().

@ChrisDryden ChrisDryden force-pushed the fix-stdin-closed-detection branch from dfabb4b to 2c10c5f Compare December 15, 2025 21:02
@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/dd/stderr is no longer failing!
Congrats! The gnu test tests/tac/tac-2-nonseekable is no longer failing!
Congrats! The gnu test tests/tail/follow-stdin is no longer failing!

@anastygnome
Copy link
Contributor

anastygnome commented Dec 15, 2025

For tracking this would be the same approach of getting the PIPE signal as here: #9657 also has discussions on: #5906 (comment)

Maybe my understanding of the discussion is not complete, @anastygnome are you saying that this UUTILS implementation is not Posix compliant and that the GNU Coreutils is compliant in regards to this? Or are you saying that both the GNU Coreutils implementation and this implementation are non-compliant and that we should diverge from the GNU Coreutils implementation

So, to recap the issue

Basically this point of the posix standard was ambiguous before 2017.
Which is why the shell supports it, and gnu coreutils had to work around it

Since 2017, the posix standard says that if any of the 3 standards file descriptiors are closed, we are not in a posix compliant environment.
Rust works around that by reopening said file descriptors on /dev/null.

So the question is :

Do we consider posix post 2017 as the new norm, or do we add this check as a continuing workaround ?

I'll quote the posix standard later but it has been there on older issues.

Comment on lines +440 to +458
#[cfg(unix)]
#[allow(clippy::missing_safety_doc)]
pub unsafe extern "C" fn capture_stdio_state() {
use nix::libc;
unsafe {
STDIN_WAS_CLOSED.store(
libc::fcntl(libc::STDIN_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
STDOUT_WAS_CLOSED.store(
libc::fcntl(libc::STDOUT_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
STDERR_WAS_CLOSED.store(
libc::fcntl(libc::STDERR_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
}
}
Copy link
Contributor

@anastygnome anastygnome Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For efficiency we should do something similar to the fastpath shown in this function. It is the function that reopens the FDs in the standard library.

how about this

Suggested change
#[cfg(unix)]
#[allow(clippy::missing_safety_doc)]
pub unsafe extern "C" fn capture_stdio_state() {
use nix::libc;
unsafe {
STDIN_WAS_CLOSED.store(
libc::fcntl(libc::STDIN_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
STDOUT_WAS_CLOSED.store(
libc::fcntl(libc::STDOUT_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
STDERR_WAS_CLOSED.store(
libc::fcntl(libc::STDERR_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
}
}
/// # Safety
///
/// Must be called very early in process startup, before the runtime may reopen
/// closed standard file descriptors. Uses low-level libc APIs.
/// This code is just checking the state of the three standard fds and has no side-effects.
#[cfg(unix)]
pub unsafe extern "C" fn capture_stdio_state() {
use nix::libc;
// Prefer poll() where it reliably reports POLLNVAL
#[cfg(not(any(
miri,
target_os = "emscripten",
target_os = "fuchsia",
target_os = "vxworks",
target_os = "macos",
target_os = "ios",
target_os = "watchos",
target_os = "redox",
target_os = "l4re",
target_os = "horizon",
)))]
{
unsafe {
let mut pfds = [
libc::pollfd {
fd: libc::STDIN_FILENO,
events: 0,
revents: 0,
},
libc::pollfd {
fd: libc::STDOUT_FILENO,
events: 0,
revents: 0,
},
libc::pollfd {
fd: libc::STDERR_FILENO,
events: 0,
revents: 0,
},
];
let ok = loop {
match libc::poll(pfds.as_mut_ptr(), pfds.len() as _, 0) {
-1 => match *libc::__errno_location() {
libc::EINTR => continue,
libc::EINVAL | libc::ENOMEM | libc::EAGAIN => break false,
_ => libc::abort(),
},
_ => break true,
}
};
if ok {
STDIN_WAS_CLOSED.store(pfds[0].revents & libc::POLLNVAL != 0, Ordering::Relaxed);
STDOUT_WAS_CLOSED.store(pfds[1].revents & libc::POLLNVAL != 0, Ordering::Relaxed);
STDERR_WAS_CLOSED.store(pfds[2].revents & libc::POLLNVAL != 0, Ordering::Relaxed);
return;
}
}
}
// Fallback: fcntl()
unsafe {
STDIN_WAS_CLOSED.store(
libc::fcntl(libc::STDIN_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
STDOUT_WAS_CLOSED.store(
libc::fcntl(libc::STDOUT_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
STDERR_WAS_CLOSED.store(
libc::fcntl(libc::STDERR_FILENO, libc::F_GETFD) == -1,
Ordering::Relaxed,
);
}
}

@anastygnome
Copy link
Contributor

Also should be done for cat ;)

@ChrisDryden
Copy link
Collaborator Author

I'm open either way to the approach, was hoping to get a maintainers opinion on which approach they'd be interested in?

@anastygnome
Copy link
Contributor

@sylvestre , what do you think?

@ChrisDryden
Copy link
Collaborator Author

@sylvestre any chance we could get your guidance on this general topic of reading signals before the rust runtime overrides them. I believe there's more than 20 separate issues in the queue related to this because if we use this approach we can read the original state of the file descriptors and read the different SIGPIPE signals correctly, and once we have a PR in that the maintainers are aligned on the approach for how to read these signals we can begin closing all of those issues:

To get them connected I will start collecting the issues that are solvable by the init_data approach of reading things, either through this PR or through the similar SIGPIPE PR: #9657

@sylvestre
Copy link
Contributor

I would prefer we match what GNU is doing now :)

@ChrisDryden
Copy link
Collaborator Author

So the end behaviour with this signal handling is the only solution that I've seen so far that fully matches the end behaviour of the GNU utilities so far. We have a bunch of hacks previously with the stdin, stderr in specific utilities for specific edge cases but this one is the one that will match all of the behaviors

@anastygnome
Copy link
Contributor

@ChrisDryden So, after checking in an env and spinning lots of uutils in parallel, I've seen performance degradation when doing 3 syscalls (this PR), vs my solution.

Will post the results here shortly but we kinda want this, especially in cases where the system is already under load, but we are able to use poll()

pub(crate) fn print_io_lines(&self) {
let mut stderr = std::io::stderr();
self.write_io_lines(&mut stderr).unwrap();
if self.write_io_lines(&mut stderr).is_err() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would have preferred to see this change into a different pr

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, splitting it now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants