@GnomedDev
Contributor

@GnomedDev GnomedDev commented Apr 6, 2024

Removes an allocation pre-main by just not storing anything in std::thread::Thread for the main thread.

  • The thread name can just be a hard-coded literal, as was done in Remove rt::init allocation for thread name #123433.
  • ThreadId and Parker are stored in a static that is initialized once at startup. This uses SyncUnsafeCell and MaybeUninit because this path is quite performance critical, and we need neither synchronization nor a stored tag value (checking a tag could also leave a panic path in the binary).
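
The second bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: SyncUnsafeCell is nightly-only, so a hand-written stand-in (SyncCell) is used here, and MainInfo, init_main and main_info are hypothetical names, with a u64 and a bool standing in for ThreadId and Parker.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;

/// Stand-in for the unstable `SyncUnsafeCell` (assumption: an `UnsafeCell`
/// that is `Sync`, so it can be placed in a `static`).
struct SyncCell<T>(UnsafeCell<T>);

// SAFETY: callers promise the only write happens before other threads exist.
unsafe impl<T: Sync> Sync for SyncCell<T> {}

/// Hypothetical stand-in for `(ThreadId, Parker)`.
struct MainInfo {
    id: u64,
    parked: bool,
}

// No tag and no lock: the slot is simply uninitialized until startup code fills it.
static MAIN: SyncCell<MaybeUninit<MainInfo>> =
    SyncCell(UnsafeCell::new(MaybeUninit::uninit()));

/// # Safety
/// Must be called once, before `main_info` and before any thread is spawned.
unsafe fn init_main(id: u64) {
    let slot: &mut MaybeUninit<MainInfo> = unsafe { &mut *MAIN.0.get() };
    slot.write(MainInfo { id, parked: false });
}

/// # Safety
/// `init_main` must already have completed.
unsafe fn main_info() -> &'static MainInfo {
    let slot: &'static MaybeUninit<MainInfo> = unsafe { &*MAIN.0.get() };
    unsafe { slot.assume_init_ref() }
}

fn main() {
    unsafe {
        init_main(1);
        let info = main_info();
        assert_eq!(info.id, 1);
        assert!(!info.parked);
    }
}
```

Reading through main_info involves no tag check and no atomic operation; the cost is that correctness rests entirely on the documented call order.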

@rustbot
Collaborator

rustbot commented Apr 6, 2024

r? @Nilstrieb

rustbot has assigned @Nilstrieb.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-libs (Relevant to the library team, which will review and decide on the PR/issue.) labels Apr 6, 2024
@GnomedDev
Contributor Author

I just checked and Option<Pin<Arc<T>>> does indeed niche, so this doesn't grow the size of std::thread::Thread.
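
A quick way to reproduce that check, as a minimal sketch: u64 is an arbitrary stand-in for the payload type, and option_is_free is a hypothetical helper name.

```rust
use std::mem::size_of;
use std::pin::Pin;
use std::sync::Arc;

/// True when wrapping the pinned `Arc` in `Option` costs no extra space,
/// i.e. `None` is encoded in the pointer's null niche.
fn option_is_free() -> bool {
    size_of::<Option<Pin<Arc<u64>>>>() == size_of::<Pin<Arc<u64>>>()
}

fn main() {
    assert!(option_is_free());
    // `Pin<Arc<u64>>` itself is a single thin pointer.
    assert_eq!(size_of::<Pin<Arc<u64>>>(), size_of::<usize>());
}
```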

@GnomedDev GnomedDev force-pushed the remove-initial-arc branch from d5a081b to d5b8b00 Compare April 6, 2024 14:27
@saethlin
Member

saethlin commented Apr 6, 2024

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 6, 2024
@bors
Collaborator

bors commented Apr 6, 2024

⌛ Trying commit d5b8b00 with merge 666bbff...

@bors
Collaborator

bors commented Apr 6, 2024

☀️ Try build successful - checks-actions
Build commit: 666bbff (666bbff29cc26856cc869d4b7e16f6843b105c4b)

@rust-timer
Collaborator

Finished benchmarking commit (666bbff): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.4%, 0.4%] | 1 |
| Regressions ❌ (secondary) | 1.5% | [1.5%, 1.5%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [0.4%, 0.4%] | 1 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.0% | [2.6%, 5.2%] | 3 |
| Regressions ❌ (secondary) | 4.3% | [2.9%, 6.7%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 4.0% | [2.6%, 5.2%] | 3 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.8% | [1.4%, 2.2%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.1% | [-2.1%, -2.1%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.1%] | 23 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.1%] | 35 |
| Improvements ✅ (primary) | -0.1% | [-0.3%, -0.0%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.3%, 0.1%] | 25 |

Bootstrap: 666.761s -> 666.789s (0.00%)
Artifact size: 318.27 MiB -> 318.23 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 6, 2024
Member

@Noratrieb Noratrieb left a comment

I don't love the increased complexity and unsafety. If you have some good justification for why this is important, that would be great, but I'm inclined to accept it even without that; it certainly feels good to have this property.

@GnomedDev
Contributor Author

The increased complexity is a bit sad, but initialising the basics for the runtime is already a complex and unsafe process, so I felt that the increased performance and decreased compile time were worth a small amount of well-documented unsafety.

@GnomedDev
Contributor Author

Hmm, looking at the actual perf run, it seems quite negative which is certainly unexpected. How is this commonly debugged, as I don't want to go off vibes?

@Noratrieb
Member

Run the cachegrind command to see where in the compiler the diff occurs. Though FWIW, I would expect these results to be noise and wouldn't chase them further myself - I'd just treat it as "makes no difference".

@Noratrieb
Member

Noratrieb commented Apr 6, 2024

that the increased performance and decreased compile time

So yeah, no real decrease in compile time. As for increased performance, I doubt that this will be measurable, except maybe for fn main() {} (which is a pretty useless program). If you have a benchmark where this helps, that would be great to have.

@GnomedDev
Contributor Author

Okay, I don't have a benchmark (I never have a benchmark). Would you like me to rewrite this using OnceLock, just to see if that perf run is also neutral?

@GnomedDev GnomedDev force-pushed the remove-initial-arc branch 2 times, most recently from 778330b to ab8eba1 Compare April 7, 2024 11:17
@GnomedDev
Contributor Author

Sorted the existing review comments, just waiting on a reply to my last comment.

@bors
Collaborator

bors commented Apr 14, 2024

☔ The latest upstream changes (presumably #123913) made this pull request unmergeable. Please resolve the merge conflicts.

@GnomedDev GnomedDev force-pushed the remove-initial-arc branch from ab8eba1 to 2c45b39 Compare April 14, 2024 14:59
@GnomedDev
Contributor Author

Okay, @Nilstrieb, I've spent the last week trying different ways to make this less unsafe and complex, but it doesn't seem possible given the "Parker must be initialized in place" requirement. I cannot initialize a OnceLock or an Option in place without increasing complexity significantly, so this seems like the least complex (and most performant) way to do this.
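
To illustrate what "initialized in place" means here, a minimal sketch under stated assumptions: the Parker below is a hypothetical address-sensitive stand-in (it only records where it was built), MainInfo stands in for (ThreadId, Parker), and init_in_place is a hypothetical helper. The point is that the initializer receives a pointer to the final storage, whereas OnceLock and Option are normally filled by building a value first and then moving it in.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical address-sensitive type standing in for `Parker`.
struct Parker {
    /// Address the value was initialized at; stale if the value is ever moved.
    home: usize,
}

impl Parker {
    /// Initializes a `Parker` at `slot`, which is its final address.
    ///
    /// # Safety
    /// `slot` must be valid for writes and properly aligned.
    unsafe fn new_in_place(slot: *mut Parker) {
        unsafe { slot.write(Parker { home: slot as usize }) }
    }

    /// True while the value still lives where it was initialized.
    fn is_at_home(&self) -> bool {
        self.home == self as *const Parker as usize
    }
}

/// Hypothetical stand-in for `(ThreadId, Parker)`.
struct MainInfo {
    id: u64,
    parker: Parker,
}

/// Fills `slot` field by field, so the parker never exists anywhere else.
fn init_in_place(slot: &mut MaybeUninit<MainInfo>, id: u64) -> &mut MainInfo {
    let ptr = slot.as_mut_ptr();
    unsafe {
        addr_of_mut!((*ptr).id).write(id);
        Parker::new_in_place(addr_of_mut!((*ptr).parker));
        // Every field has been written, so the slot is fully initialized.
        slot.assume_init_mut()
    }
}

fn main() {
    let mut slot = MaybeUninit::<MainInfo>::uninit();
    let info = init_in_place(&mut slot, 1);
    assert_eq!(info.id, 1);
    assert!(info.parker.is_at_home());
}
```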

@bors
Collaborator

bors commented Oct 24, 2024

☀️ Test successful - checks-actions
Approved by: Noratrieb
Pushing f61306d to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 24, 2024
@bors bors merged commit f61306d into rust-lang:master Oct 24, 2024
@rustbot rustbot added this to the 1.84.0 milestone Oct 24, 2024
@GnomedDev GnomedDev deleted the remove-initial-arc branch October 24, 2024 18:01
@rust-timer
Collaborator

Finished benchmarking commit (f61306d): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.1%, 0.2%] | 2 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [0.1%, 0.2%] | 2 |

Max RSS (memory usage)

Results (primary -0.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.9% | [1.2%, 4.6%] | 2 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -3.5% | [-4.7%, -2.2%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.3% | [-4.7%, 4.6%] | 4 |

Cycles

Results (secondary -2.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.1% | [-2.1%, -2.1%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary 0.0%, secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.1%] | 19 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 35 |
| Improvements ✅ (primary) | -0.1% | [-0.3%, -0.0%] | 7 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.3%, 0.1%] | 26 |

Bootstrap: 780.805s -> 780.742s (-0.01%)
Artifact size: 333.57 MiB -> 333.58 MiB (0.00%)

#[derive(Clone)]
enum Inner {
    /// Represents the main thread. May only be constructed by Thread::new_main.
    Main(&'static (ThreadId, Parker)),
Contributor

Given that the main thread is a static reference, why not just have this be an Option<Pin<Arc<OtherInner>>>? Every None match can refer to the static.
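
The shape being suggested might look roughly like this. It is a sketch of the reviewer's idea rather than code from the PR: OtherInner's fields, the MAIN_ID constant and the accessor methods are all hypothetical, with a plain u64 standing in for ThreadId and the Parker omitted.

```rust
use std::pin::Pin;
use std::sync::Arc;

/// Hypothetical per-thread data for every thread except the main one.
struct OtherInner {
    id: u64,
    name: Option<String>,
}

/// Hypothetical stand-in for the static main-thread data.
static MAIN_ID: u64 = 1;

#[derive(Clone)]
struct Thread {
    /// `None` means "the main thread", which needs no allocation.
    inner: Option<Pin<Arc<OtherInner>>>,
}

impl Thread {
    fn new_main() -> Thread {
        Thread { inner: None }
    }

    fn new(id: u64, name: Option<String>) -> Thread {
        Thread { inner: Some(Arc::pin(OtherInner { id, name })) }
    }

    fn id(&self) -> u64 {
        match &self.inner {
            Some(inner) => inner.id,
            // Every `None` arm falls back to the static.
            None => MAIN_ID,
        }
    }

    fn name(&self) -> Option<&str> {
        match &self.inner {
            Some(inner) => inner.name.as_deref(),
            None => Some("main"),
        }
    }
}

fn main() {
    let main_thread = Thread::new_main();
    let worker = Thread::new(2, Some("worker".to_string()));
    assert_eq!(main_thread.id(), 1);
    assert_eq!(main_thread.name(), Some("main"));
    assert_eq!(worker.clone().id(), 2);
    assert_eq!(worker.name(), Some("worker"));
}
```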

Labels

merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
