-
Notifications
You must be signed in to change notification settings - Fork 137
8366717: [lworld] Cleanup defensive fixing of JDK-8365996 #1824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: lworld
Are you sure you want to change the base?
Conversation
|
👋 Welcome back mchevalier! A progress list of the required criteria for merging this PR into |
|
@marc-chevalier This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 94 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Webrevs
|
TobiHartmann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed analysis Marc! Changes look good to me, I just added a few comments.
| if (tmp3 != noreg) { | ||
| // TODO 8366717 This change is from before the 'tmp3' arg was added to mainline, check if it's still needed. Same on x64. Also, this should be a __ lea | ||
| // Called by MacroAssembler::pack_inline_helper. We cannot corrupt the dst.base() register | ||
| // When tmp3 is given, we cannot corrupt the dst.base() register (from MacroAssembler::pack_inline_helper or do_oop_store) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't look like an accident. If we remove the if-branch, tests are totally on fire. Let's keep it.
It's only the __ mov(tmp3, dst.base()); that could potentially be removed, right? Not the entire branch. If the mov is still needed, should it be guarded by InlineTypePassFieldsAsArgs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can just remove the mov part.
I don't really understand how to interpret that. We have
__ mov(tmp3, dst.base());
store_check(masm, tmp3, dst);do you suggest we write
store_check(masm, tmp3, dst);? But then, tmp3 is not set to anything meaningful yet. Or
store_check(masm, dst.base(), dst);But that is exactly the else-branch. Moreover, I've tried to guard and it makes some test fails. For instance, we can come from
CardTableBarrierSetAssembler::oop_store_at (cardTableBarrierSetAssembler_x86.cpp:184)
CardTableBarrierSetAssembler::store_at (cardTableBarrierSetAssembler_x86.cpp:90)
MacroAssembler::store_heap_oop (macroAssembler_x86.cpp:5515)
do_oop_store (templateTable_x86.cpp:148)
(called, for instance, from TemplateTable::putfield_or_static_helper (templateTable_x86.cpp:2964))
that doesn't need InlineTypePassFieldsAsArgs but will give tmp3 and if we don't use it here (for instance, if I replace the condition by InlineTypePassFieldsAsArgs && tmp3 != noreg), we get various test failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why it's not a problem on mainline, tho. And I agree it's weird. I've looked, but it's hard to investigate why something works rather than not (on a somewhat diverging codebase). I spent some time on that, but eventually decided it's not that critical, as long as we have a way to make it work for Valhalla too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was suggesting that we should do what mainline does if !InlineTypePassFieldsAsArgs, would that make sense?
I.e. do store_check(masm, dst.base(), dst);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the else-branch, right? Right now we have
if (tmp3 != noreg) {
__ mov(tmp3, dst.base());
store_check(masm, tmp3, dst);
} else {
// It's OK to corrupt the dst.base() register.
store_check(masm, dst.base(), dst);
}If I understand well, you're suggesting to write
if (InlineTypePassFieldsAsArgs) {
if (tmp3 != noreg) {
__ mov(tmp3, dst.base());
store_check(masm, tmp3, dst);
} else {
// It's OK to corrupt the dst.base() register.
store_check(masm, dst.base(), dst);
}
} else {
// as mainline
store_check(masm, dst.base(), dst);
}but to me, it looks equivalent to
if (InlineTypePassFieldsAsArgs && tmp3 != noreg) {
__ mov(tmp3, dst.base());
store_check(masm, tmp3, dst);
} else {
// It's OK to corrupt the dst.base() register.
store_check(masm, dst.base(), dst);
}since both else-branch are identical.
And this, I experimentally found that it's failing quite a lot, with backtraces as I mentioned. Is there an obvious mistake in my logic I'm missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you are right. I missed the outer if (tmp3 != noreg) { which is Valhalla specific. As we discussed off-thread, it would be good to get a better understanding of why this fails in Valhalla. Thanks a lot!
| RegSet clobbered_gp_regs = MacroAssembler::call_clobbered_gp_registers(); | ||
| assert(clobbered_gp_regs.contains(tmp1), "tmp1 must be saved explicitly if it's not a clobber"); | ||
| assert(clobbered_gp_regs.contains(tmp2), "tmp2 must be saved explicitly if it's not a clobber"); | ||
| assert(clobbered_gp_regs.contains(r14), "r14 must be saved explicitly if it's not a clobber"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea with these asserts!
Let's proceed piece by piece.
G1BarrierSetAssembler::g1_write_barrier_preing1BarrierSetAssembler_aarch64.cppvalhalla/src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp
Lines 216 to 220 in 1077e4f
push_call_clobbered_registers/pop_call_clobbered_registersshould be enough around a runtime call, that's what clobbered registers are.But let's dig a bit, that will be useful later!
So, we save at least
call_clobbered_gp_registerswhich is defined as:valhalla/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
Lines 3614 to 3620 in 1077e4f
So, we save r0-r7 and r10-r17
CardTableBarrierSetAssembler::oop_store_atincardTableBarrierSetAssembler_aarch64.cppvalhalla/src/hotspot/cpu/aarch64/gc/shared/cardTableBarrierSetAssembler_aarch64.cpp
Lines 116 to 124 in 1077e4f
Digging the history looks like it still makes sense, it doesn't look like an accident. If we remove the if-branch, tests are totally on fire. Let's keep it.
CardTableBarrierSetAssembler::oop_store_atincardTableBarrierSetAssembler_x86.cppSame as before. But it's not even guarantee that
InlineTypePassFieldsAsArgsis true. We can have such a backtrace:where the last give
r8fortmp3. It is not quite clear to me why we don't have a problem in mainline, because it looks like corrupting address base register is a bad idea given that we use it after.MacroAssembler::unpack_inline_helperinmacroAssembler_aarch64.cppFirst part
valhalla/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
Lines 7163 to 7166 in 1077e4f
Yes, these registers are saved, as we saw above! I've added some asserts to make sure that they are still in the
call_clobbered_gp_registers()set. But it seems a bit redundant: that's what clobber are, registers that we cannot trust to be kept. If we use a register not in this set, the native callee shouldn't touch it. Better safe than sorry?Second Part
valhalla/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
Lines 7269 to 7271 in 1077e4f
This is fine.
Indeed, we might reset the stream just before, but then we exhaust it in the loop.
valhalla/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
Lines 7251 to 7262 in 1077e4f
This is already what we do before:
valhalla/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
Lines 7182 to 7183 in 1077e4f
(and this loop has no break).
So entering the if and running the second reset will just leave the stream in the same state.
About this piece, I also took the liberty of changing a tiny bit the ctor of
ScalarizedInlineArgsStreamas this class was not correct ifstepis not 1 or -1. We could also have fixed it by changing the line 79 invalhalla/src/hotspot/share/runtime/signature_cc.hpp
Lines 75 to 81 in 1077e4f
with
_depth = abs(_step), but that looks like a recipe for a less explicit constructor, and more obscure code for a feature that is not desirable (skipping elements).gen_c2i_adapterinsharedRuntime_aarch64.cppFirst part
valhalla/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
Lines 623 to 633 in 1077e4f
After some digging, the comment about r13 doesn't seem correct to me. I've changed it.
The second part is again about trusting
push_call_clobbered_registersto push enough things. I've added some asserts again.Second part
valhalla/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
Lines 645 to 647 in 1077e4f
Yes, but we save too much.
save_live_registerscallspush_CPU_state(_save_vectors, ...)defined there:valhalla/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
Lines 3649 to 3673 in 1077e4f
It will always save registers r0-r29, but also the floating point registers. The difference is that with
save_vectors, we save the whole vectors, while withoutsave_vectors, we save only 1 word (8 bytes) per float register (4 words for 4 registers). So, the question is whether we use more that the lower word of each register for arguments, and... I'm not sure. We can keep saving them as it consumes only a bit of stack space, but it runs the same amount of instructions as not saving them.But I said we save too much, that's because we will save v0-v31 (or pieces of) anyway, while
c_farg*andj_farg*are only the registers v0-v7. If I understand well, in this context (we are generating an adapter), around a call (for instance to allocate a buffer) we need to save only the registers that are relevant for the calling convention, like the ones that can hold arguments. If the register v25 (for instance) contained something useful for the caller, it should have been saved by the caller before doing the call, and so before entering the adapter. Since v25 is not used by the adapter or the calling convention, we don't need to save it. Is it worth refiningsave_live_registers?gen_c2i_adapterinsharedRuntime_x86_64.cppThe x86 version of the last thing:
valhalla/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
Lines 918 to 921 in 1077e4f
/*save_vectors*/is actuallysave_wide_vectors.save_live_registerscallspush_FPU_stateanyway which doesfxsave, which saves XMM0-XMM15. Ifsave_wide_vectors, thensave_live_registersmakes sure to save all of ZMM (the rest of ZMM0-ZMM15, as well as the others ZMM).c_farg*andj_farg*are defined as xmm0-xmm7. XMM registers are already 128 bits (that's 16 bytes), so if we are not using the rest of zmm registers for arguments, it should be fine. Unlike on Aarch64, saving settingsave_wide_vectorstotruewould create a lot more instructions: only onefxsaveis enough for the xmm0-15, but the rest must be saved a lot more manually, once piece at a time.While we probably are good with "only" 16 bytes saved, we need to save only what is used in the calling convention and in the adapter, so my guess would be that we actually only need to save xmm0-xmm7. I'm not sure it's worth improving
save_live_registerssince saving xmm0-xmm15 is done in a single instruction, we would trade a little of stack space for quite a lot more instructions. Maybe it's just fine to over-save.G1BarrierSetAssembler::g1_write_barrier_preing1BarrierSetAssembler_x86valhalla/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp
Lines 238 to 255 in 1077e4f
This branch was introduced in 4c3231c "Merge tag 'jdk-20+8' into lworld" on 2022-08-31.
The non-Valhalla side:
push_setfrom eb4849e (8283327: Add methods to save/restore registers when calling into the VM from C1/interpreter barrier code) on 2022-03-21pushfrom 2a0986b (8199417: Modularize interpreter GC barriers) on 2018-04-11 (which introduced the whole code)The Valhalla side:
j_farg*was added in 2455c8e (8251398: [lworld] TestCallingConvention::test36 spuriously fails due to incorrect field value) on 2020-08-11pushacomes from d9f3aaa (8242210: [lworld] TestCallingConvention::test36 spuriously fails) on 2020-04-09So it seems the merge jdk was done in a very conservative way, keeping both sides that evolved separately. Yet, the Valhalla side is probably overkill.
call_clobbered_xmm_registers()is all the xmm on all os, except on windows where it's xmm0-xmm5 + xmm16-xmm_max. That covers thec_farg*(since on windows it only goes from 0 to 3), but it doesn't cover thej_farg*(that are from 0 to 7, even on windows). But on Windows, only xmm0-xmm5 are considered volatile (clobbers). Since we are calling a native function, it should be fine (that iscall_clobbered_xmm_registers()and its comment are correct).Thanks,
Marc
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1824/head:pull/1824$ git checkout pull/1824Update a local copy of the PR:
$ git checkout pull/1824$ git pull https://git.openjdk.org/valhalla.git pull/1824/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 1824View PR using the GUI difftool:
$ git pr show -t 1824Using diff file
Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1824.diff
Using Webrev
Link to Webrev Comment