Skip to content

8364111: InstanceMirrorKlass iterators should handle CDS and hidden classes consistently #26477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

shipilev
Copy link
Member

@shipilev shipilev commented Jul 25, 2025

See the bug for more investigation. I think the root cause is already fixed in one of the InstanceMirrorKlass iterators, but not in the other one. Additionally, only one iterator handles the hidden classes. Also, there is a stale comment about CMS that apparently prevents us from asserting is_primitive, which is not relevant anymore.

So this PR commons out the metadata handling on both iterators, which fixes the GenShen+CDS bug, likely fixes hidden classes bug somewhere, and prevents the omission like this from happening again.

Additional testing:

  • Linux x86_64 server fastdebug, Generational Shenandoah + CDS bugs are not reproducing now
  • Linux x86_64 server fastdebug, tier1
  • Linux x86_64 server fastdebug, all

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8364111: InstanceMirrorKlass iterators should handle CDS and hidden classes consistently (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26477/head:pull/26477
$ git checkout pull/26477

Update a local copy of the PR:
$ git checkout pull/26477
$ git pull https://git.openjdk.org/jdk.git pull/26477/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 26477

View PR using the GUI difftool:
$ git pr show -t 26477

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26477.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jul 25, 2025

👋 Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jul 25, 2025

@shipilev This change is no longer ready for integration - check the PR body for details.

@openjdk
Copy link

openjdk bot commented Jul 25, 2025

@shipilev The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@shipilev shipilev marked this pull request as ready for review July 25, 2025 12:55
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 25, 2025
@mlbridge
Copy link

mlbridge bot commented Jul 25, 2025

Webrevs

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good cleanup but need someone who knows GC better to also review.

/reviewers add 2

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 25, 2025
@openjdk
Copy link

openjdk bot commented Jul 25, 2025

@coleenp Usage: /reviewers <n> [<role>] where <n> is the number of required reviewers. If role is set, the reviewers need to have that project role. If omitted, role defaults to authors.

@coleenp
Copy link
Contributor

coleenp commented Jul 25, 2025

/reviewers 2

@openjdk
Copy link

openjdk bot commented Jul 25, 2025

@coleenp
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jul 25, 2025
} else {
// Klass is null means this has been a mirror for a primitive type
// that we do not need to follow as they are always strong roots.
assert(java_lang_Class::is_primitive(obj), "Sanity");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question, since this assert is new:

When we create a mirror (java_lang_Class::allocate_mirror) during normal - non-shared - class loading, is there not a small time window where the mirror exists on the heap, but the Klass is still NULL? Could concurrent heap walkers not observe that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. We create the mirror then initialize the class, so a concurrent GC could see a null Klass pointer.

Copy link
Member Author

@shipilev shipilev Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I now see this assert is tautological: java_lang_Class::is_primitive checks that java_class->metadata_field(_klass_offset) == nullptr, which is the same as pulling the Klass* with java_lang_Class::as_Klass and null-checking it, like we already just did. So really the assert is redundant and we can purge it.

But this also suggests that whatever interaction with GC exists, it affects current code as well. I think it works with concurrent GCs through a different mechanism: the classes loaded during the GC and their mirrors would be either treated as implicitly live (e.g. via SATB invariants), or need to be specially keep-alived (Shenandoah's IU mode did this, IIRC).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what old CMS comment tried to say, I think:

// However, we might get across a klass that just changed during CMS concurrent
// marking if allocation occurred in the old generation.
//
// This is benign here, as we keep alive all CLDs that were loaded during the
// CMS concurrent phase in the class loading, i.e. they will be iterated over
// and kept alive during remark.

Copy link
Member Author

@shipilev shipilev Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty exhausted at this point, I'll mull over it after the weekend. But so far I think we are fine. Testing returns clean as well. If you can schedule some aggressive testing over the weekend, it would be helpful.

Copy link
Member Author

@shipilev shipilev Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, it's actually the other way around: we can see nullptr it our explicit check for a non-primitive class, stall, get to the assert, then discover that mirror is installed, and falsely fail the assert. Assert needs to be removed, did so in new commit. This also matches current behavior better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe what we have now is correct. We cannot do anything here if Klass* backlink is not yet installed, and GC code needs to handle it separately. I think all GCs already do by e.g. keeping newly allocated classes alive. I polished the comment a bit.

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new comment is a lot better, thanks.

@@ -121,11 +124,7 @@ void InstanceMirrorKlass::oop_oop_iterate_bounded(oop obj, OopClosureType* closu

if (Devirtualizer::do_metadata(closure)) {
if (mr.contains(obj)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, we could probably move mr.contains up to include InstanceKlass::oop_oop_iterate_bounded<T>(obj, closure, mr); to prevent the double check.

@tstuefe
Copy link
Member

tstuefe commented Jul 29, 2025

I wondered briefly whether the allowance ("CLD = null is okay for newly-loaded classes") should be also made in

} else {
assert(AOTLinkedClassBulkLoader::is_pending_aot_linked_class(k), "sanity");
}

But since we now handle mirror->referee Klass correctly in IMK, this only path affects oop->its Klass, and no instances exist yet for that Klass, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot [email protected] rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

3 participants