Skip to content

Conversation

@loganharbour
Copy link
Member

Closes #32012

@loganharbour loganharbour marked this pull request as ready for review December 1, 2025 21:36
@loganharbour
Copy link
Member Author

@lindsayad when you have a moment, please

@moosebuild
Copy link
Contributor

moosebuild commented Dec 2, 2025

Job Documentation, step Docs: sync website on 349606b wanted to post the following:

View the site here

This comment will be updated on new commits.

@loganharbour loganharbour marked this pull request as draft December 2, 2025 15:43
@moosebuild
Copy link
Contributor

Job Test, step Results summary on 349606b wanted to post the following:

Framework test summary

Compared against 82bb60c in job civet.inl.gov/job/3419953.

No change

Modules test summary

Compared against 82bb60c in job civet.inl.gov/job/3419953.

No change

@lindsayad
Copy link
Member

what exactly is the race condition? libMesh::out and libMesh::err should be thread safe these days after libMesh/libmesh#4237

@loganharbour
Copy link
Member Author

what exactly is the race condition? libMesh::out and libMesh::err should be thread safe these days after libMesh/libmesh#4237

When it is printing out what to be done, it relies on the state of the buffer (whether or not something else printed). If something else prints to the buffer while it is running, it'll end up clobbering the state of where the status printout is

@lindsayad
Copy link
Member

lindsayad commented Dec 2, 2025

I don't understand yet. After the libMesh PR, each thread has its own buffer. Things like std::flush force sync'd output of the thread local buffers to the global owning buffer

@lindsayad
Copy link
Member

When it is printing out what to be done, it relies on the state of the buffer (whether or not something else printed)

If you could point to this line(s), that would probably help me understand

@loganharbour
Copy link
Member Author

When it is printing out what to be done, it relies on the state of the buffer (whether or not something else printed)

If you could point to this line(s), that would probably help me understand

Unfortunately it's not that simple.

As the perf graph live thread wakes up - it records the state of the console (who printed to it last). How it prints things (like if it is printing a dot because the last thing was a section starting) depends on whether or not it was the last thing to print to the console or not. If something else is printed to console while the print thread wakes up and before it does its print, you get a garbled mess and the print thread could print in the middle of someone elses output.

@lindsayad
Copy link
Member

These changes just feel heavy. It feels disappointing to be introducing a moose_console_mutex in Moose.h after the work we've done at the libMesh level.

If something else is printed to console while the print thread wakes up and before it does its print, you get a garbled mess and the print thread could print in the middle of someone elses output.

Garbled as in dots intermingling with other thread's output (which I agree is a problem)? The output should still be atomic, I hope, such that you would not get a race condition of the variety that TSAN reports.

Thinking about this makes me wonder why we have a separate thread printing dots anyway since we logically want the output to be sequential

@lindsayad
Copy link
Member

Unless people are intentionally printing console output from code within Threads::parallel_reduce but we would expect that to be garbled on its own. I just wonder if in serial code regions (outside of Threads::paralle_reduce) there should just be a single thread writing to the _console. This single thread would be notified when a task has finished so it could stop printing dots

@loganharbour
Copy link
Member Author

Thinking about this makes me wonder why we have a separate thread printing dots anyway since we logically want the output to be sequential

I'm not aware of any way to do this without a thread.

@lindsayad
Copy link
Member

Do you have an example that (usually) demonstrates the issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add more thread safety to PerfGraphLive

3 participants