@legendecas legendecas commented Aug 26, 2025

When process.exit() is called, or an uncaught exception is thrown, as soon
as the process starts, the process tries to terminate immediately. In that
case the off-thread system CA loader may not have finished yet, and it can
race with the main thread: the loader keeps calling OpenSSL APIs after
OpenSSL has already been de-initialized on the main thread.
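
One common way to close this kind of race is to wait for the background thread before the exit path tears down the library it uses. The following is a minimal, self-contained sketch of that idea only; it is not the code in this PR. `LoadCertificates`, `ExitSafely`, and the two flags are hypothetical stand-ins: `g_library_alive` plays the role of OpenSSL's global state, and clearing it plays the role of `OPENSSL_cleanup()` running from `exit()`.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Stand-in for library-global state (for example OpenSSL's default context).
std::atomic<bool> g_library_alive{true};
// Set if the loader ever touches the library after it was torn down.
std::atomic<bool> g_use_after_teardown{false};

// Hypothetical off-thread loader: repeatedly "calls into" the library.
void LoadCertificates() {
  for (int i = 0; i < 50; ++i) {
    if (!g_library_alive.load()) g_use_after_teardown.store(true);
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

// Hypothetical exit path: join the loader first, then tear the library down.
void ExitSafely(std::thread& loader) {
  if (loader.joinable()) loader.join();
  g_library_alive.store(false);  // stands in for OPENSSL_cleanup() via exit()
}

// Returns true if the loader observed the library after teardown.
bool RunDemo() {
  std::thread loader(LoadCertificates);
  ExitSafely(loader);
  return g_use_after_teardown.load();
}
```

Because `ExitSafely` joins before clearing the flag, the loader can never observe the torn-down state. Swapping the two statements in `ExitSafely` reproduces the ordering shown in the stacks, where the main thread is already inside `OPENSSL_cleanup` while the loader is still decoding certificates.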

Example stacks:

Thread 0:: MainThread Dispatch queue: com.apple.main-thread
0   libsystem_malloc.dylib        	       0x191568570 small_malloc_from_free_list + 384
1   libsystem_malloc.dylib        	       0x191567dc8 small_malloc_should_clear + 176
2   libsystem_malloc.dylib        	       0x191567c0c szone_malloc_should_clear + 120
3   node                          	       0x103e8b038 CRYPTO_malloc + 84 (mem.c:211) [inlined]
4   node                          	       0x103e8b038 CRYPTO_aligned_alloc + 128 (mem.c:277)
5   node                          	       0x103e79584 alloc_new_neighborhood_list + 16 (hashtable.c:157) [inlined]
6   node                          	       0x103e79584 ossl_ht_flush_internal + 84 (hashtable.c:292)
7   node                          	       0x103e79668 ossl_ht_free + 40 (hashtable.c:325)
8   node                          	       0x103e872b8 ossl_namemap_free + 32 (core_namemap.c:554) [inlined]
9   node                          	       0x103e872b8 ossl_stored_namemap_free + 60 (core_namemap.c:72)
10  node                          	       0x103e86c30 context_deinit_objs + 172 (context.c:295)
11  node                          	       0x103e85f0c context_deinit + 24 (context.c:376) [inlined]
12  node                          	       0x103e85f0c ossl_lib_ctx_default_deinit + 52 (context.c:415)
13  node                          	       0x103e89ba0 OPENSSL_cleanup + 196 (init.c:464)
14  libsystem_c.dylib             	       0x19161f964 __cxa_finalize_ranges + 512
15  libsystem_c.dylib             	       0x19161f704 exit + 44
16  node                          	       0x102cd3540 node::Exit(node::ExitCode) + 12 (environment.cc:988)
17  node                          	       0x102cd3590 node::DefaultProcessExitHandlerInternal(node::Environment*, node::ExitCode) + 80 (environment.cc:1007)
18  node                          	       0x102d4a268 std::__1::__function::__value_func<void (node::Environment*, node::ExitCode)>::operator()[abi:nn190102](node::Environment*&&, node::ExitCode&&) const + 28 (function.h:430) [inlined]
19  node                          	       0x102d4a268 std::__1::function<void (node::Environment*, node::ExitCode)>::operator()(node::Environment*, node::ExitCode) const + 28 (function.h:989) [inlined]
20  node                          	       0x102d4a268 node::Environment::Exit(node::ExitCode) + 400 (env.cc:1879)
21  node                          	       0x102dab73c node::errors::TriggerUncaughtException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>, bool) + 488
22  node                          	       0x103103c48 v8::internal::MessageHandler::ReportMessageNoExceptions(v8::internal::Isolate*, v8::internal::MessageLocation const*, v8::internal::DirectHandle<v8::internal::Object>, v8::Local<v8::Value>) + 300 (messages.cc:179)
23  node                          	       0x103103ac4 v8::internal::MessageHandler::ReportMessage(v8::internal::Isolate*, v8::internal::MessageLocation const*, v8::internal::DirectHandle<v8::internal::JSMessageObject>) + 640 (messages.cc:146)
24  node                          	       0x1030f2d7c v8::internal::Isolate::ReportPendingMessages(bool) + 528 (isolate.cc:3136)
25  node                          	       0x1030dca54 v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) + 1312
26  node                          	       0x1030dc50c v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::DirectHandle<v8::internal::Object>, v8::internal::DirectHandle<v8::internal::Object>, v8::base::Vector<v8::internal::DirectHandle<v8::internal::Object> const>) + 120 (execution.cc:530)
27  node                          	       0x102f60e50 v8::Function::Call(v8::Isolate*, v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 472 (api.cc:5433)
28  node                          	       0x102d8a2d4 node::builtins::BuiltinLoader::CompileAndCall(v8::Local<v8::Context>, char const*, int, v8::Local<v8::Value>*, node::Realm*) + 60 (node_builtins.cc:510) [inlined]
29  node                          	       0x102d8a2d4 node::builtins::BuiltinLoader::CompileAndCall(v8::Local<v8::Context>, char const*, node::Realm*) + 276
30  node                          	       0x102e2eac0 node::Realm::ExecuteBootstrapper(char const*) + 76 (node_realm.cc:181)
31  node                          	       0x102d6ed70 node::StartExecution(node::Environment*, char const*) + 52 (node.cc:254)
32  node                          	       0x102d6ed14 node::StartExecution(node::Environment*, std::__1::function<v8::MaybeLocal<v8::Value> (node::StartExecutionCallbackInfo const&)>) + 1756
33  node                          	       0x102cd1e28 node::LoadEnvironment(node::Environment*, std::__1::function<v8::MaybeLocal<v8::Value> (node::StartExecutionCallbackInfo const&)>, std::__1::function<void (node::Environment*, v8::Local<v8::Value>, v8::Local<v8::Value>)>) + 368 (environment.cc:570)
34  node                          	       0x102de821c node::NodeMainInstance::Run(node::ExitCode*, node::Environment*) + 48 (node_main_instance.cc:106) [inlined]
35  node                          	       0x102de821c node::NodeMainInstance::Run() + 176 (node_main_instance.cc:99)
36  node                          	       0x102d72da4 node::StartInternal(int, char**) + 232 (node.cc:1578) [inlined]
37  node                          	       0x102d72da4 node::Start(int, char**) + 728 (node.cc:1585)
38  dyld                          	       0x1913c6b98 start + 6076

Thread 1 Crashed:
0   libsystem_pthread.dylib       	       0x191764690 pthread_rwlock_wrlock + 0
1   node                          	       0x103e993b0 CRYPTO_THREAD_write_lock + 12 (threads_pthread.c:642)
2   node                          	       0x103e34480 OSSL_DECODER_CTX_new_for_pkey + 1796 (decoder_pkey.c:921)
3   node                          	       0x103f36878 x509_pubkey_ex_d2i_ex + 612 (x_pubkey.c:208)
4   node                          	       0x103da3358 asn1_template_noexp_d2i + 188 (tasn_dec.c:682)
5   node                          	       0x103da2140 asn1_item_embed_d2i + 1504 (tasn_dec.c:422)
6   node                          	       0x103da3358 asn1_template_noexp_d2i + 188 (tasn_dec.c:682)
7   node                          	       0x103da2140 asn1_item_embed_d2i + 1504 (tasn_dec.c:422)
8   node                          	       0x103da1ae0 asn1_item_ex_d2i_intern + 40 (tasn_dec.c:118) [inlined]
9   node                          	       0x103da1ae0 ASN1_item_d2i_ex + 60 (tasn_dec.c:144) [inlined]
10  node                          	       0x103da1ae0 ASN1_item_d2i + 76 (tasn_dec.c:154)
11  node                          	       0x102f10710 node::crypto::ReadMacOSKeychainCertificates(std::__1::vector<x509_st*, std::__1::allocator<x509_st*>>*) + 372 (crypto_context.cc:527)
12  node                          	       0x104c6e57c node::crypto::GetSystemStoreCACertificates() + 20 [inlined]
13  node                          	       0x104c6e57c node::crypto::LoadSystemCACertificates(void*) (.cold.1) + 32 (crypto_context.cc:818)
14  node                          	       0x102f10d60 node::crypto::GetSystemStoreCACertificates() + 4 (crypto_context.cc:811) [inlined]
15  node                          	       0x102f10d60 node::crypto::LoadSystemCACertificates(void*) + 52 (crypto_context.cc:818)
16  libsystem_pthread.dylib       	       0x191767c0c _pthread_start + 136
17  libsystem_pthread.dylib       	       0x191762b80 thread_start + 8

Refs: #59550

@nodejs-github-bot added the c++ and needs-ci labels Aug 26, 2025
@legendecas legendecas force-pushed the system-ca-off-thread branch from 3747069 to f02900e Compare August 26, 2025 10:29

@joyeecheung joyeecheung left a comment

LGTM, though I have some comments. They can be addressed in a follow-up.

@@ -1004,6 +1007,11 @@ void DefaultProcessExitHandlerInternal(Environment* env, ExitCode exit_code) {
// in node_v8_platform-inl.h
uv_library_shutdown();
DisposePlatform();

I noticed that in the normal exit path, this is done separately and guarded by kNoInitializeNodeV8Platform:

node/src/node.cc

Lines 1312 to 1321 in ca76b39

if (!(flags & ProcessInitializationFlags::kNoInitializeNodeV8Platform)) {
  V8::DisposePlatform();
  // uv_run cannot be called from the time before the beforeExit callback
  // runs until the program exits unless the event loop has any referenced
  // handles after beforeExit terminates. This prevents unrefed timers
  // that happen to terminate during shutdown from being run unsafely.
  // Since uv_run cannot be called, uv_async handles held by the platform
  // will never be fully cleaned up.
  per_process::v8_platform.Dispose();
}

Maybe we should wrap the things that need to be done on both normal and abnormal exit into a helper, and call it from both DefaultProcessExitHandlerInternal and TearDownOncePerProcess?

(Also, from what I can tell, this function is only ever called on the main thread. It would be better to assert that, since most things in this section can only be called once, by one thread.)
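
A rough sketch of what such a shared helper could look like, assuming nothing about the actual Node.js internals: `CleanupProcessWideState`, `NormalTearDown`, `AbnormalExit`, and the recorded step names are all hypothetical, and the `platform_initialized` parameter only mirrors the idea of the `kNoInitializeNodeV8Platform` guard quoted above.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

// Hypothetical log of the cleanup steps that ran, kept only for illustration.
std::vector<std::string> g_steps;
// Captured during static initialization, which runs on the main thread.
const std::thread::id g_main_thread = std::this_thread::get_id();
bool g_cleaned_up = false;

// Hypothetical helper shared by the normal and abnormal exit paths.
void CleanupProcessWideState(bool platform_initialized) {
  // Process-wide teardown must run on the main thread, and only once.
  assert(std::this_thread::get_id() == g_main_thread);
  assert(!g_cleaned_up);
  g_cleaned_up = true;

  g_steps.push_back("join system CA loader");
  if (platform_initialized) {
    // Skipped when the embedder opted out of platform initialization.
    g_steps.push_back("dispose V8 platform");
  }
}

// Both exit paths delegate to the same helper, so they cannot drift apart.
void NormalTearDown(bool platform_initialized) {
  CleanupProcessWideState(platform_initialized);
}

void AbnormalExit(bool platform_initialized) {
  CleanupProcessWideState(platform_initialized);
}
```

With this shape, a step added for one exit path is automatically picked up by the other, and the assertions document the single-thread, single-call contract instead of leaving it implicit.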

codecov bot commented Aug 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.83%. Comparing base (886e4b3) to head (f02900e).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #59632      +/-   ##
==========================================
- Coverage   89.85%   89.83%   -0.02%     
==========================================
  Files         667      667              
  Lines      196260   196261       +1     
  Branches    38559    38562       +3     
==========================================
- Hits       176341   176319      -22     
- Misses      12368    12376       +8     
- Partials     7551     7566      +15     
Files with missing lines Coverage Δ
src/api/environment.cc 77.12% <100.00%> (+0.21%) ⬆️

... and 27 files with indirect coverage changes

@legendecas added the request-ci label Aug 26, 2025
@github-actions bot removed the request-ci label Aug 26, 2025
@legendecas added the author ready label Aug 27, 2025