Description
Describe the bug
A guy promises an old lady that he'll fix a broken staircase, but then arrives at the house and notices that it's in complete disrepair. He calls his old friend to come and redo the circuitry. He doesn't touch the tools: "My friend will need them, and after he's done, then I'll fix the staircase." The friend arrives and notices that the house is in complete disrepair. He calls another guy to come and fix a door that's hanging on one hinge. Meanwhile, he doesn't touch the tools: "Let my friend fix the door first, and then I'll do my job." Repeat this situation enough times, and you get hundreds of people in one room, with no one doing anything. Repeat it more, and the house blows up because of how many people are squeezed in there.
These guys were `runBlocking`. The house is the stack space, and the set of tools is the current thread. The old lady is an unfortunate user of our code.
Reported in https://youtrack.jetbrains.com/issue/KT-66219.
Related to #3982: if we fix this, the reproducer for #3982 will no longer work, though the general issue will likely still be there. EDIT: yes, it will.
Provide a Reproducer
    runBlocking {
        repeat(1000) {
            launch {
                try {
                    runBlocking {
                        // do nothing
                    }
                } catch (e: Throwable) {
                    println(e)
                }
            }
        }
    }
prints
java.lang.NoClassDefFoundError: Could not initialize class kotlin.internal.PlatformImplementationsKt
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
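To make the analogy concrete, here is a minimal sketch (not part of the original report) that counts how many launched coroutine bodies are stacked on top of each other on the calling thread. The names `maxNestingDepth`, `depth`, and `maxDepth` are made up for illustration, and the task count is kept small so that the stack does not overflow.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: returns the deepest nesting of launched bodies
// observed on the calling thread.
fun maxNestingDepth(tasks: Int): Int {
    var depth = 0     // how many launched bodies are currently on the stack
    var maxDepth = 0  // the deepest nesting observed so far
    runBlocking {
        repeat(tasks) {
            launch {
                depth++
                maxDepth = maxOf(maxDepth, depth)
                // While waiting for its own (empty) block, this call may run
                // other queued coroutines of the outer runBlocking on top of
                // the current stack frame.
                runBlocking { }
                depth--
            }
        }
    }
    return maxDepth
}

fun main() {
    println("max nesting depth: ${maxNestingDepth(50)}")
}
```

If each nested `runBlocking` picks up the next queued `launch` before its own block finishes, as the report describes, the returned depth should grow with `tasks` instead of staying at 1.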
Activity
dkhalanskyjb commented on Aug 14, 2024
(Excerpts from our design meeting with @qwwdfsad and @zuevmaxim)

One solution to this is to have each `runBlocking` prioritize its own tasks. This can be done by each `runBlocking` adding its own `CoroutineContext` element to the coroutines launched inside it, recognizing them by that element, and giving them priority.

The problem is that `runBlocking` should perform work-stealing anyway to improve liveness, and it's not clear what strategy to use. We could take `Dispatchers.Default`'s scheduler and repurpose it so that each `runBlocking` gets its own queue with work-stealing support, but that's complicated. A deterministic but seemingly arbitrary choice that supports liveness is to traverse the stack of all `runBlocking`s running on the current thread (let's number them from 0, where 0 is the currently running `runBlocking`) and simulate a streetlight pattern: check own queue; if empty, check `runBlocking #1`; if empty, check `runBlocking #2`; and so on. Next time, check 0, then 2, then 3, and so on, and then 1; the time after that, check 0, then 3, then 4, and so on, then 1, then 2. This way, we avoid a situation where `runBlocking` 0 and 1 communicate with one another without a chance for deeper `runBlocking`s to do anything.

In any case, prioritizing own work has a downside: the following code will stop working:
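A minimal sketch of the kind of code meant here, reconstructed from the surrounding description and not copied from the comment: a sibling coroutine sets a shared flag while a nested `runBlocking` yields until it sees the change. The function name `nestedRunBlockingSketch` is hypothetical; `exit` is the flag named in the text.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: returns the final value of the flag.
fun nestedRunBlockingSketch(): Boolean {
    var exit = false
    runBlocking {
        // Queued on the outer runBlocking's event loop.
        launch { exit = true }
        // The nested call waits for the flag, yielding on every iteration.
        runBlocking {
            while (!exit) yield()
        }
    }
    return exit
}

fun main() {
    println(nestedRunBlockingSketch())
}
```

Under the current behavior this should return `true`, since `yield()` gives the queued `launch` a turn on the shared event loop.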
The inner `runBlocking` will spin without giving `exit = true` a chance to run.

In summary, if this issue is actually bothering someone, we know how to fix it, but if it doesn't, we'd prefer not to change `runBlocking`'s ordering needlessly.

`runBlocking` as `@DelicateCoroutinesApi` #4242

dovchinnikov commented on Nov 22, 2024
I still think this is the main problem. `runBlocking` is the only coroutine library primitive that does something out of scope.

pablichjenkov commented on Nov 22, 2024
Great analogy 👌
ninja- commented on Dec 29, 2024
👀