-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
What do we have now?
The following deadlocks because nested runBlocking
tries to process tasks from the outer runBlocking
, and one of the tasks actually depends on the completion of the nested runBlocking
:
import kotlinx.coroutines.*
class Container(private val scope: CoroutineScope) {
private val instances = HashMap<Class<*>, Deferred<Any>>() // basically a map
suspend fun <T> instance(key: Class<out T>): T {
return instances
.computeIfAbsent(key, ::instanceDeferred)
.await() as T
}
private fun instanceDeferred(clazz: Class<*>): Deferred<Any> = scope.async {
val constructor = clazz.getConstructor(Container::class.java)
constructor.isAccessible = true
constructor.newInstance(this@Container)
}
}
class MyService(container: Container) {
init {
runBlocking {
// this can ask container for an unrelated service,
// or in this case it's enough for it to be empty
}
}
}
class MyService2(container: Container) {
val service1 = runBlocking {
container.instance(MyService::class.java)
}
}
fun main() {
runBlocking {
val container = Container(this@runBlocking)
launch {
println(container.instance(MyService::class.java))
}
launch {
println(container.instance(MyService2::class.java).service1)
}
}
}
What should be instead?
runBlocking
should not steal tasks from outer runBlocking
.
Alternatively, please provide an API to make runBlocking
not steal tasks. It may be a boolean, or a special coroutine context element. Currently we have to use internals.
Why?
Because otherwise it deadlocks.
Notes
Originally, the "stealing" was added to fix a very specific use-case #860.
I believe that if nested runBlocking
"steals" tasks while being inside a limited dispatcher, then this stealing logic should apply to all limited dispatchers, i.e. runBlocking
on a thread inside Dispatchers.Default
should steal tasks from Dispatchers.Default
(#3439). If that's not possible for any reason, then nested runBlocking
should not steal tasks from the outer runBlocking
for the very same reason whatever it might be.
Activity
[-]Nested `runBlocking` deadlock when a task [/-][+]Nested `runBlocking` deadlock when a task depends on the nested `runBlocking` itself[/+][-]Nested `runBlocking` deadlock when a task depends on the nested `runBlocking` itself[/-][+]Nested `runBlocking` deadlocks when a task depends on the nested `runBlocking` itself[/+]runBlocking
should let go of CPU token before parking the thread #3983dkhalanskyjb commentedon Dec 12, 2023
A simplified reproducer:
dkhalanskyjb commentedon Dec 12, 2023
Right now, the code I show above hangs.
If we disable work stealing and let the thread do its thing inside
runBlocking
, we encounter the opposite problem: the following code will hang.Right now, this code passes just fine: the
runBlocking
on the bottom gets entered, runs the other tasks to completion, joins thejob
and exits. If it couldn't steal tasks, then therunBlocking
on the bottom would just wait forever until someone completed thejob
.In fact, in your case, just swapping the
launch
blocks at the bottom is enough to "fix" the problem, but this "fix" would stop working if we disabled work stealing.What I'm getting from this is, with
runBlocking
, you can't avoid topologically sorting your code by the time an innerrunBlocking
is entered using the "depends-on" relation: whether work stealing is enabled or not simply reverses the direction of the dependency arrows.dovchinnikov commentedon Dec 12, 2023
Basically, yes.
dovchinnikov commentedon Dec 12, 2023
The important thing here is a conceptual problem: should
runBlocking
steal tasks from a limited dispatcher, a thread of which is blocked by the saidrunBlocking
? I believe that ifrunBlocking
steals from outerrunBlocking
, thenrunBlocking
should steal fromDispatchers.Default
if it happens on a thread which holds the CPU-token, and, in general,runBlocking
should steal from any limited dispatcher.ninja- commentedon May 12, 2024
I did hit this edge case in one of my projects and it was difficult to debug based on just thread and coroutine dumps.
Some coroutines were just stuck in CREATED state forever.
Task A is a run-once where results are derived from a mix of synchronous and asynchronous code, including runBlocking code.
Tasks B, C, D synchronously await task A promise using runBlocking.
Run this with a dispatcher using 4 threads and you have a deadlock.
On a larger scale, deadlock gets random which makes it hard to debug.
Unfortunately, the main reason runBlocking is needed is to provide asynchronous resources within class initializers which are not suspending.
Solution for me was to remove thread limit, then there is always a new thread available to unpause the coroutine and for other threads to unblock as a result.
I then also slowed down adding new tasks to the system to not overwhelm the unlimited thread pool.
Alternatively, the design for task- stealing executor is quite simple and maybe will be in Kotlin some day?
1 Create a dispatcher class which adds callback objects to a queue
2 In every runBlocking call, start a background task that consumes callbacks from the shared queue
runBlocking
executes other tasks instead of its own #4215runBlocking
as@DelicateCoroutinesApi
#4242