Skip to content

Nested runBlocking deadlocks when a task depends on the nested runBlocking itself #3982

@dovchinnikov

Description

@dovchinnikov
Contributor

What do we have now?

The following deadlocks because nested runBlocking tries to process tasks from the outer runBlocking, and one of the tasks actually depends on the completion of the nested runBlocking:

import kotlinx.coroutines.*

class Container(private val scope: CoroutineScope) {

  private val instances = HashMap<Class<*>, Deferred<Any>>() // basically a map

  suspend fun <T> instance(key: Class<out T>): T {
    return instances
      .computeIfAbsent(key, ::instanceDeferred)
      .await() as T
  }

  private fun instanceDeferred(clazz: Class<*>): Deferred<Any> = scope.async {
    val constructor = clazz.getConstructor(Container::class.java)
    constructor.isAccessible = true
    constructor.newInstance(this@Container)
  }
}

class MyService(container: Container) {
  init {
    runBlocking {
      // this can ask container for an unrelated service,
      // or in this case it's enough for it to be empty
    }
  }
}

class MyService2(container: Container) {
  val service1 = runBlocking {
    container.instance(MyService::class.java)
  }
}

fun main() {
  runBlocking {
    val container = Container(this@runBlocking)
    launch {
      println(container.instance(MyService::class.java))
    }
    launch {
      println(container.instance(MyService2::class.java).service1)
    }
  }
}

What should be instead?

runBlocking should not steal tasks from outer runBlocking.

Alternatively, please provide an API to make runBlocking not steal tasks. It may be a boolean, or a special coroutine context element. Currently we have to use internals.

Why?

Because otherwise it deadlocks.

Notes

Originally, the "stealing" was added to fix a very specific use-case #860.

I believe that if nested runBlocking "steals" tasks while being inside a limited dispatcher, then this stealing logic should apply to all limited dispatchers, i.e. runBlocking on a thread inside Dispatchers.Default should steal tasks from Dispatchers.Default (#3439). If that's not possible for any reason, then nested runBlocking should not steal tasks from the outer runBlocking for the very same reason whatever it might be.

Activity

changed the title [-]Nested `runBlocking` deadlock when a task [/-] [+]Nested `runBlocking` deadlock when a task depends on the nested `runBlocking` itself[/+] on Dec 11, 2023
changed the title [-]Nested `runBlocking` deadlock when a task depends on the nested `runBlocking` itself[/-] [+]Nested `runBlocking` deadlocks when a task depends on the nested `runBlocking` itself[/+] on Dec 11, 2023
dkhalanskyjb

dkhalanskyjb commented on Dec 12, 2023

@dkhalanskyjb
Contributor

A simplified reproducer:

fun main() {
    runBlocking {
        val job = launch {
            runBlocking {
            }
        }
        launch {
            runBlocking {
                job.join()
            }
        }
    }
}
dkhalanskyjb

dkhalanskyjb commented on Dec 12, 2023

@dkhalanskyjb
Contributor

Right now, the code I show above hangs.

If we disable work stealing and let the thread do its thing inside runBlocking, we encounter the opposite problem: the following code will hang.

fun main() {
    runBlocking {
        val job = launch {
            yield() // new thing
            runBlocking {
            }
        }
        launch {
            runBlocking {
                job.join()
            }
        }
    }
}

Right now, this code passes just fine: the runBlocking on the bottom gets entered, runs the other tasks to completion, joins the job and exits. If it couldn't steal tasks, then the runBlocking on the bottom would just wait forever until someone completed the job.

In fact, in your case, just swapping the launch blocks at the bottom is enough to "fix" the problem, but this "fix" would stop working if we disabled work stealing.

What I'm getting from this is, with runBlocking, you can't avoid topologically sorting your code by the time an inner runBlocking is entered using the "depends-on" relation: whether work stealing is enabled or not simply reverses the direction of the dependency arrows.

dovchinnikov

dovchinnikov commented on Dec 12, 2023

@dovchinnikov
ContributorAuthor

Basically, yes.

dovchinnikov

dovchinnikov commented on Dec 12, 2023

@dovchinnikov
ContributorAuthor

The important thing here is a conceptual problem: should runBlocking steal tasks from a limited dispatcher, a thread of which is blocked by the said runBlocking? I believe that if runBlocking steals from outer runBlocking, then runBlocking should steal from Dispatchers.Default if it happens on a thread which holds the CPU-token, and, in general, runBlocking should steal from any limited dispatcher.

ninja-

ninja- commented on May 12, 2024

@ninja-

I did hit this edge case in one of my projects and it was difficult to debug based on just thread and coroutine dumps.

Some coroutines were just stuck in CREATED state forever.
Task A is a run-once where results are derived from a mix of synchronous and asynchronous code, including runBlocking code.
Tasks B, C, D synchronously await task A promise using runBlocking.

Run this with a dispatcher using 4 threads and you have a deadlock.
On a larger scale, deadlock gets random which makes it hard to debug.

Unfortunately, the main reason runBlocking is needed is to provide asynchronous resources within class initializers which are not suspending.

Solution for me was to remove thread limit, then there is always a new thread available to unpause the coroutine and for other threads to unblock as a result.
I then also slowed down adding new tasks to the system to not overwhelm the unlimited thread pool.

Alternatively, the design for task- stealing executor is quite simple and maybe will be in Kotlin some day?
1 Create a dispatcher class which adds callback objects to a queue
2 In every runBlocking call, start a background task that consumes callbacks from the shared queue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @dovchinnikov@ninja-@dkhalanskyjb

        Issue actions

          Nested `runBlocking` deadlocks when a task depends on the nested `runBlocking` itself · Issue #3982 · Kotlin/kotlinx.coroutines