Tags: multithreading, kotlin, concurrency, kotlin-coroutines, coroutine

Reputation: 5497

Difference between a thread and a coroutine in Kotlin

Does Kotlin's implementation of coroutines differ from the coroutine implementations in other languages?

What does it mean that a coroutine is like a lightweight thread?
What is the difference?
Are Kotlin coroutines actually running in parallel (concurrently)?
Even in a multi-core system, is there only one coroutine running at any given time?

Here I'm starting 100,000 coroutines. What happens behind this code?

for (i in 0..100000) {
    async(CommonPool) {
        // Run long-running operations
    }
}

Upvotes: 122

Answers (3)

Mahozad

Reputation: 24732

Here I provide my understanding of threads and coroutines as a Kotlin programmer.
Note that Java's new virtual threads are similar to coroutines in Kotlin.

Thread

A thread is a sequence of program statements (assignments, calculations, ifs, fors, whiles, and so on), each of which may correspond to one or more CPU instructions, that the CPU executes one after another. Statements in one thread can run concurrently with statements in other threads, if there are any.

In other words, you tell the OS that a block of code can be executed concurrently with code outside that block. The block is ordinary code: it may be tens or thousands of lines that create objects, call functions (which call other functions), change variables, run loops, and so on.

So, if we have threads t1, t2 and t3, the operating system may execute one or more statements of t1, then switch to t2 and continue where it left off last time, then switch to t3 and continue where it left off, and then switch back to t1. This continues until either the threads have no more statements to execute or the process is terminated.

If the CPU is multicore, the OS may execute threads in parallel (truly simultaneously) rather than merely concurrently (by rapidly switching between them).
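As a minimal sketch of the description above (the function name `runThreads` and the log format are made up for illustration), the following starts a few threads with the standard library's `thread` helper. Each thread's own statements run in order, while the way the threads are interleaved with each other is up to the OS scheduler and can differ from run to run.

```kotlin
import java.util.concurrent.ConcurrentLinkedQueue
import kotlin.concurrent.thread

// Hypothetical helper: starts `count` threads, each of which appends
// three labelled entries to a shared, thread-safe queue.
fun runThreads(count: Int): List<String> {
    val log = ConcurrentLinkedQueue<String>()
    val threads = (1..count).map { id ->
        thread(name = "t$id") {
            // These statements run one after another inside this thread.
            repeat(3) { step -> log.add("t$id:$step") }
        }
    }
    // Wait until every thread has run out of statements.
    threads.forEach { it.join() }
    return log.toList()
}

fun main() {
    // The relative order of entries from different threads is not fixed.
    println(runThreads(3))
}
```

Entries that belong to the same thread always appear in step order, but entries from different threads may be mixed differently on every run.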

Coroutine

A coroutine is like a thread in that it is a sequence of statements that run one after another within the coroutine itself, but that can run concurrently with statements in other coroutines (if any).

However, coroutines run on top of threads, and it is the programming language runtime, not the operating system, that switches between them. So, if coroutine c1 is executing in thread t1, the runtime may switch to c2 and then to c3 and so on, all before the OS has even switched from t1 to t2.
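A small sketch of that idea, assuming the kotlinx.coroutines library is on the classpath (the function name `interleaveOnOneThread` is made up for illustration). Two coroutines are launched inside `runBlocking`, so they share the calling thread, and each one hands control back with `yield()` after every step.

```kotlin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Hypothetical helper: returns the order of steps and the number of
// distinct threads that executed them.
fun interleaveOnOneThread(): Pair<List<String>, Int> {
    val log = mutableListOf<String>()
    val threads = mutableSetOf<Thread>()
    runBlocking {
        for (name in listOf("c1", "c2")) {
            // launch inherits runBlocking's dispatcher, so both
            // coroutines run on the thread that called runBlocking.
            launch {
                repeat(2) { step ->
                    threads.add(Thread.currentThread())
                    log.add("$name:$step")
                    yield() // let the other coroutine run
                }
            }
        }
    }
    return log to threads.size
}

fun main() {
    val (log, threadCount) = interleaveOnOneThread()
    println("steps: $log, threads used: $threadCount")
}
```

Only one thread is involved here, yet the steps of c1 and c2 alternate, which is the switching done by the coroutine machinery rather than by the OS.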

Comparison and the difference

The operating system keeps track of each thread with a data structure called a thread control block (TCB).

The language runtime keeps track of each coroutine with a data structure called a continuation.

Each OS thread has its own stack in memory, which typically reserves on the order of a megabyte, whereas a coroutine's continuation usually takes at most a few kilobytes. Compare the amount of memory needed for thousands of threads with that needed for thousands of coroutines.

Coroutines also have other benefits:

They are non-blocking. Instead of being blocked and put aside by the OS, they suspend and free their underlying thread so that the thread can execute other code. (The mechanism a coroutine uses to learn that what it is waiting for is ready, so that it can resume, is probably polling, a callback, or OS interrupts.)
Coroutines have structured concurrency (more control over their lifetime and cancellation).

I also think that switching between threads (which is done by the OS) is more expensive than switching between coroutines.
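To illustrate the non-blocking point, here is a hedged sketch that assumes kotlinx.coroutines is available (the function name `suspendMany` and the numbers are made up for illustration). A thousand coroutines each wait 100 ms with `delay`, all on the single thread that calls `runBlocking`. Because `delay` suspends instead of blocking, the waits overlap.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.system.measureTimeMillis

// Hypothetical helper: launches `count` coroutines on the calling thread,
// each suspending for `waitMillis`, and returns the total elapsed time.
fun suspendMany(count: Int, waitMillis: Long): Long = measureTimeMillis {
    // runBlocking returns only after all child coroutines have finished,
    // which is one aspect of structured concurrency.
    runBlocking {
        repeat(count) {
            launch { delay(waitMillis) } // suspends, freeing the thread
        }
    }
}

fun main() {
    // Blocking the thread 1,000 times for 100 ms would take about 100 s.
    println("elapsed: ${suspendMany(1_000, 100)} ms")
}
```

The elapsed time should be far below the 100 seconds that 1,000 sequential blocking waits would need; the exact figure depends on the machine.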

Upvotes: 3

Roman Elizarov

Reputation: 28708

What does it mean that a coroutine is like a lightweight thread?

A coroutine, like a thread, represents a sequence of actions that are executed concurrently with other coroutines (threads).

What is the difference?

A thread is directly linked to a native thread in the underlying operating system (OS) and consumes a considerable amount of resources. In particular, it consumes a lot of memory for its stack. That is why you cannot just create 100k threads: you are likely to run out of memory. Switching between threads involves the OS kernel dispatcher, which makes it a pretty expensive operation in terms of CPU cycles consumed.

A coroutine, on the other hand, is purely a user-level language abstraction. It does not tie up any native resources and, in the simplest case, uses just one relatively small object in the JVM heap. That is why it is easy to create 100k coroutines. Switching between coroutines does not involve the OS kernel at all. It can be as cheap as invoking a regular function.
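A minimal sketch of the 100k case, assuming a current kotlinx.coroutines release in which `Dispatchers.Default` plays the role of a shared background pool (the function name `launchMany` is made up for illustration):

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: launches `count` coroutines on a shared pool and
// returns how many of them ran to completion.
fun launchMany(count: Int): Int {
    val finished = AtomicInteger()
    runBlocking {
        repeat(count) {
            launch(Dispatchers.Default) {
                delay(10) // suspended coroutines hold no thread
                finished.incrementAndGet()
            }
        }
    }
    return finished.get()
}

fun main() {
    println("finished: ${launchMany(100_000)}")
}
```

Doing the same with 100,000 threads would require 100,000 native stacks, which is where the memory concern above comes from.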

Are Kotlin coroutines actually running in parallel (concurrently)? Even in a multi-core system, is there only one coroutine running at any given time?

A coroutine can be either running or suspended. A suspended coroutine is not associated with any particular thread, but a running coroutine runs on some thread (using a thread is the only way to execute anything inside an OS process). Whether different coroutines all run on the same thread (and thus may use only a single CPU in a multicore system) or on different threads (and thus may use multiple CPUs) is purely in the hands of the programmer who is using coroutines.

In Kotlin, dispatching of coroutines is controlled via the coroutine context. You can read more about it in the Guide to kotlinx.coroutines.
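The following sketch shows how that choice might look with a current kotlinx.coroutines release. The function name `threadsUsed` is made up for illustration, and `Dispatchers.Default` and `asCoroutineDispatcher` are taken from that library, not from the original answer. The same coroutines are run once on a single-thread dispatcher and once on the shared default pool, and the number of distinct threads that executed them is counted.

```kotlin
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors

// Hypothetical helper: runs `coroutines` coroutines on `dispatcher` and
// returns how many distinct threads executed them.
fun threadsUsed(dispatcher: CoroutineDispatcher, coroutines: Int): Int {
    val seen = ConcurrentHashMap.newKeySet<Thread>()
    runBlocking {
        repeat(coroutines) {
            // The dispatcher is part of the coroutine context.
            launch(dispatcher) {
                seen.add(Thread.currentThread())
                delay(1) // after resuming, the thread may be a different one
                seen.add(Thread.currentThread())
            }
        }
    }
    return seen.size
}

fun main() {
    Executors.newSingleThreadExecutor().asCoroutineDispatcher().use { single ->
        println("single-thread dispatcher: ${threadsUsed(single, 1_000)} thread(s)")
    }
    println("Dispatchers.Default: ${threadsUsed(Dispatchers.Default, 1_000)} thread(s)")
}
```

With the single-thread dispatcher all coroutines share one thread; with the default pool the count depends on the number of CPU cores available.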

Here I'm starting 100,000 coroutines. What happens behind this code?

Assuming that you are using the launch function and the CommonPool context from the kotlinx.coroutines project (which is open source), you can examine their source code here:

launch is defined here https://github.com/Kotlin/kotlinx.coroutines/blob/master/core/kotlinx-coroutines-core/src/main/kotlin/kotlinx/coroutines/experimental/Builders.kt
CommonPool is defined here https://github.com/Kotlin/kotlinx.coroutines/blob/master/core/kotlinx-coroutines-core/src/main/kotlin/kotlinx/coroutines/experimental/CommonPool.kt

launch just creates a new coroutine, while CommonPool dispatches coroutines to ForkJoinPool.commonPool(), which does use multiple threads and thus executes on multiple CPUs in this example.

The code that follows the launch invocation in {...} is called a suspending lambda. What it is, how suspending lambdas and functions are implemented (compiled), and how standard library functions and classes like startCoroutine, suspendCoroutine and CoroutineContext work are all explained in the corresponding Kotlin coroutines design document.
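For a feel of those low-level pieces, here is a small sketch that uses only the Kotlin standard library (`kotlin.coroutines`), with no kotlinx.coroutines involved. The names `parked`, `waitForValue` and `demo` are made up for illustration. A suspending lambda is started with `startCoroutine`, suspends itself through `suspendCoroutine`, and is later resumed by hand through the stored continuation.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical holder for the continuation of the suspended coroutine.
var parked: Continuation<String>? = null

// Suspends the caller and stores its continuation instead of resuming it.
suspend fun waitForValue(): String = suspendCoroutine { cont -> parked = cont }

fun demo(): List<String> {
    val log = mutableListOf<String>()
    val body: suspend () -> Unit = {
        log.add("started")
        val value = waitForValue() // suspension point
        log.add("resumed with $value")
    }
    // Start the coroutine; the completion continuation runs when it ends.
    body.startCoroutine(Continuation<Unit>(EmptyCoroutineContext) { result ->
        log.add("completed: ${result.isSuccess}")
    })
    // startCoroutine returned because the coroutine suspended.
    log.add("suspended, caller continues")
    parked?.resume("hello") // resume the coroutine from ordinary code
    return log
}

fun main() {
    demo().forEach(::println)
}
```

No thread is blocked while the coroutine is suspended: `startCoroutine` simply returns to its caller, and the rest of the lambda runs inside the later `resume` call.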

Upvotes: 117

Ruslan

Reputation: 14640

Since I have used coroutines only on the JVM, I will talk about the JVM backend. There are also Kotlin Native and Kotlin JavaScript, but those backends are outside my scope.

So let's start by comparing Kotlin coroutines to coroutines in other languages. Basically, you should know that there are two types of coroutines: stackless and stackful. Kotlin implements stackless coroutines, which means that a coroutine doesn't have its own stack, and that limits a little what a coroutine can do. You can read a good explanation here.

Examples:

Stackless: C#, Scala, Kotlin
Stackful: Quasar, Javaflow

What does it mean that a coroutine is like a lightweight thread?

It means that a coroutine in Kotlin doesn't have its own stack, doesn't map onto a native thread, and doesn't require a context switch on the processor.

What is the difference?

Thread: preemptive multitasking (usually). Coroutine: cooperative multitasking.

Thread: managed by the OS (usually). Coroutine: managed by the user.
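The cooperative part can be sketched as follows, assuming kotlinx.coroutines is available (the function name `runTwo` is made up for illustration). Two coroutines share one thread. Without a suspension point nothing preempts the first coroutine, so it runs to the end before the second one starts. With `yield()` the two take turns.

```kotlin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Hypothetical helper: runs coroutines "a" and "b" on the calling thread
// and returns the order in which their steps executed.
fun runTwo(cooperate: Boolean): List<String> {
    val log = mutableListOf<String>()
    runBlocking {
        for (name in listOf("a", "b")) {
            launch {
                repeat(3) { step ->
                    log.add("$name$step")
                    // Only a suspension point gives the other coroutine a turn.
                    if (cooperate) yield()
                }
            }
        }
    }
    return log
}

fun main() {
    println("without yield: ${runTwo(cooperate = false)}")
    println("with yield:    ${runTwo(cooperate = true)}")
}
```

A thread in the same situation could be interrupted by the OS at any statement; a coroutine gives up its thread only where it suspends.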

Are Kotlin coroutines actually running in parallel (concurrently)?

It depends. You can run each coroutine in its own thread, or you can run all coroutines in one thread or some fixed thread pool.

More about how coroutines execute is here.

Even in a multi-core system, is there only one coroutine running at any given time?

No, see the previous answer.

Here I'm starting 100,000 coroutines. What happens behind this code?

Actually, it depends. But assume that you write the following code:

fun main(args: Array<String>) {
    for (i in 0..100000) {
        async(CommonPool) {
            delay(1000)
        }
    }
}

This code finishes almost instantly, because main returns without waiting for the results of the async calls.

So let's fix this:

fun main(args: Array<String>) = runBlocking {
    for (i in 0..100000) {
        val job = async(CommonPool) {
            delay(1)
            println(i)
        }

        job.join()
    }
}

When you run this program, Kotlin will create roughly 2 * 100,001 instances of Continuation, which will take a few dozen MB of RAM, and in the console you will see the numbers from 0 to 100000.

So let’s rewrite this code in this way:

fun main(args: Array<String>) = runBlocking {

    val job = async(CommonPool) {
        for (i in 0..100000) {
            delay(1)
            println(i)
        }
    }

    job.join()
}

What do we achieve now? Now we create only 100,001 instances of Continuation, and this is much better.

Each created Continuation will be dispatched and executed on CommonPool (which is a static instance of ForkJoinPool).
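For comparison, here is a hedged sketch of how the first variant might be written against a current kotlinx.coroutines release. It assumes `Dispatchers.Default` as the replacement for the old experimental `CommonPool`, and the function name `sumConcurrently` is made up for illustration. Instead of joining each coroutine inside the loop, which makes them run one after another, it starts all of them first and then waits for every result with `awaitAll`.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical helper: starts `count` coroutines that each produce a
// number, waits for all of them, and returns the sum of the results.
fun sumConcurrently(count: Int): Long = runBlocking {
    val results = (1..count).map { i ->
        async(Dispatchers.Default) {
            delay(1) // stands in for a suspending long-running operation
            i.toLong()
        }
    }
    // Suspends until every Deferred has completed.
    results.awaitAll().sum()
}

fun main() {
    println(sumConcurrently(100_000))
}
```

Because all coroutines are started before any of them is awaited, their delays overlap instead of adding up.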

Upvotes: 93
