user284331

Reputation: 213

Is UpsertAsync in Couchbase really asynchronous?

I am trying to upsert some documents into Couchbase using C#. Here is my (very simple) code:

var cluster = await Cluster.ConnectAsync("127.0.0.1", "Admin", "*****");
var bucket = await cluster.BucketAsync("Embedding");
var scope = await bucket.ScopeAsync("Testing");
var collection = await scope.CollectionAsync("Testing");

int i = 0;

Stopwatch stopwatch = Stopwatch.StartNew();

while (i < 10000)
{
    string Id = "my-document-" + i;

    var input = new { Name = "Ted", Number = i };

    // Not awaited: the returned Task is discarded (fire-and-forget).
    collection.UpsertAsync(Id, input);

    i++;
}

stopwatch.Stop();

Console.WriteLine($"Complete in {stopwatch.ElapsedMilliseconds / 1000.00} seconds");

So it simply upserts 10000 documents of this simple form. I cannot believe that it only manages around 150 documents per second. My teammate says this is totally unacceptable: he wrote Java code using the official Couchbase SDK, and it took only 300 ms to upsert 10000 documents.

I am running Couchbase in Docker, and I started it with

docker run -d --name db -p 8091-8096:8091-8096 -p 11210-11211:11210-11211 couchbase

This is the command suggested by the official Couchbase website, so I am not pulling the image from some unofficial source.

So I began to suspect a hardware problem with my laptop. I am using an i5-8250U CPU with 12.0 GB of RAM, which I think should be more than sufficient for upserting such simple documents.

By the way, my teammate is using Apple M2 16 GB.

Even harder to explain: when I switched to another laptop with higher-end hardware (i5-1135G7, 32.0 GB RAM), it took 6 seconds to complete those 10000 documents. Coincidentally, when my teammate switched his Java code from the async version to the sync one, it took 4 seconds, so I now suspect that UpsertAsync in C# may not behave like an asynchronous task.

By the way, I noticed that there is no non-async version of upsert in the C# SDK.

In sum, there are two questions:

  1. Can the hardware difference between my old laptop (i5-8250U, 12.0 GB RAM) and the new one (i5-1135G7, 32.0 GB RAM) explain upsert speeds as different as about 150 documents per second versus 10000 documents in 6 seconds?

  2. Is UpsertAsync in C# really asynchronous? Its speed is very similar to that of the non-async upsert in Java, as my teammate showed me.

The Java code written by my teammate:

package com.ecom.demo.couchbase;

import com.couchbase.client.java.*;
import com.couchbase.client.java.kv.*;
import com.couchbase.client.java.json.*;
import com.couchbase.client.java.query.*;
import reactor.core.publisher.Mono;

import java.time.Duration;
import java.util.List;
import java.util.stream.IntStream;

public class StartUsing {
    // Couchbase Server instance and credentials.
    static String connectionString = "couchbase://192.168.11.11";
    static String username = "Administrator";
    static String password = "*****";
    static String bucketName = "*****";

    public static void main(String... args) throws InterruptedException {
        Cluster cluster = Cluster.connect(
                connectionString,
                ClusterOptions.clusterOptions(username, password).environment(env -> {
                })
        );

        Bucket bucket = cluster.bucket(bucketName);
        bucket.waitUntilReady(Duration.ofSeconds(10));

        Scope scope = bucket.scope("test");
        Collection collection = scope.collection("data");
        ReactiveCollection reactiveCollection = collection.reactive();

        System.out.println("connected");

        int count = 10_000;

        sync(count, collection);

        //async(count, reactiveCollection);
    }

    static void sync(int count, Collection collection) {

        long start = System.currentTimeMillis();

        for (int i=1; i<=10000; i++) {
            var id = "my-document-" + i;
            var doc = JsonObject.create()
                    .put("name", "mike")
                    .put("age", i);

            collection.upsert(id, doc);
        }

        long end = System.currentTimeMillis();

        System.out.println(end - start);
    }

    static void async(int count, ReactiveCollection reactiveCollection) {

        long start = System.currentTimeMillis();

        List<Mono<MutationResult>> upsertTasks = IntStream.rangeClosed(1, 100_000)
                .mapToObj(i -> {
                    String id = "my-document-" + i;
                    JsonObject doc = JsonObject.create()
                            .put("name", "mike")
                            .put("age", i);
                    return reactiveCollection.upsert(id, doc)
                            .doOnError(error -> System.err.println("Upsert failed: " + error));
                })
                .toList(); 

        Mono.when(upsertTasks)
                .doOnTerminate(() -> {
                    long end = System.currentTimeMillis(); 
                    System.out.println("Total time (ms): " + (end - start));
                })
                .block(); 

        long end = System.currentTimeMillis();

        System.out.println(end - start);
    }
}

Speed test on my newer laptop (the 32 GB one), timed with a Stopwatch just as in the C# code:

1 document = 0.022 seconds,
10 documents = 0.03 seconds,
100 documents = 0.074 seconds,
1000 documents = 0.412 seconds,
10000 documents = 5.713 seconds,
100000 documents = 55.072 seconds
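
For reference, the timings listed above can be converted to throughput in documents per second. The following is a small Java sketch of that arithmetic only; the class and method names (ThroughputSketch, docsPerSecond) are illustrative and not part of any SDK.

```java
public class ThroughputSketch {

    // Documents per second for a run of `documents` upserts that took `seconds`.
    public static double docsPerSecond(int documents, double seconds) {
        return documents / seconds;
    }

    public static void main(String[] args) {
        // Timings copied from the measurements listed above.
        int[] documents = {1, 10, 100, 1_000, 10_000, 100_000};
        double[] seconds = {0.022, 0.03, 0.074, 0.412, 5.713, 55.072};

        for (int i = 0; i < documents.length; i++) {
            System.out.printf("%d documents: %.0f docs/s%n",
                    documents[i], docsPerSecond(documents[i], seconds[i]));
        }
    }
}
```

By this calculation the 10000-document run comes to roughly 1750 documents per second and the 100000-document run to roughly 1816, so throughput stays nearly flat once the batch is large.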

Below is the Java code I tested on my 32 GB laptop:

package com.ecom.demo.couchbase;

import com.couchbase.client.java.*;
import com.couchbase.client.java.kv.*;
import com.couchbase.client.java.json.*;
import com.couchbase.client.java.query.*;
import reactor.core.publisher.Mono;

import java.time.Duration;
import java.util.List;
import java.util.stream.IntStream;

public class StartUsing {

    static String connectionString = "127.0.0.1";
    static String username = "Administrator";
    static String password = "*****";
    static String bucketName = "Embedding";

    public static void main(String... args) throws InterruptedException {
        Cluster cluster = Cluster.connect(
                connectionString,
                ClusterOptions.clusterOptions(username, password).environment(env -> {
                })
        );

        Bucket bucket = cluster.bucket(bucketName);
        bucket.waitUntilReady(Duration.ofSeconds(10));

        // get a user-defined collection reference
        Scope scope = bucket.scope("Testing");
        Collection collection = scope.collection("Testing");
        ReactiveCollection reactiveCollection = collection.reactive();

        System.out.println("connected");

        int count = 10_000;

        sync(count, collection);

        //async(count, reactiveCollection);

    }

    static void sync(int count, Collection collection) {

        long start = System.currentTimeMillis();

        for (int i=1; i<=count; i++) {
            var id = "my-document-" + i;
            var doc = JsonObject.create()
                    .put("Name", "Ted")
                    .put("Number", i);

            // sync
            collection.upsert(id, doc);
        }

        long end = System.currentTimeMillis();

        System.out.println(end - start);
    }

    static void async(int count, ReactiveCollection reactiveCollection) {

        long start = System.currentTimeMillis();

        List<Mono<MutationResult>> upsertTasks = IntStream.rangeClosed(1, count)
                .mapToObj(i -> {
                    String id = "my-document-" + i;
                    JsonObject doc = JsonObject.create()
                            .put("Name", "Ted")
                            .put("Number", i);
                    return reactiveCollection.upsert(id, doc)
                            .doOnError(error -> System.err.println("Upsert failed: " + error));
                })
                .toList();

        Mono.when(upsertTasks)
                .doOnTerminate(() -> {
                    long end = System.currentTimeMillis();
                    System.out.println("Total time (ms): " + (end - start));
                })
                .block();

        long end = System.currentTimeMillis();

        System.out.println(end - start);
    }
}

For 10000 documents, the sync took 3.4 seconds, and the async took 1.62 seconds.

With the C# version (I modified the code a little: the Guid is removed, as suggested below by mjwills), the non-awaited async run took 5.4 seconds, and the awaited async run took 8.58 seconds.

As suggested by Amos, I also tried this:

Stopwatch stopwatch = Stopwatch.StartNew();

int i = 0;

var tasks = new Task[10000];

while (i < 10000)
{
    string Id = "my-document-" + i;

    var input = new { Name = "Ted", Number = i };

    tasks[i] = collection.UpsertAsync(Id, input);

    i++;
}

Task.WaitAll(tasks);

stopwatch.Stop();

It took 6.55 seconds.

Upvotes: 1

Views: 155

Answers (1)

mn_test347

Reputation: 452

UpsertAsync is indeed asynchronous. The source of the Couchbase .NET SDK is here:

https://github.com/couchbase/couchbase-net-client
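
How long a batch of asynchronous calls appears to take depends largely on when the caller waits for them. The sketch below illustrates the difference in Java (the language of the teammate's code in the question), using only the standard library. It makes no Couchbase call: fakeUpsert is a hypothetical stand-in that completes about 10 ms after it is started, and the class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class AsyncTimingSketch {

    // Hypothetical stand-in for an async upsert: completes about 10 ms later
    // without blocking the caller. No Couchbase API is involved.
    static CompletableFuture<String> fakeUpsert(String id) {
        return CompletableFuture.supplyAsync(
                () -> id,
                CompletableFuture.delayedExecutor(10, TimeUnit.MILLISECONDS));
    }

    // Start one operation, wait for it, then start the next
    // (comparable to awaiting each call inside the loop).
    public static long sequentialMillis(int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            fakeUpsert("my-document-" + i).join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Start every operation first, then wait for all of them together
    // (comparable to collecting the tasks and waiting once at the end).
    public static long concurrentMillis(int count) {
        long start = System.nanoTime();
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            pending.add(fakeUpsert("my-document-" + i));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int count = 50;
        System.out.println("sequential ms: " + sequentialMillis(count));
        System.out.println("concurrent ms: " + concurrentMillis(count));
    }
}
```

In this simulation the sequential variant pays the 10 ms delay once per operation, while the concurrent variant overlaps the delays. A real SDK adds serialization, network, and server costs on top, so the gap measured against an actual cluster may be much smaller than it is here.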

Upvotes: 0
