Reputation: 213
I am trying to upsert some documents into Couchbase using C#. Here is my (very simple) code:
using System;
using System.Diagnostics;
using Couchbase;

var cluster = await Cluster.ConnectAsync("127.0.0.1", "Admin", "*****");
var bucket = await cluster.BucketAsync("Embedding");
var scope = await bucket.ScopeAsync("Testing");
var collection = await scope.CollectionAsync("Testing");

int i = 0;
Stopwatch stopwatch = Stopwatch.StartNew();
while (i < 10000)
{
    string Id = "my-document-" + i;
    var input = new { Name = "Ted", Number = i };
    // Note: the task returned by UpsertAsync is not awaited here.
    collection.UpsertAsync(Id, input);
    i++;
}
stopwatch.Stop();
Console.WriteLine($"Complete in {stopwatch.ElapsedMilliseconds / 1000.00} seconds");
It simply upserts 10,000 documents of this simple form, yet it manages only about 150 documents per second. My teammate said this is totally unacceptable: he wrote Java code using the official Couchbase SDK, and it took only 300 ms to upsert 10,000 documents.
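One thing worth keeping in mind when reading the C# timing: the loop never awaits the task returned by UpsertAsync, so the Stopwatch may stop while operations are still in flight. The Java sketch below illustrates that effect with a simulated operation. `fakeUpsert` and its fixed latency are hypothetical stand-ins for illustration only, not Couchbase calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class FireAndForgetTiming {
    // Hypothetical stand-in for an asynchronous upsert: completes after latencyMs.
    static CompletableFuture<String> fakeUpsert(String id, long latencyMs) {
        return CompletableFuture.supplyAsync(
                () -> id,
                CompletableFuture.delayedExecutor(latencyMs, TimeUnit.MILLISECONDS));
    }

    // Returns {ms elapsed when the loop ends, ms elapsed when every operation has completed}.
    static long[] measure(int count, long latencyMs) {
        long start = System.nanoTime();
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // The operation is started but nobody waits for it yet.
            pending.add(fakeUpsert("my-document-" + i, latencyMs));
        }
        long dispatched = (System.nanoTime() - start) / 1_000_000;

        // Only after this join have all operations actually finished.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        long completed = (System.nanoTime() - start) / 1_000_000;
        return new long[] {dispatched, completed};
    }

    public static void main(String[] args) {
        long[] t = measure(1000, 200);
        System.out.println("loop finished after " + t[0] + " ms");
        System.out.println("all operations completed after " + t[1] + " ms");
    }
}
```

In this simulation the second number can never be smaller than the simulated latency, while the first only reflects how long it took to start the operations.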
I am running Couchbase in Docker, started with
docker run -d --name db -p 8091-8096:8091-8096 -p 11210-11211:11210-11211 couchbase
This is the command suggested on the official Couchbase website, so the image does not come from some unofficial source.
So I began to suspect a hardware problem with my laptop. I am using an i5-8250U CPU with 12 GB of RAM, which I think is more than sufficient for upserting such simple documents.
By the way, my teammate is using an Apple M2 with 16 GB.
Even harder to explain: when I switched to another laptop with higher-end hardware (i5-1135G7, 32 GB of RAM), it took 6 seconds to upsert those 10,000 documents. Coincidentally, when my teammate switched his Java code from the async version to the sync one, it took 4 seconds, so I now suspect that UpsertAsync in C# may not behave like an asynchronous task.
By the way, I noticed that there is no non-async version of upsert in C#.
In sum, I have two questions:
Can the difference between my old laptop (i5-8250U, 12 GB of RAM) and the new one (i5-1135G7, 32 GB of RAM) really account for such different upsert speeds, roughly 150 documents per second versus 10,000 documents in 6 seconds?
Is UpsertAsync in C# really async? Its speed is very similar to the non-async upsert in Java that my teammate showed me.
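To make the second question concrete, here is a minimal Java sketch of why waiting for each operation in turn tends to cost roughly the number of operations times the round-trip latency, while starting everything first and waiting once lets the round trips overlap. It uses a simulated operation: `fakeUpsert` and the 5 ms latency are assumptions for illustration, not Couchbase APIs or measured values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class SequentialVsConcurrent {
    // Hypothetical stand-in for one upsert round trip; not a Couchbase call.
    static CompletableFuture<Integer> fakeUpsert(int i, long latencyMs) {
        return CompletableFuture.supplyAsync(
                () -> i,
                CompletableFuture.delayedExecutor(latencyMs, TimeUnit.MILLISECONDS));
    }

    // Wait for each operation before starting the next one.
    static long sequentialMs(int count, long latencyMs) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            fakeUpsert(i, latencyMs).join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Start every operation first, then wait for all of them together.
    static long concurrentMs(int count, long latencyMs) {
        long start = System.nanoTime();
        List<CompletableFuture<Integer>> pending = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            pending.add(fakeUpsert(i, latencyMs));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("sequential: " + sequentialMs(100, 5) + " ms");
        System.out.println("concurrent: " + concurrentMs(100, 5) + " ms");
    }
}
```

If an async API shows timings close to the sequential case, one possible explanation is that each call is effectively being waited on before the next one starts; the sketch does not prove that this is what happens in the C# SDK.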
My teammate's Java code:
package com.ecom.demo.couchbase;

import com.couchbase.client.java.*;
import com.couchbase.client.java.kv.*;
import com.couchbase.client.java.json.*;
import com.couchbase.client.java.query.*;
import reactor.core.publisher.Mono;
import java.time.Duration;
import java.util.List;
import java.util.stream.IntStream;

public class StartUsing {
    // Update these variables to point to your Couchbase Server instance and credentials.
    static String connectionString = "couchbase://192.168.11.11";
    static String username = "Administrator";
    static String password = "*****";
    static String bucketName = "*****";

    public static void main(String... args) throws InterruptedException {
        Cluster cluster = Cluster.connect(
            connectionString,
            ClusterOptions.clusterOptions(username, password).environment(env -> {
            })
        );
        Bucket bucket = cluster.bucket(bucketName);
        bucket.waitUntilReady(Duration.ofSeconds(10));
        Scope scope = bucket.scope("test");
        Collection collection = scope.collection("data");
        ReactiveCollection reactiveCollection = collection.reactive();
        System.out.println("connected");
        int count = 10_000;
        sync(count, collection);
        //async(count, reactiveCollection);
    }

    // Blocking version: each upsert waits for the server before the next one starts.
    static void sync(int count, Collection collection) {
        long start = System.currentTimeMillis();
        for (int i = 1; i <= count; i++) {
            var id = "my-document-" + i;
            var doc = JsonObject.create()
                .put("name", "mike")
                .put("age", i);
            collection.upsert(id, doc);
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }

    // Reactive version: builds all upserts first, then waits for all of them to finish.
    static void async(int count, ReactiveCollection reactiveCollection) {
        long start = System.currentTimeMillis();
        List<Mono<MutationResult>> upsertTasks = IntStream.rangeClosed(1, 100_000)
            .mapToObj(i -> {
                String id = "my-document-" + i;
                JsonObject doc = JsonObject.create()
                    .put("name", "mike")
                    .put("age", i);
                return reactiveCollection.upsert(id, doc)
                    .doOnError(error -> System.err.println("Upsert failed: " + error));
            })
            .toList();
        Mono.when(upsertTasks)
            .doOnTerminate(() -> {
                long end = System.currentTimeMillis();
                System.out.println("Total time (ms): " + (end - start));
            })
            .block();
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
Speed test on my newer laptop (the 32 GB one), timed with a Stopwatch just as in the C# code:
1 document = 0.022 seconds,
10 documents = 0.03 seconds,
100 documents = 0.074 seconds,
1000 documents = 0.412 seconds,
10000 documents = 5.713 seconds,
100000 documents = 55.072 seconds
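To make these measurements easier to compare, the small Java sketch below converts each one into documents per second and milliseconds per document. The numbers are the ones listed above; the class and method names are made up for illustration.

```java
import java.util.Locale;

class UpsertThroughput {
    // Documents written per second for a run of `docs` documents taking `seconds`.
    static double docsPerSecond(int docs, double seconds) {
        return docs / seconds;
    }

    // Average time spent per document, in milliseconds.
    static double msPerDoc(int docs, double seconds) {
        return seconds * 1000.0 / docs;
    }

    public static void main(String[] args) {
        // Measurements copied from the list above.
        int[] docs = {1, 10, 100, 1000, 10000, 100000};
        double[] seconds = {0.022, 0.03, 0.074, 0.412, 5.713, 55.072};
        for (int k = 0; k < docs.length; k++) {
            System.out.println(String.format(Locale.ROOT,
                    "%6d docs: %8.1f docs/s, %7.3f ms/doc",
                    docs[k],
                    docsPerSecond(docs[k], seconds[k]),
                    msPerDoc(docs[k], seconds[k])));
        }
    }
}
```

For the largest run this works out to roughly 1,816 documents per second, or about 0.55 ms per document, while the single-document run is dominated by a fixed cost of about 22 ms.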
Below is the Java version I tested on my 32 GB laptop:
package com.ecom.demo.couchbase;

import com.couchbase.client.java.*;
import com.couchbase.client.java.kv.*;
import com.couchbase.client.java.json.*;
import com.couchbase.client.java.query.*;
import reactor.core.publisher.Mono;
import java.time.Duration;
import java.util.List;
import java.util.stream.IntStream;

public class StartUsing {
    static String connectionString = "127.0.0.1";
    static String username = "Administrator";
    static String password = "*****";
    static String bucketName = "Embedding";

    public static void main(String... args) throws InterruptedException {
        Cluster cluster = Cluster.connect(
            connectionString,
            ClusterOptions.clusterOptions(username, password).environment(env -> {
            })
        );
        Bucket bucket = cluster.bucket(bucketName);
        bucket.waitUntilReady(Duration.ofSeconds(10));
        // get a user-defined collection reference
        Scope scope = bucket.scope("Testing");
        Collection collection = scope.collection("Testing");
        ReactiveCollection reactiveCollection = collection.reactive();
        System.out.println("connected");
        int count = 10_000;
        sync(count, collection);
        //async(count, reactiveCollection);
    }

    static void sync(int count, Collection collection) {
        long start = System.currentTimeMillis();
        for (int i = 1; i <= count; i++) {
            var id = "my-document-" + i;
            var doc = JsonObject.create()
                .put("Name", "Ted")
                .put("Number", i);
            // sync
            collection.upsert(id, doc);
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }

    static void async(int count, ReactiveCollection reactiveCollection) {
        long start = System.currentTimeMillis();
        List<Mono<MutationResult>> upsertTasks = IntStream.rangeClosed(1, count)
            .mapToObj(i -> {
                String id = "my-document-" + i;
                JsonObject doc = JsonObject.create()
                    .put("Name", "Ted")
                    .put("Number", i);
                return reactiveCollection.upsert(id, doc)
                    .doOnError(error -> System.err.println("Upsert failed: " + error));
            })
            .toList();
        Mono.when(upsertTasks)
            .doOnTerminate(() -> {
                long end = System.currentTimeMillis();
                System.out.println("Total time (ms): " + (end - start));
            })
            .block();
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
For 10,000 documents, the sync version took 3.4 seconds and the async version took 1.62 seconds.
For the C# version (I modified the code slightly, removing the Guid as suggested below by mjwills), the non-awaited async version took 5.4 seconds and the awaited version took 8.58 seconds.
As suggested by Amos, I also tried this:
Stopwatch stopwatch = Stopwatch.StartNew();
int i = 0;
var tasks = new Task[10000];
while (i < 10000)
{
    string Id = "my-document-" + i;
    var input = new { Name = "Ted", Number = i };
    tasks[i] = collection.UpsertAsync(Id, input);
    i++;
}
Task.WaitAll(tasks);
stopwatch.Stop();
It took 6.55 seconds.
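Starting all 10,000 operations and then waiting on them together is one way to overlap the round trips. A variation that is sometimes tried is to cap how many operations are in flight at a time. The Java sketch below shows that idea with a Semaphore. `fakeUpsert`, the 5 ms latency, and the limit of 16 are all assumptions for illustration, not Couchbase APIs or recommended settings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedUpserts {
    // Hypothetical stand-in for an async upsert; completes after about 5 ms.
    static CompletableFuture<String> fakeUpsert(String id, AtomicInteger inFlight) {
        return CompletableFuture.supplyAsync(
                () -> {
                    inFlight.decrementAndGet(); // the simulated operation has finished
                    return id;
                },
                CompletableFuture.delayedExecutor(5, TimeUnit.MILLISECONDS));
    }

    // Runs `count` simulated upserts with at most `maxInFlight` outstanding at once.
    // Returns the highest number of simultaneously outstanding operations observed.
    static int upsertAll(int count, int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        int maxObserved = 0;
        List<CompletableFuture<String>> tasks = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            permits.acquireUninterruptibly(); // blocks while the cap is reached
            maxObserved = Math.max(maxObserved, inFlight.incrementAndGet());
            tasks.add(fakeUpsert("my-document-" + i, inFlight)
                    .whenComplete((result, error) -> permits.release()));
        }
        // Wait for whatever is still outstanding.
        CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0])).join();
        return maxObserved;
    }

    public static void main(String[] args) {
        int peak = upsertAll(200, 16);
        System.out.println("peak operations in flight: " + peak);
    }
}
```

Whether a cap helps, and what value suits a given setup, would have to be measured; the sketch only shows the mechanics.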
Upvotes: 1
Views: 155
Reputation: 452
UpsertAsync is indeed asynchronous; you can check the implementation in the SDK source:
https://github.com/couchbase/couchbase-net-client
Upvotes: 0