riorio

Reputation: 6826

Couchbase Get operation slows down when the number of incoming threads increases

Summary:

We have a major performance issue with Spring Boot 2.0.4 and Couchbase Server 5.5.1.

DB response time degrades rapidly as the number of threads increases. Here is another report about the issue.

In Details:

Spring Boot is running with 500 threads:

server:
  tomcat:
    max-threads: 500
    max-connections: 500

We are using the following dependency:

    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-couchbase</artifactId>
        <version>3.0.9.RELEASE</version>
    </dependency>

Our "select" from DB is performed with Spring-Data repository:

Cat findFirstByOwnerIdAndNameAndColor(String ownerId, String name, String color);

We have an index that is especially for this query:

CREATE INDEX `cat_by_ownerId_name_and_color_idx` ON `pets`(`ownerId`,`name`,`color`) WHERE (`_class` = "com.example.Cat")

As the number of requests increases, we see a quick degradation in the time it takes the DB to answer the query.

For example, when running 300 requests per second, the 99th percentile of response time is about 10 seconds, and the 50th percentile is about 5 seconds.

The average size of the returned document is about 300 bytes, meaning that we are trying to extract about 90 kilobytes per second, which is a relatively low amount.

Edit:

I'm adding here the result of running the same query in the UI of Couchbase: (In the UI, the query takes 1.75ms to complete).

{
 "plan": {
  "#operator": "Sequence",
  "~children": [
  {
    "#operator": "IndexScan3",
    "index": "cats_by_ownerId_name_and_color_idx",
    "index_id": "c061141c2d373067",
    "index_projection": {
      "primary_key": true
    },
    "keyspace": "pets",
    "namespace": "default",
    "spans": [
      {
        "exact": true,
        "range": [
          {
            "high": "\"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\"",
            "inclusion": 3,
            "low": "\"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\""
          },
          {
            "high": "\"Oscar\"",
            "inclusion": 3,
            "low": "\"Oscar\""
          },
          {
            "high": "\"red\"",
            "inclusion": 3,
            "low": "\"red\""
          }
        ]
      }
    ],
    "using": "gsi"
  },
  {
    "#operator": "Fetch",
    "keyspace": "pets",
    "namespace": "default"
  },
  {
    "#operator": "Parallel",
    "~child": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Filter",
          "condition": "(((((`pets`.`_class`) = \"com.example.Cat\") and ((`pets`.`ownerId`) = \"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\")) and ((`pets`.`name`) = \"Oscar\")) and ((`pets`.`color`) = \"red\"))"
        },
        {
          "#operator": "InitialProject",
          "result_terms": [
            {
              "expr": "self",
              "star": true
            }
          ]
        },
        {
          "#operator": "FinalProject"
        }
      ]
    }
  }
]
 },
 "text": "select * from pets where _class=\"com.example.Cat\" and projectId=\"bf23fa4c-22c3-42ac-b141-39cdc76bb2x5\" and name=\"Oscar\" and color=\"red\""

}

EDIT 2

We also tried writing the N1QL query explicitly, but the outcome is the same. As before, we get many TimeoutExceptions:

   Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.dao.QueryTimeoutException: java.util.concurrent.TimeoutException: {"b":"pets","s":"n1ql","t":7500000,"i":"f8cdf670-d32a-4d74-858c-f9dd9789d264"}; nested exception is java.lang.RuntimeException: java.util.concurrent.TimeoutException: {"b":"pets","s":"n1ql","t":7500000,"i":"f8cdf670-d32a-4d74-858c-f9dd9789d264"}] with root cause

java.util.concurrent.TimeoutException: {"b":"pets","s":"n1ql","t":7500000,"i":"f8cdf670-d32a-4d74-858c-f9dd9789d264"}
   at com.couchbase.client.java.bucket.api.Utils$1.call(Utils.java:131) ~[java-client-2.7.0.jar:na]
   at com.couchbase.client.java.bucket.api.Utils$1.call(Utils.java:127) ~[java-client-2.7.0.jar:na]
   at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:140) ~[rxjava-1.3.8.jar:1.3.8]
   at rx.internal.operators.OnSubscribeTimeoutTimedWithFallback$TimeoutMainSubscriber.onTimeout(OnSubscribeTimeoutTimedWithFallback.java:166) ~[rxjava-1.3.8.jar:1.3.8]
   at rx.internal.operators.OnSubscribeTimeoutTimedWithFallback$TimeoutMainSubscriber$TimeoutTask.call(OnSubscribeTimeoutTimedWithFallback.java:191) ~[rxjava-1.3.8.jar:1.3.8]
   at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) ~[rxjava-1.3.8.jar:1.3.8]
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_161]
   at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_161]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_161]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_161]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_161]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_161]
   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]

Is there a way to fix this, or do we need a different DB?

Upvotes: 1

Views: 1685

Answers (1)

riorio

Reputation: 6826

After further investigation, we found that the problem was in the Spring-Data component.

To overcome it, we had to move to a non-blocking mechanism.

We did 2 things:

  • All the calls from the controller layer down to the service and repository layers were changed to return CompletableFuture<Cat>.
  • To bypass the Spring-Data connection to Couchbase, we created a repository class of our own, with implementation code that looks something like this:

    Statement statement = select("*")
            .from(i(bucket.name()))
            .where(x("name").eq(s(name))
                    .and(x("ownerId").eq(s(ownerId)))
                    .and(x("color").eq(s(color)))
                    .and(x("_class").eq(s("com.example.Cat"))));
    
    CompletableFuture<Cat> completableFuture = new CompletableFuture<>();
    bucket.async().query(statement)
    ...
    

    After we did that, the latency problem disappeared and queries now take about 2 milliseconds each, even with a few hundred concurrent requests.
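
For readers wondering what the elided part might look like, here is a minimal, self-contained sketch of the general pattern: completing a CompletableFuture from an asynchronous callback instead of blocking a request thread. It is not our actual repository code and it does not use the Couchbase SDK. `AsyncResult`, `toFuture` and `fakeQuery` are hypothetical names invented for this illustration, with `fakeQuery` standing in for something like `bucket.async().query(statement)`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class AsyncBridgeSketch {

    /** Hypothetical stand-in for a callback-style asynchronous query result. */
    interface AsyncResult<T> {
        void subscribe(Consumer<T> onSuccess, Consumer<Throwable> onError);
    }

    /** Bridges a callback-style result to a CompletableFuture without blocking the caller. */
    static <T> CompletableFuture<T> toFuture(AsyncResult<T> result) {
        CompletableFuture<T> future = new CompletableFuture<>();
        // The future is completed later, on whichever thread delivers the callback.
        result.subscribe(future::complete, future::completeExceptionally);
        return future;
    }

    /** Hypothetical fake query: delivers its outcome on the given executor's thread. */
    static AsyncResult<String> fakeQuery(ExecutorService io, String name, boolean fail) {
        return (onSuccess, onError) -> io.execute(() -> {
            if (fail) {
                onError.accept(new IllegalStateException("query failed"));
            } else {
                onSuccess.accept("Cat(" + name + ")");
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService io = Executors.newSingleThreadExecutor();
        try {
            // Success path: the caller gets a future back immediately.
            CompletableFuture<String> cat = toFuture(fakeQuery(io, "Oscar", false));
            System.out.println(cat.get(1, TimeUnit.SECONDS));

            // Failure path: the error surfaces through the future, not as a thrown exception.
            CompletableFuture<String> failed =
                    toFuture(fakeQuery(io, "Oscar", true)).exceptionally(t -> "fallback");
            System.out.println(failed.get(1, TimeUnit.SECONDS));
        } finally {
            io.shutdown();
        }
    }
}
```

The `get(...)` calls in `main` are there only so the demo prints something. In a controller you would return the CompletableFuture itself and let the framework complete the response, so that no request thread waits on the query.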

Upvotes: 1
