overexchange
overexchange

Reputation: 1

Why INSERT query is 3-4 times slower than SELECT?

Application talks to YugabyteDB instance with YCQL driver(Gocql). We have below two queries (for example):

SELECT col2,
       col3,
       col4,
       col5
FROM   table1
WHERE  primarykeycol = 11ab8b12 - a934 - 4f2e - 8a0d - e7eba3faa47f;

INSERT INTO table1
            (
                        primarykeycol,
                        col2,
                        col3,
                        col4,
                        col5
            )
            VALUES
            (
                        11ab8b12-a934-4f2e-8a0d-e7eba3faa47f,
                        222b8b12-a934-4f2e-8a0d-e7eba3faa47f,
                        'someString',
                        totimestamp(now()),
                        totimestamp(now())
            )
            IF NOT EXISTS returns status AS ROW;

From the last 7 days, data metrics from YugaWare, each SELECT query (for example shown above) is taking 10ms consistently whereas each INSERT query takes 30-40ms.


When running explain statement on SELECT query says:

QUERY PLAN

Primary Key look up on keySpaceName.table1
     Key Conditions: (primarykeyCol = 11ab8b12-a934-4f2e-8a0d-e7eba3faa47f)

Running explain statement on INSERT query says:

QUERY PLAN

Insert on keySpaceName.table1

The replication factor of the node cluster is 3.

Based on YugaWare metrics, Why INSERT query is 3-4 times slower than SELECT?

Upvotes: 3

Views: 455

Answers (2)

Bala S
Bala S

Reputation: 31

Adding to above comments, INSERT might be slower for your use case as it needs to load the indexes, e.g. if. you have table with Unique indexes. YBDB stores these indexes as tables and it needs to insert those as well and it will get distributed/replicated to other nodes as well.

Upvotes: 1

dh YB
dh YB

Reputation: 1097

If your RF > 1, insert operations go through a distributed transaction cycle whereby records are stored on a majority of replicas (if RF=3, then 2; if RF=4, then 3, it RF=7, then 5). However, in the case of a SELECT, the record is read from only one tablet (assuming you are retrieving 1 record).

Assuming you're selecting the same row over and over, it should be cached in memory and you should experience mostly the network latency. This will also depend on whether you're hitting the node that has the tablet-leader or not.

Upvotes: 1

Related Questions