Reputation: 2612
The database contains only 2 tables: transaction and wallet.
CockroachDB 19.2.6 runs on 3 Ubuntu machines, vs SQL Server 2019 on a single Windows Server 2019 machine.
Here is the query:
select * from transaction t
join wallet s on t.sender_id=s.id
join wallet r on t.receiver_id=r.id
limit 10;
1) I know that the infrastructure comparison is not fair to CockroachDB, but even so, the difference is really too big. Am I missing something, or is CockroachDB just very slow for this particular SQL query?
2) When I execute this query, the CPU of all 3 CockroachDB nodes goes up to 100%. Is that normal?
Update: here is the EXPLAIN output. I'm not sure how to read it.
> explain select * from transaction t
-> join wallet s on t.sender_id=s.id
-> join wallet r on t.receiver_id=r.id
-> limit 10;
          tree          |        field        |      description
+-----------------------+---------------------+-----------------------+
                        | distributed         | true
                        | vectorized          | false
  limit                 |                     |
   │                    | count               | 10
   └── hash-join        |                     |
        │               | type                | inner
        │               | equality            | (receiver_id) = (id)
        │               | right cols are key  |
        ├── hash-join   |                     |
        │    │          | type                | inner
        │    │          | equality            | (sender_id) = (id)
        │    │          | right cols are key  |
        │    ├── scan   |                     |
        │    │          | table               | transaction@primary
        │    │          | spans               | ALL
        │    └── scan   |                     |
        │               | table               | wallet@primary
        │               | spans               | ALL
        └── scan        |                     |
                        | table               | wallet@primary
                        | spans               | ALL
Upvotes: 3
Views: 908
Reputation: 484
It appears that this is actually due to a difference in query plans between SQL Server and CockroachDB, but one that can be worked around in a few ways.
The root problem is that the transaction table has two foreign key constraints pointing at the wallet table, but both foreign key columns are nullable. This prevents CockroachDB from pushing the 10-row limit through the joins, because the scan on the transaction table may need to produce more than 10 rows for the entire query to produce up to 10 rows.
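For reference, the schema presumably looks something like this (the table and column names are taken from the query; the types and everything else here are just placeholders):

create table wallet (
    id uuid primary key
    -- ... other wallet columns
);

create table transaction (
    id uuid primary key,
    -- both referencing columns are nullable, which is what blocks the limit pushdown
    sender_id uuid references wallet (id),
    receiver_id uuid references wallet (id)
    -- ... other transaction columns
);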
We see this in the query plan:
> explain select * from transaction t
join wallet s on t.sender_id=s.id
join wallet r on t.receiver_id=r.id
limit 10;
                     info
-----------------------------------------------
  distribution: full
  vectorized: true

  • limit
  │ count: 10
  │
  └── • lookup join
      │ table: wallet@primary
      │ equality: (receiver_id) = (id)
      │ equality cols are key
      │
      └── • lookup join
          │ table: wallet@primary
          │ equality: (sender_id) = (id)
          │ equality cols are key
          │
          └── • scan
                estimated row count: 10,000
                table: transaction@primary
                spans: FULL SCAN
Notice that the limit is applied after both joins.
There are two relatively straightforward ways to fix this. First, we could replace the inner joins with left joins. This will allow the limit to be pushed down to the scan on the transaction table, because a left join will never discard transaction rows (unmatched wallets simply come back as NULLs).
> explain select * from transaction t
left join wallet s on t.sender_id=s.id
left join wallet r on t.receiver_id=r.id
limit 10;
                   info
------------------------------------------
  distribution: full
  vectorized: true

  • lookup join (left outer)
  │ table: wallet@primary
  │ equality: (receiver_id) = (id)
  │ equality cols are key
  │
  └── • lookup join (left outer)
      │ table: wallet@primary
      │ equality: (sender_id) = (id)
      │ equality cols are key
      │
      └── • scan
            estimated row count: 10
            table: transaction@primary
            spans: LIMITED SCAN
            limit: 10
The other option is to make the referencing columns in the foreign key constraints NOT NULL. This will also allow the limit to be pushed down to the scan on the transaction table, because with the foreign keys enforced and the columns non-nullable, every transaction row is guaranteed to match exactly one wallet row, so even an inner join will never discard rows.
> alter table transaction alter column sender_id set not null;
ALTER TABLE
> alter table transaction alter column receiver_id set not null;
ALTER TABLE
> explain select * from transaction t
join wallet s on t.sender_id=s.id
join wallet r on t.receiver_id=r.id
limit 10;
                   info
------------------------------------------
  distribution: full
  vectorized: true

  • lookup join
  │ table: wallet@primary
  │ equality: (receiver_id) = (id)
  │ equality cols are key
  │
  └── • lookup join
      │ table: wallet@primary
      │ equality: (sender_id) = (id)
      │ equality cols are key
      │
      └── • scan
            estimated row count: 10
            table: transaction@primary
            spans: LIMITED SCAN
            limit: 10
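As a quick sanity check after the ALTER TABLE statements, SHOW COLUMNS should now report is_nullable as false for both sender_id and receiver_id:

> show columns from transaction;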
Upvotes: 3
Reputation: 583
I believe it's possible your table statistics are outdated. (Though hopefully by this point the statistics have been refreshed automatically and you aren't seeing this problem anymore.) The CockroachDB docs cover table statistics, including how to manually create new statistics.
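If you don't want to wait for automatic statistics collection, you can refresh them by hand; something along these lines should do it (the statistics names tx_stats and wallet_stats are arbitrary placeholders):

> create statistics tx_stats from transaction;
> create statistics wallet_stats from wallet;
> show statistics for table transaction;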
Upvotes: 0