Reputation: 7582
Suppose we have a big MySQL table with fewer than 10 million rows.
If I want to select all the rows, obviously a full table scan works fine:
select * from table_name;
But how can I make it run in parallel? I found that the solution in Sqoop is to split the query:
select * from table_name where id >= 1 and id < 10000;
select * from table_name where id >= 10000 and id < 20000;
select * from table_name where id >= 20000 and id < 30000;
...
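For illustration, here is roughly how I imagine firing those split queries in parallel from a client. This is only a minimal sketch, assuming the mysql-connector-python driver; the host/user/password/database, the 10,000-row chunk size, and the 10 million upper bound are placeholders, not values from my real setup.

# Minimal sketch: run the id-range splits concurrently from Python.
from concurrent.futures import ThreadPoolExecutor

import mysql.connector

def fetch_range(lo, hi):
    # Each worker opens its own connection; a MySQL connection must not be
    # shared between threads.
    conn = mysql.connector.connect(
        host="localhost", user="user", password="pass", database="mydb"
    )
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT * FROM table_name WHERE id >= %s AND id < %s", (lo, hi)
        )
        return cur.fetchall()
    finally:
        conn.close()

chunk = 10_000
ranges = [(lo, lo + chunk) for lo in range(1, 10_000_000, chunk)]
with ThreadPoolExecutor(max_workers=4) as pool:
    for rows in pool.map(lambda r: fetch_range(*r), ranges):
        pass  # process each chunk of rows here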
The problem is that if the id range is big, MySQL may treat it as a full table scan.
Update1: Slow query log
# Query_time: 600.632844 Lock_time: 0.000071 Rows_sent: 624 Rows_examined: 236584
SELECT `id`, ... FROM `table_name` WHERE ( `id` >= 647121 ) AND ( `id` <= 765101 );
Update2: Explain
+----+-------------+------------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table            | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+------------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | table_name       | range | PRIMARY       | PRIMARY | 4       | NULL |    1 | Using where |
+----+-------------+------------------+-------+---------------+---------+---------+------+------+-------------+
Update3: Mysql Version
+------------+
| version() |
+------------+
| 5.1.46-log |
+------------+
And if we send 3 queries like that at the same time, the query time on the server becomes so long that we cannot bear it.
So, is it possible to run the split queries in parallel? If not, why does Sqoop do it this way? Thanks.
Upvotes: 0
Views: 529
Reputation: 6879
It looks like it is not using any keys. Are you using a rather old MySQL version?
The EXPLAIN should be similar to this:
+----+-------------+------------+-------+---------------+---------+---------+------+------+-----------------------+
| id | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows | Extra                 |
+----+-------------+------------+-------+---------------+---------+---------+------+------+-----------------------+
|  1 | SIMPLE      | table_name | range | PRIMARY       | PRIMARY | 4       | NULL | 5926 | Using index condition |
+----+-------------+------------+-------+---------------+---------+---------+------+------+-----------------------+
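If you want to check what the optimizer does from a client, here is a minimal sketch along the same lines (assuming the mysql-connector-python driver; the connection details are placeholders) that runs EXPLAIN on one of the split ranges:

import mysql.connector

# Minimal sketch: run EXPLAIN for one id-range split and print the plan rows.
# Connection details are placeholders.
conn = mysql.connector.connect(
    host="localhost", user="user", password="pass", database="mydb"
)
cur = conn.cursor()
cur.execute(
    "EXPLAIN SELECT id FROM table_name WHERE id >= %s AND id < %s",
    (1, 10_000),
)
for row in cur.fetchall():
    print(row)  # expect type='range' and key='PRIMARY' when the index is used
conn.close()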
Upvotes: 0