Reputation: 39
I have a table (T1) with ca. 500000 records:
ID1 Relation ID2
4 Rel41 3
5 Rel21 7
13 Rel2 4
What is the efficient way to retrieve the records that the value 4 exist in:
The result should be:
4 Rel41 3
13 Rel2 4
I tried multiple-column index as follows:
create table T1 (
ID1 varchar(5),
Relation varchar(50),
ID2 varchar(5),
PRIMARY KEY (ID1,ID2)
);
and then I used select statement as follows:
select * from T1 WHERE ID1=4 OR ID2=4 ;
This was even slower than using no multiple-column index.
which type of index should be used here?
Upvotes: 1
Views: 51
Reputation: 48357
The database can only use an index which matches the 'anded' predicates which make up a contiguous prefix (or all) of an index. So it cannot use an index on id1, id2 to resolve a predicate on id2.
this is exposed in the explain plan for the query
To get the optimal response for your query, you would need seperate indices on d1 and d2. But even then, the optimizer may struggle unless you convert the query to a union:
Select *
From T1
Where id1=4
Union
Select *
From T1
Where id2=4;
If you have cases where both predicates are matched, then you'll get duplicate rows - and removing them after resolving the records has an additional cost, hence in the absence of other considerations for your index design or data distribution, the optimal solution might be an index on id1 and an index on (id2, id1) along with...
Select *
From T1
Where id1=4
Union
Select *
From T1
Where id2=4
And id1<>4;
Note that the mismatch predicate (id1<>4) might not be resolved before retrieving the row; this would require some experimentation.
If I were looking at building this query, I would also be checking the plan for your original query, and also...
select * from T1 WHERE 4 IN (ID1, ID2)
Upvotes: 0
Reputation: 142296
Do not compare a VARCHAR
to a number. It cannot use any index. If you are storing numbers, use a numeric datatype. If you are storing strings of numbers (eg, zipcodes), quote the value you are comparing to.
Is the pair (id1, id2) unique? If not, it should not be the PRIMARY KEY
.
Assuming you have PRIMARY KEY(id1, id2)
, then add on INDEX(id2)
. This new index is needed for both solutions below.
Plan A: Hope that "Index merge" kicks in. Run EXPLAIN SELECT ...
to see if it does.
Plan B: This works regardless, but will be a little slower than Index merge. I call this "turning OR
into UNION
".
( SELECT * from T1 WHERE ID1=4 )
UNION DISTINCT
( SELECT * FROM T1 WHERE ID2=4 )
That can be sped up slightly if you do not expect dups: UNION ALL
.
Upvotes: 1
Reputation: 3738
I'd create the table this way:
CREATE TABLE `T1` (
`ID1` int(5) NOT NULL,
`Relation` varchar(50) DEFAULT NULL,
`ID2` int(5) NOT NULL,
KEY `ID1` (`ID1`),
KEY `ID2` (`ID2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This creates two separate indexes and should improve the performance. I also changed the data type to integer.
Upvotes: 0
Reputation: 311478
Since the two columns are unrelated, and you can search by one or the other, I'd just create two separate indexes, one for each column:
CREATE INDEX t1_id1_idx ON t1(id1);
CREATE INDEX t1_id2_idx ON t1(id2);
Upvotes: 0