Reputation: 57680
Question are,
1.How can I improve the performance of SELECT
queries in mysql utilizing REGEXP
?
The table looks like
create table `tweets`(
`id` bigint auto_increment,
`tweet` varchar(140),
`time` datetime,
primary key(`id`)
);
Here the following query takes about 0.35 seconds.
select tweet from tweets where tweet regexp '^[abcdef]{1,4}$';
tweet
make it faster? If so, what type of index should I use?InnoDB
, Is there any other table engine that will become beneficial? Upvotes: 0
Views: 1682
Reputation: 4259
If the search you're looking for is at the start of a string, you can use LIKE as a high-level filter then check again with REGEXP
:
select tweet from tweets
where
(
tweet LIKE 'a%' OR
tweet LIKE 'b%' OR
tweet LIKE 'c%' OR
tweet LIKE 'd%' OR
tweet LIKE 'e%'
)
AND LENGTH(tweet) <= 4 -- try taking this line out line too
AND tweet regexp '^[abcdef]{1,4}$';
In spite of being a little convoluted, this should be a lot faster.
Upvotes: 1
Reputation:
Your best bet is to reduce the result set to evaluate against the regular expression before evaluating. Regular expressions are, for all intents and purposes, impossible to index for.
If I had to come up with a way for this, I would examine patterns that are commonly searched against, and mark them in some indexible way at insert time. For example if you use the ^[abcdef]{1,4}$
expression to search against a lot, I'd make a boolean column first4AThruF
and on an insert/ update trigger, update the column to true or false based on whether or not it matched the regular expression. If I indexed the first4AThruF
column, and the column had enough selectivity, I could write the query:
select tweet from tweets where first4AThruF = true;
and this should be pretty zippy.
Other possibilities to consider are full-text queries or LIKE clauses, although in the case mentioned above I don't expect them to work well.
Upvotes: 3