Adam Tegen

Reputation: 25887

Efficient database searching for LIKE '%something%'

I'm trying to search through phone numbers for any phone number containing some series of digits.

Obviously the following is going to be slow:

 Select * from customer where phone like '%1234%'

I need the wildcards because users were allowed to enter any data in the database, so a number might have a country code, a leading 1 (as in 1-800), or a trailing extension (which was sometimes separated by just a space).

Note: I've already created 'cleaned' phone numbers by removing all non-digit characters, so I don't have to worry about dashes, spaces, and the like.

Is there any magic to get a search like this to run in reasonable time?

Upvotes: 3

Views: 844

Answers (2)

Will Hartung

Reputation: 118593

Nope.

You could make an index table if you like. It'll be a bit expensive, but perhaps it's worthwhile.

So you could turn a phone number, 2125551212, into a zillion references based on its unique substrings and build an inverted index from that:

1
2
5
12
21
25
51
55
121
125
212
255
512
551
555
1212
1255
2125
2555
5121
5512
5551
12555
21255
25551
51212
55121
55512
125551
212555
255512
551212
555121
1255512
2125551
2555121
5551212
12555121
21255512
25551212
125551212
212555121
2125551212
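
The listing above can be generated mechanically. Here is a minimal sketch in Python; the function name `unique_substrings` is hypothetical and not part of the original answer:

```python
def unique_substrings(digits: str) -> list[str]:
    """Return every distinct substring of `digits`, shortest first."""
    found = {
        digits[start:end]
        for start in range(len(digits))
        for end in range(start + 1, len(digits) + 1)
    }
    # Sort by length, then lexicographically, to mirror the listing above.
    return sorted(found, key=lambda s: (len(s), s))


keys = unique_substrings("2125551212")
print(keys[:8])  # the shortest keys come first
```

A 10-digit number has at most 55 substrings, fewer once duplicates are dropped, so the index grows by a few dozen rows per phone number.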

So, for example:

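-- note: key is a reserved word in some databases (e.g. MySQL) and may need quoting
create table myindex (
    key varchar(10) not null,
    datarowid integer not null references datarows(id)
);
-- index the substring column so lookups by key avoid a full scan
create index i1myindex on myindex(key);
-- one row per (substring, data row); repeat for every substring of each phone number
insert into myindex (key, datarowid)
    select '1255', id from datarows where phone = '2125551212';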

How many substrings you store depends on how deep you want to go.

For example, you could index substrings only up to 4 digits long, and then scan just the rows that match on those 4 digits.

So, for example, if you have "%123456%", you can ask for keys with "1234" and then apply the full expression on the result set.

Like:

select d.* from datarows d join myindex i on i.datarowid = d.id where i.key = '1234' and d.phone like '%123456%';

The index should help you narrow down the lot quite quickly and the db will scan the remainder.
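
To make the 4-deep variant concrete, here is a rough end-to-end sketch using Python's built-in `sqlite3` module. SQLite is only an assumption so the example runs without a server. The table and column names follow the answer above, while `DEPTH`, `keys_for`, `search`, and the sample phone numbers are hypothetical:

```python
import sqlite3

DEPTH = 4  # hypothetical choice: longest substring stored in the index


def keys_for(phone: str, depth: int = DEPTH) -> set[str]:
    """Distinct substrings of `phone` that are at most `depth` digits long."""
    return {
        phone[i:i + n]
        for n in range(1, depth + 1)
        for i in range(len(phone) - n + 1)
    }


def search(conn: sqlite3.Connection, digits: str, depth: int = DEPTH) -> list[str]:
    """Find phones containing `digits` via an indexed key lookup plus LIKE."""
    rows = conn.execute(
        """
        SELECT d.phone
        FROM myindex AS i
        JOIN datarows AS d ON d.id = i.datarowid
        WHERE i.key = ?          -- indexed equality narrows the candidates
          AND d.phone LIKE ?     -- full pattern checked only on those rows
        ORDER BY d.id
        """,
        (digits[:depth], f"%{digits}%"),
    ).fetchall()
    return [phone for (phone,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE datarows (id INTEGER PRIMARY KEY, phone TEXT NOT NULL);
    CREATE TABLE myindex (
        key TEXT NOT NULL,
        datarowid INTEGER NOT NULL REFERENCES datarows(id)
    );
    CREATE INDEX i1myindex ON myindex(key);
    """
)

# Made-up sample data: store each phone, then one index row per distinct key.
for phone in ("2125551212", "18005551234", "4155550199"):
    rowid = conn.execute(
        "INSERT INTO datarows (phone) VALUES (?)", (phone,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO myindex (key, datarowid) VALUES (?, ?)",
        [(key, rowid) for key in keys_for(phone)],
    )

print(search(conn, "555123"))
```

Search terms shorter than `DEPTH` are resolved by the key lookup alone, because every substring up to that length is stored. The index rows have to be regenerated whenever a phone number is inserted or changed.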

Obviously, you'll be generating some extra data here, but if you query a lot, you could gain some performance.

Upvotes: 1

Joshua

Reputation: 5494

If you're using MySQL, you're looking for the Full Text Search functionality: http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html

It specifically optimizes queries like the one you have listed, and is pretty darn fast once set up. You'll need your data in MySQL, and as of MySQL 5.1 it must be in a MyISAM table (not InnoDB or another engine; InnoDB gained FULLTEXT support in MySQL 5.6).

I've used it in production and it works pretty well.

Upvotes: 2
