davidjhp
davidjhp

Reputation: 8016

In MySQL how to write SQL to search for words in a field?

create table tbl (
  id int,
  comment varchar(255),
  primary key (id)
);

insert into tbl (id, comment) values ('1', 'dumb,');
insert into tbl (id, comment) values ('2', 'duuumb,');
insert into tbl (id, comment) values ('3', 'dummb');
insert into tbl (id, comment) values ('4', 'duummb');
insert into tbl (id, comment) values ('5', 'very dumb person');

select comment, soundex(comment) 
from tbl;

Result:

+------------------+------------------+
| comment          | soundex(comment) |
+------------------+------------------+
| dumb,            | D510             |
| duuumb,          | D510             |
| dummb            | D510             |
| duummb           | D510             |
| very dumb person | V6351625         |
+------------------+------------------+

I want to find all rows containing 'dumb', including all typos and variations, anywhere in the field.

select comment 
from tbl
where soundex(comment) like '%D510%'

This fails to get the final row #5, how can I also get that row? If there is a better solution than soundex() that would be fine.

Upvotes: 2

Views: 87

Answers (3)

Steve
Steve

Reputation: 388

Assuming you want the whole field and not just the matching element then this should work as soundex will never find an element in the middle of an other element;

select comment from tbl where (soundex(comment) like '%D510%' or comment like '%d%mb')

Edit Changed like for cases where the U is something else ie Damb which gives a soundex of D510

Upvotes: 0

user4018366
user4018366

Reputation:

Can you try with MySQL REGEXP? Is a good solution to find a specific word into text.

You can use [[:<:]] and [[:>:]] as word boundaries:

SELECT comment FROM tbl WHERE comment REGEXP '[[:<:]]dumb[[:>:]]'

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269803

This will work for your particular example:

select comment 
from tbl
where soundex(comment) like '%D510%' or comment like '%dumb%';

It won't find misspellings in the comment.

EDIT:

You could do something like this:

select comment
from tbl
where soundex(comment) = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 2), -1)  = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 3), -1)  = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 4), -1)  = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 5), -1)  = soundex('dumb');

A bit brute force.

The need to do this suggests that you should consider a full text index.

Upvotes: 4

Related Questions