user1342178
user1342178

Reputation: 123

Is it possible to ORDER BY similar strings in MySQL?

I have a column containing text input from various users around the world and I would like to group them so that strings that are similar to each other are closer together. The column contains strings that are in languages other than English and in addition can contain misspelled words.

I have been looking into N-grams and Levenshtein distance but those seem to require that I have a string in mind to match the rows against.

As an example, I would like

   |Comcast               | 
   |how to play basketball|
   |Walmart               |
   |www.Comcast.net       |
   |Wamlart               |
   |basketball            |

to end up looking like:

   |Comcast               | 
   |www.Comcast.net       |
   |how to play basketball|
   |basketball            |
   |Walmart               |
   |Wamlart               |

Any help would be appreciated. Thank you.

Upvotes: 0

Views: 911

Answers (3)

RolandoMySQLDBA
RolandoMySQLDBA

Reputation: 44343

If all the strings have things to remove like '.net' before displaying the data, you could pull off something like this:

SELECT column,ordercolumn FROM 
(
    SELECT column,REPLACE(column,'www.comcast.net','ComCast') ordercolumn
    FROM mytable
) A
ORDER BY ordercolumn,column;

Upvotes: 0

Marcx
Marcx

Reputation: 6826

Use ORDER BY instead of GROUP BY

SELECT * FROM table ORDER BY column ASC
SELECT * FROM table ORDER BY column DESC

The column will be ordered alphabetically ascending or descending

Upvotes: 0

D'Arcy Rittich
D'Arcy Rittich

Reputation: 171371

You could try ordering by SOUNDEX code.

E.g.,

ORDER BY SOUNDEX(MyColumn)

This will break down on longer strings, however. You will probably have better luck using the Double Metaphone algorithm:

http://www.atomodo.com/code/double-metaphone/metaphone.sql/view

Upvotes: 1

Related Questions