Reputation: 471
Is there a query to get a list of records that have similar names? For example, I have a fruits table as below:
Id | Name
1 | Banana
2 | Banana 2
3 | Banana
4 | Apple
5 | Apple
DB::table('fruits')
->select('name', 'id', DB::raw('COUNT(*) as count'))
->groupBy('name')
->havingRaw('COUNT(*) > 1')
->get();
This will give me 2 groups: Banana and Apple
For the Apple group, it'll have id 4 and 5
For the Banana group, it'll have id 1 and 3 but 2 isn't included.
My question is, how do I get 2 to be included as well?
Basically a query to get duplicate records that have similar names.
Upvotes: 0
Views: 367
Reputation: 1706
Ok, so based on your feedback and questions, I came up with an mysql query like so:
select a.id, a.name, count(distinct b.id)
from (select name, id from fruits group by name) a
join fruits b on b.name like concat(a.name, '%')
group by a.name
By cross joining a uniquified version of the table onto itself, matching b.name to a.name+'%', you get a relatively good result.
I tested this with a table in my db which contained substrings of other records and the result is 100% covering. This shows results that have no match as well(count = 1).
You could run it like this to only show results that have a match(count > 1):
select a.id, a.name, count(distinct b.id)+count(distinct a.id)
from (select name, id from fruits group by name) a
join fruits b on a.id <> b.id and b.name like concat(a.name, '%')
group by a.name
AFAIK, there is no nice way to convert this query to eloquent. So IMO the best way to perform the query is to do:
DB::select("
select a.id, a.name, count(distinct b.id)+count(distinct a.id)
from (select name, id from fruits group by name) a
join fruits b on a.id <> b.id and b.name like concat(a.name, '%')
group by a.name
");
Upvotes: 1