stalwart1014
stalwart1014

Reputation: 471

Get records that have duplicate similar names as a group

Is there a query to get a list of records that have similar names? For example, I have a fruits table as below:

Id | Name 
1  | Banana
2  | Banana 2
3  | Banana
4  | Apple 
5  | Apple

DB::table('fruits')
        ->select('name', 'id', DB::raw('COUNT(*) as count'))
        ->groupBy('name')
        ->havingRaw('COUNT(*) > 1')
        ->get();

This will give me 2 groups: Banana and Apple

For the Apple group, it'll have id 4 and 5

For the Banana group, it'll have id 1 and 3 but 2 isn't included.

My question is, how do I get 2 to be included as well?

Basically a query to get duplicate records that have similar names.

Upvotes: 0

Views: 367

Answers (1)

Techno
Techno

Reputation: 1706

Ok, so based on your feedback and questions, I came up with an mysql query like so:

select a.id, a.name, count(distinct b.id)
from (select name, id from fruits group by name) a
join fruits b on b.name like concat(a.name, '%')
group by a.name

By cross joining a uniquified version of the table onto itself, matching b.name to a.name+'%', you get a relatively good result.

I tested this with a table in my db which contained substrings of other records and the result is 100% covering. This shows results that have no match as well(count = 1).

You could run it like this to only show results that have a match(count > 1):

select a.id, a.name, count(distinct b.id)+count(distinct a.id)
from (select name, id from fruits group by name) a
join fruits b on a.id <> b.id and b.name like concat(a.name, '%')
group by a.name

AFAIK, there is no nice way to convert this query to eloquent. So IMO the best way to perform the query is to do:

DB::select("
    select a.id, a.name, count(distinct b.id)+count(distinct a.id)
    from (select name, id from fruits group by name) a
    join fruits b on a.id <> b.id and b.name like concat(a.name, '%')
    group by a.name
");

Upvotes: 1

Related Questions