Reputation: 10422
I have below query:
SELECT "users".*
FROM "users"
INNER JOIN "users_roles"
ON "users_roles"."user_id" = "users"."id"
INNER JOIN "roles"
ON "roles"."id" = "users_roles"."role_id"
LEFT JOIN events_users
ON events_users.user_id = users.id
LEFT JOIN events
ON events.id = events_users.event_id
LEFT JOIN booths
ON booths.user_id = users.id
GROUP BY
users.id,
roles.id,
events.id,
booths.id
ORDER BY
id ASC
I want to be able to remove duplicates but it seems like it's generating multiple duplicate roles, booths and events if it appears more than once.
Below is the Rails active record command that generates above query in SQL:
users = User.
joins(:roles).
joins("LEFT JOIN events_users ON events_users.user_id = users.id LEFT JOIN events ON events.id = events_users.event_id").
joins("LEFT JOIN booths ON booths.user_id = users.id").
group("users.id, roles.id, events.id").
order("#{sort_column} #{sort_direction}")
I've also tried below with no luck:
users = User.
joins(:roles).
joins("LEFT JOIN events_users ON events_users.user_id = users.id LEFT JOIN events ON events.id = events_users.event_id").
joins("LEFT JOIN booths ON booths.user_id = users.id").
group("users.id, roles.id, events.id").
order("#{sort_column} #{sort_direction}")
select("distinct on(users.id, roles.id, events.id, booths.id) users.*")
Is there a way to remove all of duplicates in the result set?
Upvotes: 2
Views: 1950
Reputation: 1436
I'm new to ruby and not entirely comfortable with manipulating databases and I prefer a pure ruby solution. I had an Assignment
join table with :listing_id
and :school_id
and my code resulted in hundreds of thousands of duplicate entries, so school.listings
resulted in many duplicate listings. First I fixed the code problem by using Assignment.find_or_create_by
instead of Assignment.create
, then I used the below rake task to remove the duplicate entries. It took 30 minutes to remove the duplicates, so certainly there is a better way to do this, but I was happy with this result because it worked.
desc "remove duplicate relationships in Assignment"
task :clean_assignment => :environment do
listing_ids = Assignment.pluck(:listing_id)
listing_ids = listing_ids.uniq
listing_ids.each do |listing_id|
count = 0
assignments = Assignment.where(:listing_id => listing_id)
school_ids = []
assignments.each do |assign|
if school_ids.include?(assign.school_id)
assign.destroy
count += 1
else
school_ids << assign.school_id
end
end
if count > 0
p "#{count} duplicates deleted from #{listing_id}"
end
end
end
After the rake task was finished, I checked there were no duplicates:
a = Assignment.pluck(:listing_id, :school_id)
b = a.uniq
irb(main):023:0> a.count
=> 191350
irb(main):024:0> b.count
=> 191350
The duplicates are removed.
Upvotes: 0
Reputation: 2049
Try using the DISTINCT
clause on your select statement. It's almost always better if you leave that kind of work to SQL.
SELECT DISTINCT "users".*
FROM "users"
INNER JOIN "users_roles"
ON "users_roles"."user_id" = "users"."id"
INNER JOIN "roles"
ON "roles"."id" = "users_roles"."role_id"
LEFT JOIN events_users
ON events_users.user_id = users.id
LEFT JOIN events
ON events.id = events_users.event_id
LEFT JOIN booths
ON booths.user_id = users.id
GROUP BY users.id,
roles.id,
events.id,
booths.id
ORDER BY id ASC
Upvotes: 2
Reputation: 4093
I don't really know about the SQL solution, but I think a pure Ruby solution would be to use the uniq
method of an Array
This is the doc : http://www.ruby-doc.org/core-2.1.1/Array.html#method-i-uniq
This method allows you to remove all duplicates of an array. For example :
[User.first, User.first].uniq # => [User.first]
Hope it will help you !
Upvotes: 0
Reputation: 3128
Try this query in postgresql
to remove all duplicate rows:
delete from table1 where ctid not in
(select max(t1.id) from
(select ctid id,* from table1)t1
group by t1.name,t1.family);
for deleting duplicate rows you need a unique value in rows postgresql
give us ctid
as a unique value for each row in tables there for we can use ctid
to remove all duplicate rows.
SELECT DISTINCT * FROM
(SELECT "users".*
FROM "users"
INNER JOIN "users_roles" ON "users_roles"."user_id" = "users"."id"
INNER JOIN "roles" ON "roles"."id" = "users_roles"."role_id"
LEFT JOIN events_users ON events_users.user_id = users.id
LEFT JOIN events ON events.id = events_users.event_id
LEFT JOIN booths ON booths.user_id = users.id
GROUP BY users.id, roles.id, events.id, booths.id
ORDER BY id asc)t1;
Upvotes: 1