Sam Saffron

Reputation: 131112

has_and_belongs_to_many, avoiding dupes in the join table

I have a pretty simple HABTM set of models

class Tag < ActiveRecord::Base 
   has_and_belongs_to_many :posts
end 

class Post < ActiveRecord::Base 
   has_and_belongs_to_many :tags

   def tags= (tag_list) 
      self.tags.clear 
      tag_list.strip.split(' ').each do |tag|
        self.tags.build(:name => tag) 
      end
   end 
end 

Now it all works alright except that I get a ton of duplicates in the Tags table.

What do I need to do to avoid duplicates (based on name) in the tags table?

Upvotes: 70

Views: 31161

Answers (12)

Matthew Bennett

Reputation: 303

Just add a check in your controller before adding the record. If the association already exists, do nothing; if it doesn't, add it:

u = current_user
a = @article
# Only link the article if the user does not already have it
u.articles << a unless u.articles.exists?(a.id)

More: "4.4.1.14 collection.exists?(...)" http://edgeguides.rubyonrails.org/association_basics.html#scopes-for-has-and-belongs-to-many

Upvotes: 0

Jeremy Lynch

Reputation: 7210

Prevent duplicates in the view only (Lazy solution)

The following does not prevent writing duplicate relationships to the database, it only ensures find methods ignore duplicates.

In Rails 5:

has_and_belongs_to_many :tags, -> { distinct }

Note: Relation#uniq was deprecated in Rails 5 (commit)

In Rails 4

has_and_belongs_to_many :tags, -> { uniq }

Prevent duplicate data from being saved (best solution)

Option 1: Prevent duplicates from the controller:

post.tags << tag unless post.tags.include?(tag)

However, multiple users could attempt post.tags.include?(tag) at the same time, thus this is subject to race conditions. This is discussed here.

For robustness you can also add this to the Post model (post.rb)

def tag=(tag)
  tags << tag unless tags.include?(tag)
end

Option 2: Create a unique index

The most foolproof way of preventing duplicates is to enforce uniqueness at the database layer. This can be achieved by adding a unique index on the join table itself.

rails g migration add_index_to_posts
# migration file
add_index :posts_tags, [:post_id, :tag_id], :unique => true
add_index :posts_tags, :tag_id

Once you have the unique index, attempting to add a duplicate record will raise an ActiveRecord::RecordNotUnique error. Handling this is out of the scope of this question. View this SO question.

rescue_from ActiveRecord::RecordNotUnique, :with => :some_method
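
To illustrate the behaviour described above without a database, here is a minimal plain-Ruby sketch. JoinTable and DuplicateLinkError are hypothetical names: they stand in for the posts_tags table with its unique index and for ActiveRecord::RecordNotUnique. It is a model of the idea under those assumptions, not Rails code.

```ruby
require 'set'

# Hypothetical stand-in for ActiveRecord::RecordNotUnique.
class DuplicateLinkError < StandardError; end

# Hypothetical in-memory model of a join table with a unique
# index on [post_id, tag_id].
class JoinTable
  def initialize
    @rows = Set.new
  end

  # Set#add? returns nil when the pair is already present, which
  # is the point where a real database would reject the insert.
  def insert(post_id, tag_id)
    unless @rows.add?([post_id, tag_id])
      raise DuplicateLinkError, "duplicate (#{post_id}, #{tag_id})"
    end
    self
  end

  def size
    @rows.size
  end
end

# Treat a rejected duplicate as "already linked" instead of failing.
def link(table, post_id, tag_id)
  table.insert(post_id, tag_id)
  :created
rescue DuplicateLinkError
  :already_linked
end

table  = JoinTable.new
first  = link(table, 1, 2)
second = link(table, 1, 2)
```

Here the second call is rejected by the constraint and rescued, so the table still holds a single row for the pair. In a real application the rescue would wrap the association write instead.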

Upvotes: 73

Jose Fuentes Delgado

Reputation: 31

What worked for me:

  1. adding unique index on the join table
  2. override << method in the relation

    has_and_belongs_to_many :groups do
      def << (group)
        group -= self if group.respond_to?(:to_a)
        super group unless include?(group)
      end
    end
    

Upvotes: 3

Javeed

Reputation: 99

This is really old but I thought I'd share my way of doing this.

class Tag < ActiveRecord::Base 
    has_and_belongs_to_many :posts
end 

class Post < ActiveRecord::Base 
    has_and_belongs_to_many :tags
end

In the code where I need to add tags to a post, I do something like:

new_tag = Tag.find_by(name: 'cool')
post.tag_ids = (post.tag_ids + [new_tag.id]).uniq

This has the effect of automatically adding/removing tags as necessary or doing nothing if that's the case.
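
The id arithmetic can be checked in plain Ruby. In this small sketch, ordinary arrays with made-up values stand in for post.tag_ids; nothing here touches ActiveRecord.

```ruby
# Plain array standing in for post.tag_ids (illustrative values).
current_ids = [3, 7, 9]

# Adding an id that is already present leaves the list unchanged.
same = (current_ids + [7]).uniq

# Adding a new id appends it exactly once.
grown = (current_ids + [12]).uniq

# Array#| (set union) produces the same result in one step.
union = current_ids | [7, 12]
```

Note that find_by returns nil when no tag matches, so the snippet above assumes the 'cool' tag already exists.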

Upvotes: 2

dav1dhunt

Reputation: 179

Extract the tag name for security. Check whether or not the tag exists in your tags table, then create it if it doesn't:

name = params[:tag][:name]
@new_tag = Tag.where(name: name).first_or_create

Then check whether it exists within this specific collection, and push it if it doesn't:

@taggable.tags << @new_tag unless @taggable.tags.exists?(@new_tag.id)

Upvotes: 1

cyrilchampier

Reputation: 2248

In Rails 4:

class Post < ActiveRecord::Base 
  has_and_belongs_to_many :tags, -> { uniq }
end

(beware, the -> { uniq } must be directly after the relation name, before other params)

Rails documentation

Upvotes: 20

ajbraus

Reputation: 2999

You should add an index on the tag :name property and then use find_or_create_by in the Tags#create method

docs

Upvotes: 0

spyle

Reputation: 2008

In addition to the suggestions above:

  1. add :uniq to the has_and_belongs_to_many association
  2. add a unique index on the join table

I would do an explicit check to determine if the relationship already exists. For instance:

post = Post.find(1)
tag = Tag.find(2)
post.tags << tag unless post.tags.include?(tag)

Upvotes: 25

Joshua Cheek

Reputation: 31726

Set the uniq option:

class Tag < ActiveRecord::Base 
   has_and_belongs_to_many :posts, :uniq => true
end 

class Post < ActiveRecord::Base 
   has_and_belongs_to_many :tags, :uniq => true
end

Upvotes: 12

Jeff Whitmire

Reputation: 770

I would prefer to adjust the model and create the classes this way:

class Tag < ActiveRecord::Base 
   has_many :taggings
   has_many :posts, :through => :taggings
end 

class Post < ActiveRecord::Base 
   has_many :taggings
   has_many :tags, :through => :taggings
end

class Tagging < ActiveRecord::Base 
   belongs_to :tag
   belongs_to :post
end

Then I would wrap the creation in logic so that an existing Tag is reused if one with that name already exists. I'd probably even put a unique constraint on the tag name to enforce it. That makes it more efficient to search either way, since you can just use the indexes on the join table (to find all posts for a particular tag, and all tags for a particular post).

The only catch is that you can't allow renaming of tags since changing the tag name would affect all uses of that tag. Make the user delete the tag and create a new one instead.

Upvotes: 5

Sam Saffron

Reputation: 131112

I worked around this by creating a before_save filter that fixes stuff up.

class Post < ActiveRecord::Base 
   has_and_belongs_to_many :tags
   before_save :fix_tags

   def tag_list= (tag_list) 
      self.tags.clear 
      tag_list.strip.split(' ').each do |tag|
        self.tags.build(:name => tag) 
      end
   end  

    def fix_tags
      if self.tags.loaded?
        new_tags = [] 
        self.tags.each do |tag|
          if existing = Tag.find_by_name(tag.name) 
            new_tags << existing
          else 
            new_tags << tag
          end   
        end

        self.tags = new_tags 
      end
    end

end

It could be slightly optimised to work in batches with the tags, also it may need some slightly better transactional support.
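
The batching idea could look roughly like the sketch below. It is plain Ruby under stated assumptions: Tag here is a Struct rather than the ActiveRecord model, resolve_tags is a hypothetical helper, and the existing hash stands in for the result of one query that loads every tag whose name appears in the list.

```ruby
# Hypothetical stand-in for the Tag model.
Tag = Struct.new(:id, :name)

# Resolve a space-separated tag list against tags that already
# exist, reusing matches and building new (unsaved, id = nil)
# tags only for names that were not found.
def resolve_tags(tag_list, existing_by_name)
  tag_list.strip.split.uniq.map do |name|
    existing_by_name.fetch(name) { Tag.new(nil, name) }
  end
end

# Pretend these rows came back from a single lookup by name.
existing = {
  'ruby'  => Tag.new(1, 'ruby'),
  'rails' => Tag.new(2, 'rails')
}

resolved = resolve_tags(' ruby habtm ruby rails ', existing)
```

Repeated names in the input collapse to one entry, existing tags are reused as the same objects, and only 'habtm' is built fresh.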

Upvotes: 4

Simone Carletti

Reputation: 176402

You can pass the :uniq option as described in the documentation. Note, however, that the :uniq option doesn't prevent the creation of duplicate relationships; it only ensures accessor/find methods return each record once.

If you want to prevent duplicates in the association table, you should create a unique index and handle the exception. validates_uniqueness_of doesn't work reliably here either, because a second request can write to the database between the moment the first request checks for duplicates and the moment it writes.
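
The timing problem can be made concrete with a deterministic plain-Ruby sketch. The rows array is an illustrative stand-in for a join table without a unique index, and the two "requests" are interleaved by hand rather than with real threads, so the outcome is the same on every run.

```ruby
# Join table without a unique index: a plain list of
# [post_id, tag_id] pairs (illustrative, not ActiveRecord).
rows = []
pair = [1, 2]

# Both requests run their uniqueness check before either writes.
request_a_sees_duplicate = rows.include?(pair)
request_b_sees_duplicate = rows.include?(pair)

# Each request trusts its own (now stale) check and inserts.
rows << pair unless request_a_sees_duplicate
rows << pair unless request_b_sees_duplicate

duplicates = rows.count(pair)
```

Both checks pass, so the pair ends up stored twice. A unique index closes this gap because the database itself rejects the second insert, whatever the earlier checks reported.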

Upvotes: 20
