Reputation: 164769
I have two classes with a has_and_belongs_to_many
relationship.
class Customer < ApplicationRecord
has_and_belongs_to_many :segments
def rematch_segments
self.segments = Segment.customer_segments(self)
end
end
class Segment < ApplicationRecord
has_and_belongs_to_many :customers
class << self
def customer_segments(customer)
...returns a collection of Segments...
end
end
end
Calling rematch_segments
results in an insert for each segment.
(0.2ms) BEGIN
Customer::HABTM_Segments Create (0.8ms) INSERT INTO "customers_segments" ("customer_id", "segment_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) [["customer_id", 1], ["segment_id", 1], ["created_at", "2020-03-24 23:42:52.985400"], ["updated_at", "2020-03-24 23:42:52.985400"]]
Customer::HABTM_Segments Create (0.2ms) INSERT INTO "customers_segments" ("customer_id", "segment_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) [["customer_id", 1], ["segment_id", 2], ["created_at", "2020-03-24 23:42:52.987537"], ["updated_at", "2020-03-24 23:42:52.987537"]]
Customer::HABTM_Segments Create (0.2ms) INSERT INTO "customers_segments" ("customer_id", "segment_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4) [["customer_id", 1], ["segment_id", 3], ["created_at", "2020-03-24 23:42:52.988610"], ["updated_at", "2020-03-24 23:42:52.988610"]]
(0.3ms) COMMIT
How can I instead do a single bulk insert?
Upvotes: 2
Views: 2221
Reputation: 6063
You can make the association visible by creating a class for your join table:
class Customer < ApplicationRecord
has_many :segments, through: :customer_segments
has_many :customer_segments
end
class Segment < ApplicationRecord
has_many :customers, through: :customer_segments
has_many :customer_segments
end
class CustomerSegment < ApplicationRecord
belongs_to :customer
belongs_to :segment
end
This way you can use Rails 6's insert_all()
(or the activerecord-import
gem) to bulk insert the join table directly:
CustomerSegment.insert_all([
{ customer_id: 1, segment_id: 1 },
{ customer_id: 1, segment_id: 2 },
…
])
Upvotes: 4
Reputation: 4378
Rails 6 now supports bulk insert feature which can be used to insert multiple records in database in a single query. There is a insert_all
method which accepts an array of hash representing each row to be inserted in the database. By default it skips duplicate rows
Inserts multiple records into the database in a single SQL INSERT statement. It does not instantiate any models nor does it trigger Active Record callbacks or validations. Though passed values go through Active Record's type casting and serialization.
The attributes parameter is an Array of Hashes. Every Hash determines the attributes for a single row and must have the same keys.
Rows are considered to be unique by every unique index on the table. Any duplicate rows are skipped. Override with :unique_by (see below).
Assuming you've a CustomerSegment
class representing your join model.
You can do something like this:
CustomerSegment.insert_all([{ customer_id: 54, segment_id: 2 },{ customer_id: 143, segment_id: 222 }...])
Upvotes: 2
Reputation: 3535
If you are focusing on performance, I would recommend using raw SQL rather than ORM which can have performance drawbacks relatively.
To simplify code, ['created_at', 'updated_at'] were removed.
You may consider building INSERT QUERY like the following.
INSERT INTO customers_segments (customer_id,segment_id) VALUES (1,1),(1,2),(1,3) ...
According to records size, you may need to set batch size. Usually, it should be fine to manage thousands records without batch size.
segments = Segment.find_by_attr(:attr_val) #please do not forget indexing of the attr in DB.
customer = Customer.create
stmt = segments.map {|seg| "(#{customer.id},#{seg.id})"}.join(",")
ActiveRecord::Base.connection.execute("INSERT INTO customers_segments (customer_id, segment_id) VALUES #{stmt}")
class Customer < ApplicationRecord
...
def batch_rematch(attr_array)
attr_val = attr_array.map{|a| %Q{'#{a}'} }.uniq.join(",")
self.connection.execute(%Q{insert into customers_segments (customer_id,segment_id)
select distinct customers.id,segments.id
from customers,segments
where customers.id = #{self.id}
and segments.<attr> in (#{attr_val})})
end
end
Upvotes: 0