Reputation: 914
I'm trying to build a rake utility that will update my database every so often.
This is the code I have so far:
namespace :utils do

  # utils:update_ip
  # Downloads the file from <url> to the temp folder, then unzips it in <file_path>
  # and updates the database.
  desc "Update ip-to-country database"
  task :update_ip => :environment do

    require 'open-uri'
    require 'zip/zipfilesystem'
    require 'csv'

    file_name = "ip-to-country.csv"
    file_path = "#{RAILS_ROOT}/db/" + file_name
    url       = 'http://ip-to-country.webhosting.info/downloads/ip-to-country.csv.zip'

    # Check the last time we updated the database.
    mod_time = ''
    mod_time = File.new(file_path).mtime.httpdate if File.exists? file_path

    begin
      puts 'Downloading update...'
      # Send a conditional GET to the server.
      zipped_file = open(url, {'If-Modified-Since' => mod_time})
    rescue OpenURI::HTTPError => the_error
      if the_error.io.status[0] == '304'
        puts 'Nothing to update.'
      else
        puts 'HTTPError: ' + the_error.message
      end
    else # File was downloaded without error.
      Rails.logger.info 'ip-to-country: Remote database was last updated: ' + zipped_file.meta['last-modified']
      delay = Time.now - zipped_file.last_modified
      Rails.logger.info "ip-to-country: Database was outdated for: #{delay} seconds (#{delay / 60 / 60 / 24} days)"

      puts 'Unzipping...'
      File.delete(file_path) if File.exists? file_path
      Zip::ZipFile.open(zipped_file.path) do |zipfile|
        zipfile.extract(file_name, file_path)
      end

      Iptocs.delete_all

      puts "Importing new database..."
      # TODO: way, way too heavy; find a better solution.
      CSV.open(file_path, 'r') do |row|
        ip = Iptocs.new(:ip_from       => row.shift,
                        :ip_to         => row.shift,
                        :country_code2 => row.shift,
                        :country_code3 => row.shift,
                        :country_name  => row.shift)
        ip.save
      end # CSV

      puts "Complete."
    end # begin-rescue
  end # task
end # namespace
The problem I'm having is that it takes a few minutes to insert the 100,000-plus entries. I'd like to find a more efficient way of updating my database. Ideally this would remain independent of the database type, but if not, my production server will be running MySQL.
Thank you for any insight.
Upvotes: 5
Views: 11879
Reputation: 16732
I'm currently experimenting with activerecord-import, which sounds very promising:
https://github.com/zdennis/activerecord-import
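For the task in the question, the usage would look roughly like this (a sketch, assuming the Iptocs model and CSV layout from the question; :validate => false skips per-row validations for extra speed):

columns = [:ip_from, :ip_to, :country_code2, :country_code3, :country_name]

# Collect all rows first, then hand them to the gem in one call.
values = []
CSV.open(file_path, 'r') { |row| values << row }

# activerecord-import builds a single multi-row INSERT instead of
# issuing 100,000+ individual INSERT statements.
Iptocs.import columns, values, :validate => false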
Upvotes: 1
Reputation: 64363
Have you tried using AR Extensions for bulk import? You get impressive performance improvements when you are inserting thousands of rows into the DB. Visit their website for more details.
Refer to these examples for more information
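To give a flavour of the API (a sketch; the sample values are illustrative only, and the import signature is assumed to match the ar-extensions docs):

columns = [:ip_from, :ip_to, :country_code2, :country_code3, :country_name]
values  = [
  ['50331648', '68257567', 'US', 'USA', 'United States'],
  ['68257568', '68257599', 'CA', 'CAN', 'Canada']
]

# ar-extensions adds an import class method to ActiveRecord models
# that inserts many rows in a single statement.
Iptocs.import columns, values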
Upvotes: 9
Reputation: 29432
As Larry says, use your DB-specific import utilities if the file comes in the format you want. However, if you need to manipulate the data before inserting, you can generate a single INSERT query with data for many rows, which is faster than using a separate query for each row (as ActiveRecord will do). For example:
INSERT INTO iptocs (ip_from, ip_to, country_code) VALUES
('xxx', 'xxx', 'xxx'),
('yyy', 'yyy', 'yyy'),
...;
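Building that statement from the CSV in the rake task could look like this (a sketch; conn.quote escapes the CSV values safely, and for 100,000+ rows you would likely need to chunk the VALUES list, e.g. with each_slice, to stay under MySQL's max_allowed_packet):

conn = ActiveRecord::Base.connection

values = []
CSV.open(file_path, 'r') do |row|
  values << "(#{row.map { |v| conn.quote(v) }.join(', ')})"
end

# One round trip to the database instead of one per row.
conn.execute("INSERT INTO iptocs (ip_from, ip_to, country_code2, " +
             "country_code3, country_name) VALUES #{values.join(', ')}")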
Upvotes: 0
Reputation: 49104
Use the database-level utilities for high speed, Luke!
Unfortunately, they're DB-specific, but they are fast. For MySQL, see http://dev.mysql.com/doc/refman/5.1/en/load-data.html
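From the rake task, that might look like the following (a sketch; it assumes the CSV has already been unzipped to file_path as in the question, and that LOCAL infile is permitted by both your MySQL server and client):

# Hand the whole file to MySQL in a single statement.
ActiveRecord::Base.connection.execute(<<-SQL)
  LOAD DATA LOCAL INFILE '#{file_path}'
  INTO TABLE iptocs
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  (ip_from, ip_to, country_code2, country_code3, country_name)
SQL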
Upvotes: 3
Reputation: 29135
You could generate a text file with all the INSERTs you need and then execute:
mysql -u user -p db_name < mytextfile.txt
Not sure if this will be any faster, but it's worth a try...
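Generating that text file from the unzipped CSV might look like this (a sketch; the inserts.sql file name is made up, and conn.quote handles escaping):

conn = ActiveRecord::Base.connection

File.open("#{RAILS_ROOT}/db/inserts.sql", 'w') do |sql|
  CSV.open(file_path, 'r') do |row|
    quoted = row.map { |v| conn.quote(v) }.join(', ')
    sql.puts "INSERT INTO iptocs (ip_from, ip_to, country_code2, " +
             "country_code3, country_name) VALUES (#{quoted});"
  end
end

The resulting file can then be fed to the mysql command shown above.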
Upvotes: 1