Reputation: 196
I'm using a rake task requiring Ruby's CSV class to import rows of property data, and would like to manipulate this data before inserting it into the database.
CSV
PID,City,Address,Sold Date,Sold Price
100-200-300,Vancouver,510 1700 Nelson Street,01/01/2017,"$500,000 "
200-300-400,Vancouver,304 68 Smithe Street,02/02/2017,"600,000"
Residence Table (shortened for brevity)
+-----+------+------+---------------+-------------+
| pid | city | unit | street_number | street_name |
+-----+------+------+---------------+-------------+
| | | | | |
+-----+------+------+---------------+-------------+
Rake Task (what I have so far)
require 'csv'
desc 'Upload CSV data into database'
task residences: :environment do
residences = Array.new
counter = 0
csv_file = "#{Rails.root}/public/spreadsheets/unformatted-addresses.csv"
CSV.foreach(csv_file, headers: true, header_converters: :symbol, converters: :all, skip_blanks: true, encoding: 'UTF-8') do |row|
#is this the right place to create the hash?
residences << row.to_hash
#is this the right way to format each cell?
residences[counter][:pid]
residences[counter][:city].downcase
residences[counter][:address].downcase.split(" ")
residences[counter][:sold_date]
residences[counter][:sold_price].delete('$ ,').to_i
Residence.create( #what to put here? )
counter += 1
end
puts "Imported #{counter} rows."
end
What I would like to achieve is to individually format the cell contents then insert into appropriate columns, for example the address format should be:
"Unit", "Street Number", "Street Name"
Would greatly appreciate any help with this!
Upvotes: 0
Views: 119
Reputation: 196
The final result is below.
require 'csv'
require 'time'
namespace :csv do
desc 'Upload CSV data into database'
task residences: :environment do
residences = []
counter = 0
csv_file = "#{Rails.root}/public/spreadsheets/unformatted-addresses.csv"
address_regex = /^(\d+[a-z]?)+\s+(\d+)+\s+(.+(?=\W))+\s+(.*)/i
CSV.foreach(csv_file, headers: true, header_converters: :symbol, converters: :all, skip_blanks: true, encoding: 'UTF-8') do |row|
address = address_regex.match(row[:address])
unit = address[1]
street_number = address[2]
street_name = address[3]
street_type = address[4]
pid = row[:pid].strip
city = row[:city].strip.downcase
date = Date.parse(row[:sold_date])
sold_date = date.strftime("%m-%d-%Y")
sold_price = row[:sold_price].strip.delete('$ ,').to_i
puts "#{address}, #{pid}, #{city}, #{sold_date}, #{sold_price}"
Residence.create(
pid: pid,
city: city,
unit: unit,
street_number: street_number,
street_name: street_name,
street_type: street_type,
sold_date: sold_date,
sold_price: sold_price
)
counter += 1
end
puts "Imported #{counter} rows."
end
end
Upvotes: 1
Reputation: 611
Adding onto my previous answer, you should be able to do something like this:
require 'csv'
address_regex = /(^\d+[a-z]?)+\s+(\d+)+\s+(.*)/i
desc 'Upload CSV data into database'
task residences: :environment do
counter = 0
csv_file = "#{Rails.root}/public/spreadsheets/unformatted-addresses.csv"
CSV.foreach(csv_file, headers: true, header_converters: :symbol, converters: :all, skip_blanks: true, encoding: 'UTF-8') do |row|
address = address_regex.match(row[:address])
Residence.create(
pid: row[:pid],
city: row[:city],
unit: address[1],
street_number: address[2],
street_name: address[3]
)
counter += 1
end
puts "Imported #{counter} rows."
end
Upvotes: 2
Reputation: 611
This should work for what you're trying to do, assuming that every address is going to have a unit (it will also include any units with a character such as '12A':
address_regex = /(^\d+[a-z]?)+\s+(\d+)+\s+(.*)/i
matches = address_regex.match(residences[counter][:address])
unit = matches[1]
street_number = matches[2]
street_name = matches[3]
Note this isn't the most efficient code, I'm just aiming for clarity
Upvotes: 0