animekun

Reputation: 1869

Rails rake task: how to parse CSV with commas in fields

I have a CSV file that contains floating-point numbers written with commas instead of dots, like "34,21", and I need to parse it in my rake task. I have already tried some solutions, such as this one: Ruby on Rails - Import Data from a CSV file

But none of them seems to work properly; they just parse the value as two fields (34 and 21). Is there a way to fix this using the built-in CSV library?

I have already tried this:

require 'csv'

task :drugimport, [:filename, :model] => :environment do |task, args|
  # Read the file row by row; header names become symbol keys.
  CSV.foreach(args[:filename], encoding: "UTF-8", headers: true,
              header_converters: :symbol, converters: :all) do |row|
    Moulding.create!(row.to_hash)
  end
end

And this one:

require 'smarter_csv'
options = {}
SmarterCSV.process('input_file.csv', options) do |chunk|
   chunk.each do |data_hash|
       Moulding.create!( data_hash )
   end
end

They both look nice and elegant, except that they parse fields containing commas incorrectly.

Here are my rows (sorry, they are in Russian): http://pastebin.com/RbC4SVzz I didn't change anything in the data, so I put it on pastebin, which I guess is more useful than pasting it here.

Here is my import log: http://pastebin.com/rzC0h9rS

Upvotes: 1

Views: 1598

Answers (2)

Tim

Reputation: 1376

Right, so from what I can see, you are, as you noticed yourself, not passing any options to the parser. When you do not specify row_sep or any other option, smarter_csv will use the system newline separator, which is "\r\n" on Windows machines and "\n" on Unix machines.

That being said, try the following...

require 'smarter_csv'
SmarterCSV.process('input_file.csv', :row_sep => :auto, :col_sep => ",") do |chunk|
  chunk.each do |data_hash|
    Moulding.create!( data_hash )
  end
end

I agree with Swards. What I have done assumes quite a lot of things. A glance at some CSV data could be useful.

Upvotes: 1

OpenGears

Reputation: 142

In my opinion, you have three possible roads you could go:

1) work with the "bad" input and try to find a workaround

You could process the file line by line and try

line.split(" ,")

which assumes that there is a blank space before each separating comma. Another approach would be to identify the numerical values with a regex and replace the comma character in them (this might be easier to fix in the source data!)
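To illustrate the regex idea, here is a minimal sketch. It assumes the decimal numbers are wrapped in double quotes in the raw line (I can't confirm that from the question), and both the helper name normalize_decimal_commas and the sample line are made up for the example:

```ruby
# Hypothetical helper: rewrites quoted decimal-comma numbers such as "34,21"
# into unquoted 34.21. Assumes these numbers are always double-quoted.
def normalize_decimal_commas(line)
  line.gsub(/"(-?\d+),(\d+)"/, '\1.\2')
end

# Made-up sample line, not taken from the real data.
line = %q(Oak frame,"34,21","7,5",12)

# Once the decimal commas are gone, a plain split on "," is safe.
fields = normalize_decimal_commas(line).split(",")
```

If the numbers are not quoted, the line is genuinely ambiguous and no regex can tell a decimal comma from a separator.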

2) try to export the CSV with another separator

This depends on where the data comes from. If you can re-export the data, that may be the easiest solution. In this case, of course, your data would technically not be CSV anymore but, for example, SSV (semicolon-separated values).
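If a semicolon-separated export is possible, the built-in CSV library can read it via its col_sep option, and a custom converter can turn the decimal commas into floats. A rough sketch follows; the sample data, the column names and the decimal_comma lambda are all assumptions for illustration:

```ruby
require 'csv'

# Made-up sample: semicolon-separated, decimal commas, one header row.
sample = <<~TEXT
  name;width;price
  Oak frame;34,21;7,5
  Pine frame;12;100,00
TEXT

# Hypothetical converter: turns strings like "34,21" into the Float 34.21
# and leaves every other field unchanged.
decimal_comma = lambda do |field|
  if field.is_a?(String) && field.match?(/\A-?\d+,\d+\z/)
    field.tr(",", ".").to_f
  else
    field
  end
end

# col_sep tells CSV to split on ";" so the decimal commas stay inside fields.
table = CSV.parse(sample, col_sep: ";", headers: true,
                  header_converters: :symbol,
                  converters: [decimal_comma, :numeric])
rows = table.map(&:to_h)
```

In a rake task the same options should work with CSV.foreach, and each resulting hash could be passed to Moulding.create! as in the question.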

3) try other CSV parsers

I can definitely suggest you take a look at other CSV parsers, such as FasterCSV and others (see the list of CSV parsers at Ruby Toolbox).

I hope this advice is helpful; sample CSV data would definitely make it easier to help you.

Upvotes: 1
