Reputation: 52651
I'm using FasterCSV on a Ruby on Rails application and currently it throws an Exception if the file is invalid.
I've looked over the FasterCSV docs, and it seems that if I use FasterCSV::parse with a block, it reads the file one line at a time, without allocating too much memory. It raises a FasterCSV::MalformedCSVError exception if there is any kind of error in the file.
I've implemented a custom solution, but I'm not sure it's the best possible one (see my answer below). I'd be interested in hearing about alternatives.
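For reference, here is a minimal sketch of the block-form behaviour described above. It assumes Ruby 1.9 or later, where FasterCSV became the standard library's CSV (so the exception is CSV::MalformedCSVError rather than FasterCSV::MalformedCSVError); the sample strings are made up for illustration.

```ruby
require 'csv' # assumption: Ruby 1.9+, where FasterCSV lives on as stdlib CSV

valid   = "a,b\n1,2\n"
invalid = "a,b\n1,\"2\n" # the quote on the second line is never closed

# Block form: each row is yielded to the block as it is parsed.
rows = []
CSV.parse(valid) { |row| rows << row }

# Malformed input raises CSV::MalformedCSVError
# (FasterCSV::MalformedCSVError when using the old gem).
error_raised =
  begin
    CSV.parse(invalid) { |row| }
    false
  rescue CSV::MalformedCSVError
    true
  end
```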
Upvotes: 3
Views: 3001
Reputation: 52651
I ran some tests yesterday and it turns out that my solution didn't quite work; I kept getting empty arrays on valid CSVs after implementing the first is_valid?. I'm not sure whether it's a FasterCSV caching issue or something in my code, and I don't know if it's related to my test setup, but I decided to implement a safe_parse instead:
    # /lib/faster_csv_safe_parse.rb
    class FasterCSV
      def self.safe_parse(file, options = {})
        begin
          FasterCSV.parse(file, options)
        rescue FasterCSV::MalformedCSVError
          nil
        end
      end
    end
This will return a parsed array if the file is valid, or nil otherwise. I could then implement my validations as follows:
    # /models/csv_importer.rb
    class CsvImporter
      include ActiveRecord::Validations
      validates_presence_of :file
      validate :check_file_format

      # Only the writer is generated; the reader below memoizes the parse
      attr_writer :csv_data

      def csv_data
        @csv_data ||= FasterCSV.safe_parse(file)
      end

      ...

      private

      def check_file_format
        errors.add :file, "Malformed CSV! Please check syntax" if csv_data.nil?
      end
    end
I guess it would be possible to implement a safe_parse that accepts a block and parses the file line by line, but for my purposes this simple implementation was enough, and it works in all cases.
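A block-accepting variant could look roughly like the sketch below. This is only an illustration: safe_parse_each is a hypothetical name, and it uses the standard library's CSV (the Ruby 1.9+ successor of FasterCSV) instead of the gem, so the exception class is CSV::MalformedCSVError.

```ruby
require 'csv' # assumption: Ruby 1.9+, where FasterCSV lives on as stdlib CSV

# Hypothetical helper, not part of FasterCSV or CSV.
# Yields each row to the block; returns true if the whole input parsed,
# or nil if a malformed row was encountered.
def safe_parse_each(data, options = {}, &block)
  CSV.parse(data, **options, &block)
  true
rescue CSV::MalformedCSVError
  nil
end

seen = []
ok  = safe_parse_each("a,b\n1,2\n") { |row| seen << row }
bad = safe_parse_each("a,b\n\"1,2\n") { |row| }
```

One caveat with this approach: rows that come before the malformed one may already have been handed to the block by the time the error is detected, so the block should not do anything irreversible.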
Upvotes: 0
Reputation: 52651
This is my current solution. I'd really like to hear about improvements or alternatives.
    # /lib/fastercsv_is_valid.rb
    class FasterCSV
      def self.is_valid?(file, options = {})
        begin
          # The empty block makes parse walk the rows without collecting them
          FasterCSV.parse(file, options) { |row| }
          true
        rescue FasterCSV::MalformedCSVError
          false
        end
      end
    end
I use that method like this:
    # /models/csv_importer.rb
    class CsvImporter
      include ActiveRecord::Validations
      validates_presence_of :file
      validate :check_file_format

      ...

      private

      def check_file_format
        errors.add :file, "Malformed CSV! Please check syntax" unless FasterCSV.is_valid?(file)
      end
    end
Upvotes: 1
Reputation: 1916
I assume you want to parse the CSV and do something with the parsed results. The worst case is that your CSV is valid and you then parse the file a second time. I would write something like this to stash away the parsed result, so you only have to parse the CSV once:
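    # FasterCSV is a class, so it has to be reopened as one
    class FasterCSV
      def self.parse_and_validate(file, options = {})
        @invalid = false
        begin
          # No block here: parse only returns the rows when called without one
          @parsed_result = FasterCSV.parse(file, options)
        rescue FasterCSV::MalformedCSVError
          @parsed_result = nil
          @invalid = true
        end
      end

      def self.is_valid?
        !@invalid
      end

      def self.parsed_result
        @parsed_result if is_valid?
      end
    end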
And then:
    class CsvImporter
      include ActiveRecord::Validations
      validates_presence_of :file
      validate :check_file_format

      # I assume you use the parsed result after the validations,
      # e.g. in a before_save or something similar
      def do_your_parse_stuff
        # here you would use FasterCSV.parsed_result
      end

      ...

      private

      def check_file_format
        FasterCSV.parse_and_validate(file)
        errors.add :file, "Malformed CSV! Please check syntax" unless FasterCSV.is_valid?
      end
    end
In the above case, you might want to move this into a separate class that takes care of communicating with FasterCSV and stashing away the parsed result, because I don't think my example is thread-safe :)
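One possible shape for such a class is sketched below. The class name CsvParseResult and its interface are made up for illustration, and it uses the standard library's CSV (the Ruby 1.9+ successor of FasterCSV) instead of the gem. Each instance keeps its own result, so nothing is stored at class level.

```ruby
require 'csv' # assumption: Ruby 1.9+, where FasterCSV lives on as stdlib CSV

# Hypothetical wrapper: parses once and remembers the outcome per instance.
class CsvParseResult
  # Array of rows if the input was valid, nil otherwise
  attr_reader :rows

  def initialize(data, options = {})
    @rows  = CSV.parse(data, **options)
    @valid = true
  rescue CSV::MalformedCSVError
    @rows  = nil
    @valid = false
  end

  def valid?
    @valid
  end
end

good = CsvParseResult.new("a,b\n1,2\n")
bad  = CsvParseResult.new("a,b\n\"1,2\n")
```

The importer could then build one CsvParseResult in its validation, add the error unless valid? is true, and read rows afterwards without parsing again.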
Upvotes: -1