Vishal Mishra

Reputation: 35

How to compare data in two CSV files

I have two CSV files which have the same structure and ideally should have the same data.

I want to compare their data using Ruby. Is there an existing Ruby function for this?

Upvotes: 3

Views: 8033

Answers (2)

Legat

Reputation: 1449

If you only want to check whether the files are identical, you can use FileUtils.identical?, which is an alias for FileUtils.compare_file:

require 'fileutils'
FileUtils.identical?('file1.csv', 'file2.csv')
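
Keep in mind that this compares file contents, not parsed CSV data, so two files holding the same records but quoted differently count as different. A minimal sketch of that distinction follows; compare_csv_files is a hypothetical helper name and the temporary files are made up for illustration:

```ruby
require 'fileutils'
require 'csv'
require 'tmpdir'

# Hypothetical helper (not part of FileUtils or CSV): reports both
# content-level equality and equality of the parsed rows.
def compare_csv_files(path_a, path_b)
  {
    identical_bytes: FileUtils.identical?(path_a, path_b),
    same_rows: CSV.read(path_a) == CSV.read(path_b)
  }
end

Dir.mktmpdir do |dir|
  a = File.join(dir, 'a.csv')
  b = File.join(dir, 'b.csv')
  File.write(a, "id,name\n1,Ann\n")
  File.write(b, "id,name\n1,\"Ann\"\n") # same data, different quoting
  p compare_csv_files(a, b)
end
```

If the two files are expected to come from the same exporter, the plain identical? check is probably enough; the parsed comparison only matters when formatting may differ.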

If you want to see the differences you might want to use diffy:

# Install the gem first (shell): gem install diffy
require 'diffy'
puts Diffy::Diff.new('file1.csv', 'file2.csv', :source => 'files')

It produces diff-like output which can be nicely formatted as HTML:

puts Diffy::Diff.new('file1.csv', 'file2.csv', :source => 'files').to_s(:html_simple)

Upvotes: 6

Martin

Reputation: 7714

As Summea commented, look at the CSV class.

Then use:

require 'csv'

# Each file is read as an array of rows; each row is an array of fields.
file1_lines = CSV.read("file1.csv")
file2_lines = CSV.read("file2.csv")

# Exclusive range (...) so the loop stops at the last row of file1.
for i in 0...file1_lines.size
  if file1_lines[i] == file2_lines[i]
    puts "Same #{file1_lines[i]}"
  else
    puts "#{file1_lines[i]} != #{file2_lines[i]}"
  end
end

Note that for loops are quite rare in Ruby. You normally iterate with each on the collection, but here there are two collections to walk in step, so an index is convenient.

Also note that one of the lists may be longer than the other, which this loop does not handle, but it should get you started.
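
One possible way to cover the unequal-length case is to loop up to the longer of the two lists and treat a missing row as nil. This is only a sketch: csv_row_differences is a hypothetical helper, and the inline sample data stands in for the two files.

```ruby
require 'csv'

# Hypothetical helper: returns [row_number, row_a, row_b] for every row
# that differs. A row missing from the shorter list appears as nil.
def csv_row_differences(rows_a, rows_b)
  longest = [rows_a.size, rows_b.size].max
  (0...longest).each_with_object([]) do |i, diffs|
    diffs << [i + 1, rows_a[i], rows_b[i]] unless rows_a[i] == rows_b[i]
  end
end

# Sample data for illustration; with real files, use CSV.read(path) instead.
rows_a = CSV.parse("id,name\n1,Ann\n2,Bob\n")
rows_b = CSV.parse("id,name\n1,Anne\n")

csv_row_differences(rows_a, rows_b).each do |row_no, a, b|
  puts "Row #{row_no}: #{a.inspect} != #{b.inspect}"
end
```

This still compares rows by position, so a single inserted row makes every later row look different; a diff tool is likely a better fit in that situation.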

Upvotes: 4
