DNorthrup
DNorthrup

Reputation: 847

Using Roo with Ruby(Rails) to parse Excel

I'm trying to allow users to upload a CSV/Excel document, and parse it using Roo (The most suggested one I've seen), but I'm having a bit of issues figuring it out.

Current Script

require 'roo'

xlsx = Roo::Excelx.new("./TestAppXL.xlsx")
xlsx.each_row_streaming do |row|
  puts row.inspect # Array of Excelx::Cell objects
end

This was the only one I was able to get work - It returns what looks to be JSONB.

What I'm trying to do is a few part process:

A) User Uploads a list of 'cards' to my website.(Trying to allow as many options as possible. CSV, Excel, etc)

B) It instantly returns a list of the headers and asks 'Which header is name, quantity, etc etc?'

C) I parse the data for specifics headers and do 'X'.

B is what I primarily need help with. I'm struggling to figure out Roo exactly. I won't have control over the headers so I can't use numerical column numbers.

(Adding in Rails tag since this will be in a controller in the end, maybe an easier way to do it.)

Updated Script

I've actually made a lot of progress. Still trying to get closer to my original request.

require 'roo'
require 'roo-xls'

xlsx = Roo::Spreadsheet.open('Demo.xls')
headers =  xlsx.first_row
puts xlsx.row(headers)
puts "Which number header is the Card Name?"
CardName = gets
puts xlsx.column(CardName.to_i)
# => Returns basic info about the spreadsheet file

Need a lot more logic on the gets but currently if I put in '3' it will return all content of Column 'CardName'. Working on iterating over the rows now.

Psuedo working script

require 'roo'
require 'roo-xls'

xlsx = Roo::Spreadsheet.open('Demo.xls')
headers =  xlsx.first_row
puts xlsx.row(headers)
puts "Which number header is the Card Name?"
CardName = gets.to_i
specHeader = xlsx.cell(headers,CardName)
xlsx.column(CardName).drop(0).each_with_index do |item, index|
    if index == 0

    else
        puts item
    end
end

This is actually performing as expected, and I can start feeding the file into a Rake job now. Still working on some of the iteration but this is very close.

Upvotes: 4

Views: 9906

Answers (1)

peter
peter

Reputation: 42192

I made you a generic way to extract data out of a Roo spreadsheet based on a few header names which would be the convention to use by your uploaders.

require 'roo'
require 'roo-xls'

xlsx = Roo::Spreadsheet.open('Demo.xls')
first_row = xlsx.first_row
headers = ['CardName', 'Item']
headers.each{|h|Kernel.const_set(h, xlsx.row(first_row).index{|e| e =~ /#{h}/i})}
begin
  xlsx.drop(first_row).each do |row|
    p [row[CardName], row[Item]]
  end
rescue
  # the required headers are not all present
end

I suppose the only line that needs explaining is headers.each{|h|Kernel.const_set(h, xlsx.row(first_row).index{|e| e =~ /#{h}/i})}

for each headername assign to it with const_set the index of it in xlsx.row(first_row) (our headerrow) where the regular expression /h/i returns an index, the #{} around h is to expand the h into its value, 'CardName' in the first case, the i at the end of /h/i means the case is to be ignored, so the constant CardName is assigned the index of the string CardName in the headerrow. Instead of the rather clumsy begin rescue structure you could check if all required constants are present with const_get and act upon that instead of catching the error.

EDIT

instead of the p [row[CardName], row[Item]] you could check and do anything, only keep in mind that if this is going to be part of a Rails or other website the interaction with the user is going to be tickier than your puts and get example. Eg something like

headers = ['CardName', 'Item', 'Condition', 'Collection']
...
xlsx.drop(first_row).each do |row|
  if row[CardName].nil? || row[Item].nil?
    # let the user know or skip
  else
    condition, collection = row[Condition], row[Collection]
    # and do something with it
  end
end

Upvotes: 4

Related Questions