sergserg

Reputation: 22264

Getting spreadsheet row count using spreadsheet gem

Here's my current working code:

require 'csv'
require 'spreadsheet'

folder_to_analyze = ARGV.first
folder_path = File.join(Dir.pwd, folder_to_analyze)

unless File.directory?(folder_path)
  puts "Error: #{folder_path} is not a valid folder."
  exit
end

def get_csv_file_paths(path)
  Dir.glob(path + '/**/*.csv').each do |f|
    yield f
  end
end

def get_xls_file_path(path)
  Dir.glob(path + '/**/*.xls').each do |f|
    yield f
  end
end

csv_files = []
excel_files = []
get_csv_file_paths(folder_path) { |f| csv_files << f }
get_xls_file_path(folder_path) { |f| excel_files << f }

puts "Found #{csv_files.length + excel_files.length} files to process."

puts '==========================================='
puts 'CSV files:'
puts '==========================================='
csv_files.each do |f|
  count = IO.readlines(f).size
  puts "File: #{File.basename(f)} - Emails: #{count}"
end

puts '==========================================='
puts 'Excel files:'
puts '==========================================='
Spreadsheet.client_encoding = 'UTF-8'
excel_files.each do |f|
  count = 0

  book = Spreadsheet.open f
  book.worksheets.each do |sheet|
    sheet.each do |row|
      count += 1
    end
  end

  puts "File: #{File.basename(f)} - Emails: #{count}"
end

The Spreadsheet row count calculation is very slow: it takes about 4 seconds per Excel file.

Is there any way to speed this up? Does it have a row_count property hidden somewhere?
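To see where those 4 seconds go, it may help to time opening the workbook separately from iterating its rows. Below is a minimal sketch using only core Ruby; `timed` is a hypothetical helper name, and the workload is a stand-in so the snippet runs without any .xls file.

```ruby
# Hypothetical helper: runs the block and returns [result, elapsed_seconds].
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

# Stand-in workload. In the script above, the blocks to wrap would be the
# call that opens the workbook and the loop over each sheet's rows.
rows, seconds = timed { (1..100_000).count }
puts format('rows=%d time=%.4fs', rows, seconds)
```

If most of the time turns out to be spent opening the file rather than looping, a different way of counting rows is unlikely to change much.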

Upvotes: 1

Views: 3158

Answers (1)

There's no need to write the row loop yourself; just call count on each sheet, like this:

excel_files.each do |f|
  count = 0

  book = Spreadsheet.open f
  book.worksheets.each do |sheet|
    count += sheet.count
  end

  puts "File: #{File.basename(f)} - Emails: #{count}"
end
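A sketch of why this can work, assuming the worksheet class gets `count` from Ruby's Enumerable mixin (an assumption, not verified against the gem's source here). `FakeSheet` is a hypothetical stand-in, not part of the spreadsheet gem, and the rows are made-up sample data.

```ruby
# FakeSheet is a hypothetical stand-in for a worksheet: it only defines
# `each` and mixes in Enumerable, which then supplies `count`.
class FakeSheet
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  # Yield one row at a time, like iterating a worksheet.
  def each(&block)
    @rows.each(&block)
  end
end

sheets = [
  FakeSheet.new([%w[a@example.com], %w[b@example.com]]),
  FakeSheet.new([%w[c@example.com]])
]

# Manual loop, as in the question.
manual = 0
sheets.each { |sheet| sheet.each { |_row| manual += 1 } }

# Enumerable#count, as in the answer.
short = sheets.sum(&:count)

puts manual == short
```

Note that Enumerable#count without arguments still calls `each` internally, so if the assumption holds, the shorter form gives the same total but is not necessarily faster.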

Upvotes: 1
