chicjean

Reputation: 77

CSV file encoding in Rails with S3 and Heroku

My Rails app uploads CSV files to S3, then pulls them down into a tempfile so it can send each row's data to a Sidekiq worker. I'm using CarrierWave and fog to handle the uploading.

This all worked beautifully until I recently switched to Heroku. Now, when I try to create my tempfile, I get the following error:

Error type Encoding::UndefinedConversionError
Error message "\xA2" from ASCII-8BIT to UTF-8

I've tried setting the encoding both when creating the tempfile and when reading the CSV file, and I still get the same error. I cannot reproduce this error on my local machine, which has made this entire process that much more fun :)

Currently, my Sidekiq worker calls the following method:

def upload_csv(filename, file_path)
  file = Tempfile.new(filename, Rails.root.join('tmp'), encoding: "ISO8859-1:utf-8").tap do |f|
    open(file_path).rewind
    f.write(open(file_path).read)
    f.close
  end

  CSV.foreach(file, headers: true, encoding: "ISO8859-1:utf-8") do |row|
    # do stuff to rows
  end
end

I understand the very basics of encoding, but I'm super stuck on this. Any insight would be appreciated.

Thanks!

Upvotes: 0

Views: 887

Answers (1)

chicjean

Reputation: 77

Not sure if this will help anyone else, but I found a solution that works for me:

def upload_csv(filename, file_path)
  file = Tempfile.new(filename, Rails.root.join('tmp')).tap do |f|
    open(file_path).rewind
    # force_encoding only relabels the downloaded bytes as UTF-8; it does not transcode them
    f.write(open(file_path).read.force_encoding('utf-8'))
    f.close
  end

  CSV.foreach(file, headers: true) do |row|
    # do stuff to rows
  end
end

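Even though I had confirmed that the file was UTF-8 encoded before it was uploaded, open(file_path).read.encoding was returning ASCII-8BIT. When that string was written to the tempfile, Ruby was apparently trying to convert it from ASCII-8BIT to UTF-8, which fails on any byte above 0x7F. Calling force_encoding('utf-8') relabels the bytes as UTF-8 without converting them, so the write goes through.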
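
For anyone who wants to see the difference between relabelling and converting in isolation, here is a minimal sketch outside of Rails. The to_utf8 helper and the ISO-8859-1 fallback are my own assumptions for illustration, not part of the code above. The fallback only matters if the downloaded bytes turn out not to be valid UTF-8 after all, and the valid_encoding? check is a heuristic, not a guarantee.

```ruby
# Hypothetical helper (not from the original code): relabel raw bytes as UTF-8,
# or transcode them from a fallback encoding if they are not valid UTF-8.
# The ISO-8859-1 fallback is an assumption chosen for illustration.
def to_utf8(raw, fallback: Encoding::ISO_8859_1)
  text = raw.dup.force_encoding(Encoding::UTF_8)   # relabel only, bytes unchanged
  return text if text.valid_encoding?

  raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)  # real conversion
end

# The same text as it might arrive from a download: bytes tagged ASCII-8BIT.
utf8_bytes   = "price,5\u00A2\n".b   # cent sign stored as the UTF-8 bytes C2 A2
latin1_bytes = "price,5\xA2\n".b     # cent sign stored as the single byte A2

# Converting (not relabelling) a binary string fails on the first byte > 0x7F.
conversion_error = nil
begin
  utf8_bytes.encode(Encoding::UTF_8)
rescue Encoding::UndefinedConversionError => e
  conversion_error = e
end

puts conversion_error.class
puts to_utf8(utf8_bytes).encoding
puts to_utf8(latin1_bytes).valid_encoding?
```

If the helper takes the fallback branch for your file, the upload was probably not UTF-8 to begin with, and force_encoding('utf-8') alone would leave you with a string that is labelled UTF-8 but contains invalid bytes.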

Upvotes: 1
