Agans

Reputation: 121

How do I parse a CSV file located in an Amazon S3 bucket?

Below is the code I'm using to parse a CSV file stored within the app, but I want to parse a file located in an Amazon S3 bucket instead. It needs to work when pushed to Heroku as well.

namespace :csvimport do
  desc "Import CSV Data to Inventory."
  task :wiwt => :environment do
    require 'csv'

    csv_file_path = Rails.root.join('public', 'wiwt.csv.txt')

    CSV.foreach(csv_file_path) do |row|
      p = Wiwt.create!({
        :user_id => row[0],
        :date_worn => row[1],
        :inventory_id => row[2],
      })
    end
  end
end

Upvotes: 12

Views: 9329

Answers (4)

Brad Said

Reputation: 63

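This worked for me:

  require 'csv'
  require 'open-uri'

  # URI.open fetches the file over HTTP(S); plain Kernel#open no longer accepts URLs in Ruby 3.
  URI.open(s3_file_path) do |file|
    # CSV.new reads from the IO object; CSV.foreach expects a file path instead.
    CSV.new(file, headers: true, header_converters: :symbol).each do |row|
      Model.create(row.to_hash)
    end
  end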
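For reference, here is a small sketch of what the `headers: true, header_converters: :symbol` options do to each row. The helper name `rows_with_symbol_headers` and the sample data are made up for illustration, and a `StringIO` stands in for the file that would come from S3.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: read CSV from any IO and return one Hash per row.
# headers: true treats the first line as column names;
# header_converters: :symbol turns a header like "User ID" into :user_id.
def rows_with_symbol_headers(io)
  CSV.new(io, headers: true, header_converters: :symbol).map(&:to_h)
end

# Made-up sample data standing in for the S3 file.
sample = StringIO.new("User ID,Date Worn,Inventory ID\n1,2013-05-01,42\n2,2013-05-02,43\n")

rows_with_symbol_headers(sample).each { |attrs| puts attrs.inspect }
```

Because the keys come out as symbols matching column names, each hash can be passed straight to something like `Model.create`, provided the CSV headers line up with the model's attributes.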

Upvotes: 2

marzhaev

Reputation: 497

Sometimes the permissions on an S3 object disallow public access. Ruby's built-in functions assume the path is publicly accessible and don't account for S3-specific access control, so in that case use the AWS SDK:

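require 'aws-sdk-s3'
require 'csv'

s3 = Aws::S3::Resource.new
bucket = s3.bucket("bucket_name_here")
str = bucket.object("file_path_here").get.body.string
# col_sep: "\t" is for tab-separated data; drop it for comma-separated files.
content = CSV.parse(str, col_sep: "\t", headers: true).map(&:to_h)

Per-statement explanation using the AWS SDK: `Aws::S3::Resource.new` initializes the resource. `s3.bucket` chooses a bucket. `bucket.object(...).get.body.string` chooses an object and gets its contents as a String. The last statement is effectively CSV.parse('the string'), but I also added some options and a map over the result in case it helps you.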
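To tie this back to the rake task in the question, the sketch below maps each parsed row to the attributes used there. It assumes a comma-separated file with no header row and the column order from the question's code. `wiwt_attributes` is a hypothetical helper name, and the S3 fetch appears only as a comment so the example runs without the SDK.

```ruby
require 'csv'

# Hypothetical helper: turn raw CSV text into attribute hashes for Wiwt.
# Assumes no header row and the column order used in the question.
def wiwt_attributes(csv_text)
  CSV.parse(csv_text).map do |row|
    { user_id: row[0], date_worn: row[1], inventory_id: row[2] }
  end
end

# In the rake task, csv_text would come from S3, for example:
#   csv_text = Aws::S3::Resource.new.bucket("bucket_name_here")
#                .object("file_path_here").get.body.string
#   wiwt_attributes(csv_text).each { |attrs| Wiwt.create!(attrs) }
puts wiwt_attributes("1,2013-05-01,42\n").inspect
```

On Heroku, the SDK can pick up credentials from the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables (and the region from `AWS_REGION`), so setting those as config vars should be enough for `Aws::S3::Resource.new` to work without arguments.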

Upvotes: 16

Mike Szyndel

Reputation: 10592

You can do it like this:

require 'csv'
require 'open-uri'
# URI.open returns an IO that CSV.new reads row by row.
CSV.new(URI.open(path_to_s3)).each do |row|
  ...
end

Upvotes: 8

Hieu Pham

Reputation: 6692

You can get the CSV file from S3 like this:

require 'csv'
require 'net/http'

# Net::HTTP.get expects a URI object, not a String.
CSV.parse(Net::HTTP.get(URI(s3_file_url)), headers: true).each do |row|
  # code for processing row here
end

Upvotes: 1
