user3818382

Reputation: 27

How to import a column of a CSV file into a Ruby array?

My goal is to import one column of a CSV file into a Ruby array. This is for a self-contained Ruby script, not an application. I'll just be running the script in Terminal and getting an output.

I'm having trouble finding the best way to import the file and to dynamically insert the file's name into that line of code. The filename will be different each time, and will be passed in by the user. I'm using $stdin.gets.chomp to ask the user for the filename, and setting it equal to file_name.

Can someone help me with this? Here's what I have for this part of the script:

require 'csv'
zip_array = CSV.read("path/to/file_name.csv")

and I need to be able to insert the proper file path above. Is this correct? And how do I get that path name in there? Maybe I'll need to totally re-structure my script, but any suggestions on how to do this?

Upvotes: 0

Views: 3678

Answers (3)

daremkd

Reputation: 8424

Okay, first problem:

a) The file name will be different on each run (I'm supposing it will always be a CSV file, right?)

You can solve this problem by creating a folder, say input_data, in the same directory as your Ruby script. Then do:

Dir.glob('input_data/*.csv')

This will produce an array of ALL files inside that folder whose names end with .csv. If we assume there will be only 1 file at a time in that folder (with a different name), we can do:

file_name = Dir.glob('input_data/*.csv')[0]

This way you'll dynamically get the file path, no matter what the file is named. If the csv file is inside the same directory as your Ruby script, you can just do:

Dir.glob('*.csv')[0]
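Taking [0] quietly returns nil when the folder is empty and picks an arbitrary file when there are several. A minimal sketch of a guard around that, assuming exactly one CSV is expected; the helper name single_csv_in is hypothetical and not part of the answer above:

```ruby
# Hypothetical helper: returns the path of the only CSV file in `dir`,
# and raises if the folder holds zero or more than one.
def single_csv_in(dir)
  matches = Dir.glob(File.join(dir, '*.csv'))
  unless matches.size == 1
    raise ArgumentError, "expected exactly one CSV in #{dir}, found #{matches.size}"
  end
  matches.first
end
```

With this, file_name = single_csv_in('input_data') either yields a usable path or fails with a message that says what went wrong.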

Now, for importing only 1 column into a Ruby array (let's suppose it's the first column):

require 'csv'
array = []
CSV.foreach(file_name) do |csv_row|
  array << csv_row[0] # [0] for the first column, [1] for the second etc.
end

What if your CSV file has headers? Suppose your column name is 'Total'. You can do:

require 'csv'
array = []
CSV.foreach(file_name, headers: true) do |csv_row|
  array << csv_row['Total']
end

Now it doesn't matter whether your column is the 1st, the 3rd, etc. As long as it has a header named 'Total', Ruby will find it.

CSV.foreach reads your file line by line, which suits big files. CSV.read loads the whole file into memory at once, but it lets you write more concise code:

array = CSV.read(file_name, headers: true).map do |csv_row|
  csv_row['Total']
end
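When headers: true is passed, CSV.read returns a CSV::Table, and indexing the table with a header name gives back that whole column as an array. A short sketch along those lines; the helper name column_values and the missing-header check are illustrative additions, not part of the answer:

```ruby
require 'csv'

# Hypothetical helper: returns every value in the column named `header`.
# Values come back as strings because no converters are requested.
def column_values(path, header)
  table = CSV.read(path, headers: true)
  unless table.headers.include?(header)
    raise KeyError, "no column named #{header.inspect}"
  end
  table[header]
end
```

For example, column_values(file_name, 'Total') would give the same array as the map version above, with an explicit error if the header is misspelled.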

Hope this helped.

Upvotes: 2

ptd

Reputation: 3053

There are two questions here, I think. The first is about getting user input from the command line. The usual way to do this is with ARGV. In your program you could do file_name = ARGV[0] so a user could type ruby your_program.rb path/to/file_name.csv on the command line.

The next is about reading CSVs. Using CSV.read will take the whole CSV, not just a single column. If you want to choose one column of many, you are likely better off doing:

require 'csv'
zip_array = []
# whichever_column is a placeholder for the zero-based column index, e.g. 0
CSV.foreach(file_name) { |row| zip_array << row[whichever_column] }
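Putting the two parts together, a complete script might look like the sketch below. The method name read_column and the default of column 0 are assumptions made for illustration:

```ruby
require 'csv'

# Hypothetical helper: collects one column (zero-based index) of the CSV.
def read_column(file_name, column_index = 0)
  zip_array = []
  CSV.foreach(file_name) { |row| zip_array << row[column_index] }
  zip_array
end

# Runs only when a path is given on the command line:
#   ruby your_program.rb path/to/file_name.csv
file_name = ARGV[0]
p read_column(file_name) if file_name
```

Keeping the reading logic in a method means the column index can be changed in one place, and the script does nothing if it is started without an argument.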

Upvotes: 2

the Tin Man

Reputation: 160551

First, you need to assign the returned value from $stdin.gets.chomp to a variable:

foo = $stdin.gets.chomp

This assigns the entered input to foo.

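You don't strictly need $stdin, though. When the script is run without command-line arguments, plain gets reads from standard input. If arguments are passed, gets reads from the files named in ARGV instead, so $stdin.gets is the safer choice in that case: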

foo = gets.chomp

At that point use the variable as your read parameter:

zip_array = CSV.read(foo)

That's all basic coding and covered in any intro book for a language.
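As a rough end-to-end sketch of the prompt-then-read approach, narrowed to a single column as the question asks. The method names, the prompt text, and the choice of the first column are all assumptions for illustration:

```ruby
require 'csv'

# Hypothetical helper: prints a prompt and returns the line the user typed.
# The input and output streams are parameters so they can be swapped out.
def ask_for_file_name(input: $stdin, output: $stdout)
  output.print 'CSV file: '
  line = input.gets
  raise ArgumentError, 'no file name entered' if line.nil? || line.strip.empty?
  line.chomp
end

# Hypothetical helper: reads the whole file, then keeps only the first column.
def first_column(file_name)
  CSV.read(file_name).map { |row| row[0] }
end
```

A script could then call zip_array = first_column(ask_for_file_name) to prompt in Terminal and build the array.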

Upvotes: 1
