Reputation: 3412
I have this CSV file:
1,a,"first letter","[1,2,3,4]"
2,b,"second letter","[2,5,6,8]"
...
In this file, the first column is an integer, the second is an alphabetic character, the third is a string, and the fourth is an array of integers.
I often read CSV files using:
array = []
CSV.foreach("path/to/file.csv") do |row|
  array << row
end
But I need parsed data.
Is there a way to parse the fields into the correct data types while loading?
Upvotes: 1
Views: 590
Reputation: 160601
CSV can only return text, because the source file is only text, and there is nothing inside the record/line that specifies what each column type is.
If you control how the data file is created, you can serialize the data with YAML or JSON instead, and it will come back as strings and numbers. If you're willing to give up using the file from other languages, you can even get Ruby objects back. (I'd recommend sticking with a more generic serialization format, though.)
If you're stuck with CSV, then you'll need to provide code to convert the fields to the types you want, which isn't hard. Something like this untested code should get you on your way:
require 'csv'

array = []
CSV.foreach("path/to/file.csv") do |row|
  int, alpha, str, ary_of_int = row
  # Every field arrives as a String, so convert each one explicitly.
  array << [int.to_i, alpha, str, ary_of_int.scan(/\d+/).map(&:to_i)]
end
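If you stay with CSV, the conversion can also be attached to the parser through the csv library's converters: option, so rows arrive already typed. This is a minimal sketch, not a definitive implementation: it inlines the two sample rows from the question, the array_converter name is made up for illustration, and the bracket check is an assumption about what the array column looks like.

```ruby
require 'csv'
require 'json'

# Sample rows from the question, inlined so the sketch is self-contained.
data = <<~CSV_DATA
  1,a,"first letter","[1,2,3,4]"
  2,b,"second letter","[2,5,6,8]"
CSV_DATA

# Hypothetical custom converter: fields that look like "[...]" are parsed as JSON.
# Anything that is not valid JSON is returned unchanged.
array_converter = lambda do |field|
  next field unless field.is_a?(String) && field.match?(/\A\[.*\]\z/)
  begin
    JSON.parse(field)
  rescue JSON::ParserError
    field
  end
end

# :integer is one of the csv library's built-in converters;
# the lambda handles the array column.
rows = CSV.parse(data, converters: [:integer, array_converter])
```

The converters run on every field of every row, so a string column that happens to contain digits only would also become an Integer. If that matters, converting per column as in the loop above is the safer choice.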
JSON makes it easy to move data around and recover it from its serialized state:
require 'json'
ary = [
  [1, 'a', "first letter", [1,2,3,4]],
  [2, 'b', "second letter", [2,5,6,8]]
]
json_ary = JSON[ary]
puts json_ary
# >> [[1,"a","first letter",[1,2,3,4]],[2,"b","second letter",[2,5,6,8]]]
require 'pp'
pp JSON[json_ary]
# >> [[1, "a", "first letter", [1, 2, 3, 4]],
# >> [2, "b", "second letter", [2, 5, 6, 8]]]
JSON.[] checks whether the argument it receives is a string, or an array or hash. If it's a string, it attempts to parse the data. If it's an array or hash, it attempts to convert it to a JSON string.
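That dispatch can be checked against the explicit calls. The sketch below uses only the json standard library and compares JSON[] with JSON.generate and JSON.parse, which some readers may find easier to follow than the bracket form; the variable names are just for illustration.

```ruby
require 'json'

ary = [[1, 'a', 'first letter', [1, 2, 3, 4]]]

# Passing an array serializes it, the same as JSON.generate.
json_str = JSON[ary]
same_as_generate = (json_str == JSON.generate(ary))

# Passing a string parses it, the same as JSON.parse.
round_trip = JSON[json_str]
same_as_parse = (round_trip == JSON.parse(json_str))
```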
YAML works similarly:
require 'yaml'
ary = [
  [1, 'a', "first letter", [1,2,3,4]],
  [2, 'b', "second letter", [2,5,6,8]]
]
yaml_ary = ary.to_yaml
puts yaml_ary
# >> ---
# >> - - 1
# >>   - a
# >>   - first letter
# >>   - - 1
# >>     - 2
# >>     - 3
# >>     - 4
# >> - - 2
# >>   - b
# >>   - second letter
# >>   - - 2
# >>     - 5
# >>     - 6
# >>     - 8
require 'pp'
pp YAML.load(yaml_ary)
# >> [[1, "a", "first letter", [1, 2, 3, 4]],
# >> [2, "b", "second letter", [2, 5, 6, 8]]]
You could use XML, but it still only knows its content is a text node. You have to write code to interpret the XML and convert the data values to the appropriate types.
Upvotes: 2
Reputation: 7333
Not built in, however:
1,a,"first letter","[1,2,3,4]"
To get the integer I would just call .to_i. For the array you could:
require "json"
JSON.parse("[1,2,3,4]")
# => [1, 2, 3, 4]
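Putting those two conversions together for one row might look like the sketch below. The coerce_row helper is a hypothetical name, not part of any library, and it assumes the four-column layout from the question. It uses Integer() rather than to_i because to_i quietly returns 0 for non-numeric text, while Integer() raises.

```ruby
require 'json'

# Hypothetical helper: convert one row of strings in the question's layout.
def coerce_row(row)
  id, letter, description, numbers = row
  [Integer(id, 10), letter, description, JSON.parse(numbers)]
end

typed = coerce_row(['1', 'a', 'first letter', '[1,2,3,4]'])

# to_i never fails, which can hide bad data.
lenient = 'abc'.to_i

# Integer() raises ArgumentError on the same input.
strict_failed = begin
  Integer('abc', 10)
  false
rescue ArgumentError
  true
end
```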
Upvotes: 1
Reputation: 1501
There is no built-in support for this in the standard CSV library. An array like "[1,2,3,4]" is just a string to Ruby; in fact every field is a string, even the numbers. You need to do this parsing yourself.
Upvotes: 1