Seanny123

Reputation: 9346

Read CSV into array

In Julia, using CSV.jl, it is possible to read a DataFrame from a .csv file:

using CSV, DataFrames

df = CSV.read("data.csv", DataFrame; delim=",")

However, how can I instead read a CSV file into a Vector{Float64} data type?

Upvotes: 18

Views: 13373

Answers (4)

mbauman

Reputation: 31342

You can ask CSV.read to use a Matrix as its destination in one go with:

julia> import CSV

julia> s = """
       1,2,3
       4,5,6
       7,8,9""";

julia> CSV.read(IOBuffer(s), CSV.Tables.matrix; header=false)
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

Do note that there's a currently-outstanding issue to directly use the builtin Matrix type itself as the "sink", which would make this slightly more discoverable.

Upvotes: 2

Shep Bryan

Reputation: 669

To summarize Bogumił's answer, you can use:

using DelimitedFiles
data = readdlm("data.csv", ',', Float64)

Upvotes: 3

Bogumił Kamiński

Reputation: 69839

You can use the DelimitedFiles module from stdlib:

julia> using DelimitedFiles

julia> s = """
       1,2,3
       4,5,6
       7,8,9"""
"1,2,3\n4,5,6\n7,8,9"

julia> b = IOBuffer(s)
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=17, maxsize=Inf, ptr=1, mark=-1)

julia> readdlm(b, ',', Float64)
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0

I read from an IOBuffer here so that the example is fully reproducible, but you can read from a file in the same way. The readdlm docstring has more details on the available options.

Note that this gives you a Matrix{Float64}, not a Vector{Float64}, which I assume is what you actually wanted. If not, call vec on the result after reading the data to turn the matrix into a vector.
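As a minimal sketch of that last step (the sample data is made up for illustration): vec flattens in column-major order, so the values come out column by column rather than row by row.

```julia
using DelimitedFiles

# Hypothetical sample data, standing in for the contents of a CSV file.
s = "1,2,3\n4,5,6"

# readdlm returns a 2×3 Matrix{Float64} here.
m = readdlm(IOBuffer(s), ',', Float64)

# vec reshapes the matrix into a Vector{Float64}.
# Julia arrays are column-major, so columns are traversed first.
v = vec(m)                     # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]

# To get the values row by row instead, transpose before flattening.
v_rows = vec(permutedims(m))   # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

If the file holds a single column of numbers, the order question does not arise and vec(m) is simply that column.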

EDIT

This is how you can read back a Matrix using CSV.jl:

julia> using CSV, DataFrames, Tables

julia> df = DataFrame(rand(2,3), :auto)
2×3 DataFrame
│ Row │ x1        │ x2       │ x3       │
│     │ Float64   │ Float64  │ Float64  │
├─────┼───────────┼──────────┼──────────┤
│ 1   │ 0.0444818 │ 0.570981 │ 0.608709 │
│ 2   │ 0.47577   │ 0.675344 │ 0.500577 │

julia> CSV.write("test.csv", df)
"test.csv"

julia> CSV.File("test.csv") |> Tables.matrix
2×3 Array{Float64,2}:
 0.0444818  0.570981  0.608709
 0.47577    0.675344  0.500577

Upvotes: 15

hckr

Reputation: 5583

You can convert your DataFrame to a Matrix of a given element type. This works as long as there is no missing data; if there is, omit the element type and call Matrix(df) instead.

arr = Matrix{Float64}(df)

You can call vec on the result to get a vector if that is really what you want.

Depending on the file, I would go with readdlm as suggested in the previous answer.
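A small sketch of the DataFrame route, assuming a recent DataFrames.jl in which the Matrix constructor performs the conversion. The column names and values are made up for illustration.

```julia
using DataFrames

# Hypothetical data frame with only Float64 columns.
df = DataFrame(a = [1.0, 2.0], b = [3.0, 4.0])

arr = Matrix{Float64}(df)   # 2×2 Matrix{Float64}
v = vec(arr)                # column by column: [1.0, 2.0, 3.0, 4.0]

# With missing values, leave the element type out and let it be inferred.
df_missing = DataFrame(a = [1.0, missing], b = [3.0, 4.0])
arr_missing = Matrix(df_missing)   # element type Union{Missing, Float64}
```

Asking for Matrix{Float64} on a data frame that contains missing values throws an error, which is why the element type is left out in the second case.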

Upvotes: 4
