Uwe L. Korn

Reputation: 8796

How to parse CSV files with double-quoted strings in Julia?

I want to read CSV files where the columns are separated by commas. The columns can be strings and if those strings contain a comma in their content, they are wrapped in double-quotes. Currently I'm loading my data using:

file = open("data.csv","r")
data = readcsv(file)

But this code would split the following line into 5 pieces, whereas it should only be 4:

1,"text, more text",3,4

Is there a way in Julia's Standard Library to parse CSV while respecting quoting or do I have to write my own custom solution?

Upvotes: 3

Views: 1596

Answers (2)

tan

Reputation: 41

The readcsv function in base (0.3 prerelease) can now read quoted columns.

julia> readcsv(IOBuffer("1,\"text, more text\",3,4"))
1x4 Array{Any,2}:
 1.0  "text, more text"  3.0  4.0

It is much simpler than DataFrames, but it may be quicker if you just need the data as an array.
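
To illustrate what respecting quotes means for a parser, here is a minimal hand-written sketch of a quote-aware line splitter. It is not how readcsv is implemented. The function name split_csv_line is hypothetical, the code assumes Julia 1.x syntax, and it deliberately ignores escaped quotes ("") and quoted fields that span several lines.

```julia
# Hypothetical helper (not part of Julia's standard library):
# split one CSV line on commas, but treat commas inside double quotes as text.
function split_csv_line(line::AbstractString)
    fields = String[]
    buf = IOBuffer()          # collects the characters of the current field
    in_quotes = false
    for c in line
        if c == '"'
            # Toggle the quoted state; the quote character itself is dropped.
            in_quotes = !in_quotes
        elseif c == ',' && !in_quotes
            # An unquoted comma ends the current field.
            push!(fields, String(take!(buf)))
        else
            write(buf, c)
        end
    end
    push!(fields, String(take!(buf)))   # the last field has no trailing comma
    return fields
end

fields = split_csv_line("1,\"text, more text\",3,4")
println(length(fields))   # number of fields found
println(fields[2])        # the quoted field, with its comma intact
```

All fields come back as strings here; converting the numeric ones (for example with parse or tryparse) is left out to keep the sketch short.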

Upvotes: 4

astrieanna

Reputation: 768

The readcsv function in base is super-basic (just blindly splitting on commas).

You will probably be happier with readtable from the DataFrames.jl package: http://juliastats.github.io/DataFrames.jl/io.html

To use the package, you just need to run Pkg.add("DataFrames") and then import it with `using DataFrames`.

Upvotes: 7
