The Unfun Cat

Reputation: 31908

How do I read a gzipped CSV file in Julia?

I have tried many libraries, but it seems that I cannot get the types to match.

Typical attempt:

using SomeLib, CSV
fh = SomeLib.open("gzipped_file.gz")
CSV.read(fh) # error

Example:

using CSV, CodecZlib
CSV.read(GzipDecompressorStream(open("gzipped_file.gz")))
# ERROR: MethodError: no method matching position(::TranscodingStreams.TranscodingStream{GzipDecompressor,IOStream})

Upvotes: 8

Views: 2810

Answers (4)

bicycle1885

Reputation: 468

My new package TableReader.jl supports transparent gzip, xz, and zstd decompression. So, the following code will work as you expect:

using TableReader

readcsv("path/to/file.csv.gz")
readcsv("path/to/file.csv.xz")
readcsv("path/to/file.csv.zst")

Upvotes: 0

harryscholes

Reputation: 1667

Even simpler:

using CSVFiles, DataFrames
df = DataFrame(load(File(format"CSV", "data.csv.gz")))

Upvotes: 0

user2317421

Reputation:

Adding to Bogumił's answer, you can do the following as well:

using CSV
using GZip

df = GZip.open("some_file.csv.gz", "r") do io
    CSV.read(io)
end

Upvotes: 4

Bogumił Kamiński

Reputation: 69839

In the meantime you can use CSVFiles.jl:

using CSVFiles, DataFrames, FileIO, CodecZlib  # CodecZlib provides GzipDecompressorStream

open("yourfile.csv.gz") do io
    load(Stream(format"CSV", GzipDecompressorStream(io))) |> DataFrame
end
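
Another option, if the file fits in memory, is to decompress it into a byte vector first and parse the bytes, which sidesteps the missing position method from the question. This is only a minimal sketch: it assumes a recent CSV.jl release where CSV.read takes a sink argument such as DataFrame and accepts a Vector{UInt8} as input, and the file name example.csv.gz is made up for illustration.

```julia
using CSV, CodecZlib, DataFrames

# Create a small gzipped CSV so the sketch is self-contained
# (hypothetical file name, written to a temporary directory).
path = joinpath(mktempdir(), "example.csv.gz")
write(path, transcode(GzipCompressor, Vector{UInt8}("a,b\n1,2\n3,4\n")))

# Decompress the whole file into memory, then parse the raw bytes.
# A byte vector supports random access, so CSV.jl never has to call
# `position` on a decompression stream.
bytes = transcode(GzipDecompressor, read(path))
df = CSV.read(bytes, DataFrame)
```

The trade-off is that the entire decompressed file is held in memory at once, so the streaming approach above may be preferable for very large files.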

Upvotes: 7
