Reputation: 1224
I have this gz file from dati.istat.it. Inside it is a CSV file (with a different name) that I want to load directly into a pandas DataFrame.
If I unzip it with 7-Zip first, I can load it easily with this code:
pd.read_csv("DCCV_OCCUPATIT_Data+FootnotesLegend_175b2401-3654-4673-9e60-b300989088bb.csv", sep="|", engine = "python")
How can I do this without unzipping it with 7-Zip first?
Thanks a lot!
Upvotes: 10
Views: 13295
Reputation: 862841
You can use the zipfile library:
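import pandas as pd
import zipfile
# zipfile.ZipFile reads the file as a zip archive, whatever its extension
z = zipfile.ZipFile('test/file.gz')
# Open the CSV member by name and pass the file object to read_csv
print(pd.read_csv(z.open("DCCV_OCCUPATIT_Data+FootnotesLegend_175b2401-3654-4673-9e60-b300989088bb.csv"),
                  sep="|",
                  engine="python"))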
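If the name of the CSV inside the archive is not known in advance, ZipFile.namelist() can be used to look it up. The following is only a minimal sketch, assuming the archive holds at least one member ending in .csv; the helper name first_csv_member and the demo file names are hypothetical and not part of the original answer:

```python
import os
import tempfile
import zipfile


def first_csv_member(archive_path):
    """Return the name of the first member ending in .csv (hypothetical helper)."""
    with zipfile.ZipFile(archive_path) as z:
        for name in z.namelist():
            if name.lower().endswith(".csv"):
                return name
    raise ValueError("no CSV member found in " + archive_path)


# Demo only: build a small zip archive that carries a misleading .gz extension.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "file.gz")
with zipfile.ZipFile(demo_path, "w") as z:
    z.writestr("data.csv", "a|b\n1|2\n")

# Look up the member name, then read its first line through z.open().
member = first_csv_member(demo_path)
with zipfile.ZipFile(demo_path) as z:
    first_line = z.open(member).readline().decode("utf-8").strip()
```

The file object returned by z.open(member) can be handed to pd.read_csv with sep="|" in the same way as in the snippet above.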
At the time of this answer, pandas supported only gzip and bz2 in read_csv:
compression : {‘gzip’, ‘bz2’, ‘infer’, None}, default ‘infer’
For on-the-fly decompression of on-disk data. If ‘infer’, then use gzip or bz2 if filepath_or_buffer is a string ending in ‘.gz’ or ‘.bz2’, respectively, and no decompression otherwise. Set to None for no decompression.
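Since zipfile can open the download, it is presumably a zip archive despite its .gz extension, which would explain why gzip decompression inferred from the file name does not help. A rough sketch for checking what a file really contains, using only the standard library; the helper name detect_container and the demo files are hypothetical:

```python
import gzip
import os
import tempfile
import zipfile


def detect_container(path):
    """Return 'zip', 'gzip', or 'unknown' based on the file contents."""
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as f:
        if f.read(2) == b"\x1f\x8b":  # the two-byte gzip magic number
            return "gzip"
    return "unknown"


# Demo only: two files that both end in .gz but use different containers.
demo_dir = tempfile.mkdtemp()

zip_path = os.path.join(demo_dir, "really_zip.gz")
with zipfile.ZipFile(zip_path, "w") as z:
    z.writestr("data.csv", "a|b\n1|2\n")

gz_path = os.path.join(demo_dir, "really_gzip.gz")
with gzip.open(gz_path, "wb") as g:
    g.write(b"a|b\n1|2\n")

zip_kind = detect_container(zip_path)
gz_kind = detect_container(gz_path)
```

For a genuine gzip file, pd.read_csv(path, compression="gzip", sep="|") should work directly. More recent pandas releases also accept compression="zip" for archives containing a single file; check the read_csv documentation for the installed version before relying on it.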
Upvotes: 8