Reputation: 353
I am confused: loading a CSV works without error, and the index and column names show up, but there are no values in my DataFrame. If I download this CSV, convert it to Excel, load that into pandas, write it back to CSV (df.to_csv) and load that CSV again, it works fine and loads as a proper DataFrame. There must be something about the original CSV that I don't understand. My 'problem' is in fact solved by all this converting, but I would like to understand what is wrong and what I have to learn.
So it would be great if someone knows what I am doing wrong here. Thanks!
link = 'https://www.vektis.nl/uploads/Docs%20per%20pagina/Open%20Data%20Bestanden/2018/Vektis%20Open%20Databestand%20Zorgverzekeringswet%202018%20-%20postcode3.csv'
df = pd.read_csv(link)
df.shape
(137099, 1)
df.info() looks weird and df.describe() is empty.
As said: converting the original CSV to xlsx, loading that into pandas and converting it to CSV gives a DataFrame with values:
df2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 137099 entries, 0 to 137098
Data columns (total 28 columns):
GESLACHT 137098 non-null object
LEEFTIJDSKLASSE 137098 non-null object
POSTCODE_3 137098 non-null float64
AANTAL_BSN 137099 non-null int64
AANTAL_VERZEKERDEJAREN 137099 non-null float64
KOSTEN_MEDISCH_SPECIALISTISCHE_ZORG 137099 non-null float64
KOSTEN_FARMACIE 137099 non-null float64
KOSTEN_SPECIALISTISCHE_GGZ 137099 non-null float64
KOSTEN_HUISARTS_INSCHRIJFTARIEF 137099 non-null float64
KOSTEN_HUISARTS_CONSULT 137099 non-null float64
KOSTEN_HUISARTS_MDZ 137099 non-null float64
KOSTEN_HUISARTS_OVERIG 137099 non-null float64
KOSTEN_HULPMIDDELEN 137099 non-null float64
KOSTEN_MONDZORG 137099 non-null float64
KOSTEN_PARAMEDISCHE_ZORG_FYSIOTHERAPIE 137099 non-null float64
KOSTEN_PARAMEDISCHE_ZORG_OVERIG 137099 non-null float64
KOSTEN_ZIEKENVERVOER_ZITTEND 137099 non-null float64
KOSTEN_ZIEKENVERVOER_LIGGEND 137099 non-null float64
KOSTEN_KRAAMZORG 137099 non-null float64
KOSTEN_VERLOSKUNDIGE_ZORG 137099 non-null float64
KOSTEN_GENERALISTISCHE_BASIS_GGZ 137099 non-null float64
KOSTEN_LANGDURIGE_GGZ 137099 non-null float64
KOSTEN_GRENSOVERSCHRIJDENDE_ZORG 137099 non-null float64
KOSTEN_EERSTELIJNS_ONDERSTEUNING 137099 non-null float64
KOSTEN_GERIATRISCHE_REVALIDATIEZORG 137099 non-null float64
KOSTEN_EERSTELIJNSVERBLIJF 137099 non-null float64
KOSTEN_VERPLEGING_EN_VERZORGING 137099 non-null float64
KOSTEN_OVERIG 137099 non-null float64
dtypes: float64(25), int64(1), object(2)
memory usage: 29.3+ MB
Upvotes: 0
Views: 73
Reputation: 7594
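You just need to pass the separator, which is ';' in your case. pd.read_csv splits on ',' by default, so every line of this semicolon-delimited file was read into a single column (hence the shape (137099, 1)).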
import pandas as pd
link = 'https://www.vektis.nl/uploads/Docs%20per%20pagina/Open%20Data%20Bestanden/2018/Vektis%20Open%20Databestand%20Zorgverzekeringswet%202018%20-%20postcode3.csv'
df = pd.read_csv(link, sep=';')
print(df)
GESLACHT LEEFTIJDSKLASSE POSTCODE_3 ... KOSTEN_EERSTELIJNSVERBLIJF KOSTEN_VERPLEGING_EN_VERZORGING KOSTEN_OVERIG
0 NaN NaN NaN ... 60376.04 637668.87 496931.54
1 M 0 0.0 ... 0.00 121744.76 890.41
2 M 0 101.0 ... 0.00 565.22 154.32
3 M 0 102.0 ... 0.00 342.72 77.16
4 M 0 103.0 ... 0.00 11192.82 2498.61
... ... ... ... ... ... ... ...
137094 V 90+ 995.0 ... 17126.82 230642.72 0.00
137095 V 90+ 996.0 ... 15504.98 133670.79 0.00
137096 V 90+ 997.0 ... 9608.72 172186.49 0.00
137097 V 90+ 998.0 ... 37083.13 733906.73 1083.82
137098 V 90+ 999.0 ... 26639.36 99737.32 0.00
[137099 rows x 28 columns]
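To see the effect without downloading the file, here is a minimal sketch on a tiny in-memory sample. The sample rows are made up for illustration (only the column names are borrowed from the real file), and the sep=None variant is an alternative worth knowing about when you are not sure which delimiter a file uses, not something the original file requires.

```python
import io

import pandas as pd

# Hypothetical three-row sample mimicking the semicolon-delimited layout.
sample = (
    "GESLACHT;LEEFTIJDSKLASSE;AANTAL_BSN\n"
    "M;0;120\n"
    "V;0;115\n"
    "V;90+;14\n"
)

# Default sep="," finds no commas, so each line lands in one wide column.
df_default = pd.read_csv(io.StringIO(sample))
print(df_default.shape)          # (3, 1)
print(list(df_default.columns))  # ['GESLACHT;LEEFTIJDSKLASSE;AANTAL_BSN']

# Passing the real delimiter splits the lines into proper columns.
df_semicolon = pd.read_csv(io.StringIO(sample), sep=";")
print(df_semicolon.shape)        # (3, 3)

# sep=None with the Python engine asks pandas to sniff the delimiter itself.
df_sniffed = pd.read_csv(io.StringIO(sample), sep=None, engine="python")
print(df_sniffed.shape)
```

A quick check when a load looks wrong: if df.shape reports one column and that column's name contains ';' (or tabs), the delimiter is the likely culprit.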
Upvotes: 1