Reputation: 6798
I am trying to read in several csv files with an unfortunate structure. Here's a simplified example:
[empty], A, A, B, B
time , X, Y, X, Y
0.0 , 0, 0, 0, 0
1.0 , 2, 5, 7, 0
... , ., ., ., .
Using pandas.read_csv with the header=[0,1] argument, I can access the values fine:
>>> df = pd.read_csv('file.csv', header=[0,1])
>>> df.A.X
0 0
1 2
...
But the empty field above the time header results in an ugly Unnamed: 0_level_0 level:
>>> df.columns
MultiIndex(levels=[['Unnamed: 0_level_0', 'A', 'B'], ...
Is there any way to fix this, so I can access the time data with df.Time again?
EDIT:
This is a snippet of the actual data set:
,,Bone,Bone,Bone
,,Skeleton1_Hip,Skeleton1_Hip,Skeleton1_Hip
,,"1","1","1"
,,Rotation,Rotation,Rotation
Frame,Time,X,Y,Z
0,0.000000,0.009332,0.999247,0.021044
1,0.008333,0.009572,0.999217,0.020468
3,0.016667,0.009871,0.999183,0.019797
(see also: https://gist.github.com/fhaust/25ba612f99420d366f0597b15dbf43e7 for a more complete example)
I read it via:
pd.read_csv(file, skiprows=2, header=[0,1,3,4], index_col=[1])
I don't really care about the Frame column, as it's given implicitly with the row index.
Upvotes: 4
Views: 469
Reputation: 863256
Add the index_col parameter to convert the first column to the index:
import pandas as pd
from io import StringIO
temp = u""",A,A,B,B
time,X,Y,X,Y
0.0,0,0,0,0
1.0,2,5,7,0"""
# pd.compat.StringIO is no longer available in current pandas, so io.StringIO is used instead
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), header=[0,1], index_col=[0])
print(df)
      A     B
time  X  Y  X  Y
0.0   0  0  0  0
1.0   2  5  7  0
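With index_col=[0] the time values end up in the row index rather than in a column, so they are reached through df.index instead of df.time. A minimal sketch of that, reusing the sample data from above (the names csv_text, times and value are only illustrative):

```python
import pandas as pd
from io import StringIO

# Same sample data as above; csv_text is an illustrative name.
csv_text = """,A,A,B,B
time,X,Y,X,Y
0.0,0,0,0,0
1.0,2,5,7,0"""

df = pd.read_csv(StringIO(csv_text), header=[0, 1], index_col=[0])

# The time values are now the row labels of the frame.
times = df.index.to_numpy()

# Pick a single value by time label and (top level, second level) column.
value = df.loc[1.0, ("B", "X")]

print(times)
print(value)
```

The data columns stay reachable as before, for example df.A.X or df[("A", "X")].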
Or read the file without index_col and rename the unnamed level:
df = df.rename(columns={'Unnamed: 0_level_0':'val'})
print (df)
   val  A     B
  time  X  Y  X  Y
0  0.0  0  0  0  0
1  1.0  2  5  7  0
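The rename call depends on the exact auto-generated label. As a possible alternative, the column tuples can be rebuilt so that any top-level label starting with "Unnamed:" is replaced by the label beneath it. This is only a sketch under the assumption that the file has two header rows as in the simplified example; csv_text and relabel are hypothetical names, not pandas functions:

```python
import pandas as pd
from io import StringIO

# Same sample data as above; csv_text is an illustrative name.
csv_text = """,A,A,B,B
time,X,Y,X,Y
0.0,0,0,0,0
1.0,2,5,7,0"""

df = pd.read_csv(StringIO(csv_text), header=[0, 1])


def relabel(top, low):
    """Hypothetical helper: swap an auto-generated top label for the lower one."""
    if str(top).startswith("Unnamed:"):
        return (low, low)
    return (top, low)


# Rebuild the two-level column index from the cleaned tuples.
df.columns = pd.MultiIndex.from_tuples(
    [relabel(top, low) for top, low in df.columns]
)

# The time values are now addressed without the Unnamed label.
time_values = df[("time", "time")]
print(df.columns.tolist())
```

After this, df[("time", "time")] returns the time values as a Series, while df.A.X keeps working as before.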
Upvotes: 1