Reputation: 33
I have a DataFrame with MultiIndex columns, like the following: (image: Multi-index dataframe by columns)
I would like to get three DataFrames, one named after each column (compass, accel, gyro), with the time index untouched and three columns each (df1, df2, df3).
I've tried
for index, row in df.iterrows():
but couldn't get it to work. I was also thinking of something with stack() and unstack(), but I don't really know how.
Upvotes: 3
Views: 3048
Reputation: 59549
groupby allows you to split the DataFrame along a MultiIndex level, collecting the columns that share the same level value. We then use DataFrame.xs to remove the grouping index level, leaving only the columns you care about. The separate DataFrames are stored in a dictionary, keyed by the unique level-1 values of the original column MultiIndex.
import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame(np.random.randint(1, 10, (4, 9)),
                  columns=pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                                      ['compass', 'gyro', 'accel']]))
#       df1                df2                df3
#   compass gyro accel compass gyro accel compass gyro accel
# 0       3    3     7       2    4     7       2    1     2
# 1       1    1     4       5    1     1       5    2     8
# 2       4    3     5       8    3     5       9    1     8
# 3       4    5     7       2    6     7       3    2     9
# Group the columns by level 1 (the sensor name), then drop that level from each group
d = {idx: gp.xs(idx, level=1, axis=1) for idx, gp in df.groupby(level=1, axis=1)}
d['gyro']
#    df1  df2  df3
# 0    3    4    1
# 1    1    1    2
# 2    3    3    1
# 3    5    6    2
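As far as I know, passing axis=1 to DataFrame.groupby is deprecated in recent pandas releases (2.1 and later), so the dictionary comprehension above may warn or fail there. A minimal sketch of the same split without groupby, assuming the frame shown above (written out with explicit values instead of the random seed), could look like this:

```python
import numpy as np
import pandas as pd

# Same frame as in the answer, with the values written out explicitly.
values = np.array([[3, 3, 7, 2, 4, 7, 2, 1, 2],
                   [1, 1, 4, 5, 1, 1, 5, 2, 8],
                   [4, 3, 5, 8, 3, 5, 9, 1, 8],
                   [4, 5, 7, 2, 6, 7, 3, 2, 9]])
columns = pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                      ['compass', 'gyro', 'accel']])
df = pd.DataFrame(values, columns=columns)

# One cross-section per unique level-1 label. xs drops that level,
# leaving df1/df2/df3 as columns and the row index untouched.
sensors = df.columns.get_level_values(1).unique()
d = {name: df.xs(name, level=1, axis=1) for name in sensors}

print(d['gyro'])
```

This skips the grouping step entirely and takes one cross-section per sensor name, so d['gyro'] should hold the same three columns as in the output above.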
Since a groupby makes such splits readily available, you may not even need to store the separate DataFrames; you can manipulate each of them separately with GroupBy.apply.
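To illustrate working on each group without storing it, here is a rough sketch that computes a per-sensor mean across df1, df2 and df3 for every row. Note that it swaps in a transpose plus a plain aggregation (mean) rather than GroupBy.apply with axis=1, and the name sensor_means is only illustrative:

```python
import numpy as np
import pandas as pd

# Same frame as in the answer, with the values written out explicitly.
values = np.array([[3, 3, 7, 2, 4, 7, 2, 1, 2],
                   [1, 1, 4, 5, 1, 1, 5, 2, 8],
                   [4, 3, 5, 8, 3, 5, 9, 1, 8],
                   [4, 5, 7, 2, 6, 7, 3, 2, 9]])
columns = pd.MultiIndex.from_product([['df1', 'df2', 'df3'],
                                      ['compass', 'gyro', 'accel']])
df = pd.DataFrame(values, columns=columns)

# Transpose so the column MultiIndex becomes the row index, group on the
# sensor level, aggregate, then transpose back to the original orientation.
sensor_means = df.T.groupby(level=1).mean().T

print(sensor_means)
```

The result has one column per sensor (sorted alphabetically by groupby) and keeps the original row index.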
Upvotes: 2
Reputation: 60
You can save the first 3 columns to a CSV file, then repeat the process two more times for the other CSV files.
You can select the 3 columns for your DataFrame like this:
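import pandas as pd
x = 0
# line_header is assumed to be defined elsewhere (the header rows to skip)
data = pd.read_csv('file.csv', keep_default_na=False, skiprows=line_header, na_filter=False, usecols=[x, x+1, x+2])[['compass', 'accel', 'gyro']]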
where x is the position of the first column you want from the "big dataframe".
The usecols parameter is really useful in this case.
You can read more in the pandas.read_csv documentation.
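As a small, self-contained sketch of the usecols idea, the snippet below reads an in-memory CSV instead of a real file. The CSV text, the column names (time, df1_compass, ...) and the helper read_block are all hypothetical and only serve the illustration:

```python
import io
import pandas as pd

# Hypothetical CSV: a time column followed by three sensor columns per device.
csv_text = (
    "time,df1_compass,df1_gyro,df1_accel,df2_compass,df2_gyro,df2_accel\n"
    "0,3,3,7,2,4,7\n"
    "1,1,1,4,5,1,1\n"
)

def read_block(first_col):
    """Read the time column plus three consecutive columns starting at first_col."""
    frame = pd.read_csv(io.StringIO(csv_text),
                        usecols=[0, first_col, first_col + 1, first_col + 2])
    # Keep the time column as the index so it stays untouched.
    return frame.set_index("time")

df1 = read_block(1)  # columns 1..3
df2 = read_block(4)  # columns 4..6

print(df2)
```

Three consecutive columns give one device's block here; to get one frame per sensor instead, usecols would need the matching position from each block (for example [0, 1, 4] for the compass columns in this made-up layout).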
Upvotes: 1