Reputation: 4498
I am trying to create new dataframes df_A, df_B and df_C from an existing dataframe df, based on the categorical values (A, B and C) in the column Category.
This doesn't work
df_A = {n: df.ix[rows]
for n, rows in enumerate(df.groupby('Category').groups)}
Here I get the error "Key Error: A"
(Note: A is one of the categories)
This doesn't work either
df_A = np.where(df['Category']=='A')).copy()
Here I get the error: "syntax error"
Finally, this doesn't work
df_A = np.where(raw[raw['Category']=='A']).copy()
"AttributeError: 'tuple' object has no attribute 'copy'"
Thank You
Upvotes: 2
Views: 1474
Reputation: 863631
It seems you need boolean indexing first, because Category is a column, not the index.
If you need a dictionary:
df2 = {n: data[data['Category'] == rows]
       for n, rows in enumerate(data.groupby('Category').groups)}
Or try dropping .groups and iterating over the groupby object directly:
df2 = {n: rows[1] for n, rows in enumerate(data.groupby('Category'))}
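To see why both versions work, it may help to look at what each iteration yields. This is a minimal sketch on a small made-up frame (the values in `data` are illustrative only):

```python
import pandas as pd

data = pd.DataFrame({'Category': ['A', 'A', 'D'],
                     'B': [4, 5, 6],
                     'C': [7, 8, 9]})

grouped = data.groupby('Category')

# .groups maps each category value to the index labels of its rows,
# so iterating over it yields only the keys ('A', 'D').
group_keys = list(grouped.groups)

# Iterating over the groupby object itself yields (key, sub-DataFrame) pairs.
pairs = [(key, len(frame)) for key, frame in grouped]

print(group_keys)
print(pairs)
```

In the first version the loop variable is therefore a category value that has to be compared against the column, while in the second version rows[1] is already the sub-DataFrame.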
Sample:
import pandas as pd

data = pd.DataFrame({'Category':['A','A','D'],
                     'B':[4,5,6],
                     'C':[7,8,9]})
print (data)
B C Category
0 4 7 A
1 5 8 A
2 6 9 D
df2 = {n: rows[1] for n, rows in enumerate(data.groupby('Category'))}
print (df2)
{0: B C Category
0 4 7 A
1 5 8 A, 1: B C Category
2 6 9 D}
df2 = {n: data[data['Category'] == rows]
       for n, rows in enumerate(data.groupby('Category').groups)}
print (df2)
{0: B C Category
0 4 7 A
1 5 8 A, 1: B C Category
2 6 9 D}
A solution without groupby:
df2 = {n: data[data['Category'] == rows] for n, rows in enumerate(data['Category'].unique())}
print (df2)
{0: B C Category
0 4 7 A
1 5 8 A, 1: B C Category
2 6 9 D}
print (df2[0])
B C Category
0 4 7 A
1 5 8 A
But if you need to select from the dict of DataFrames by Category value:
dfs = {n: rows for n, rows in data.groupby('Category')}
print (dfs)
{'A': B C Category
0 4 7 A
1 5 8 A, 'D': B C Category
2 6 9 D}
print (dfs['A'])
B C Category
0 4 7 A
1 5 8 A
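If only the separately named frames from the question are wanted, plain boolean indexing may be enough. The sketch below assumes the question's column name Category; the sample values and the `value` column are made up for illustration. Note that np.where called with a single condition returns a tuple of index arrays, which is why calling .copy() on its result failed.

```python
import pandas as pd

# Hypothetical sample data; only the 'Category' column name comes from the question
df = pd.DataFrame({'Category': ['A', 'B', 'C', 'A'],
                   'value': [1, 2, 3, 4]})

# A boolean mask selects the matching rows; .copy() makes an independent frame
df_A = df[df['Category'] == 'A'].copy()
df_B = df[df['Category'] == 'B'].copy()

# The same lookup through groupby, for a single category
df_C = df.groupby('Category').get_group('C')

print(df_A)
```

The dict built with groupby is usually the more convenient form when the set of categories is not known in advance.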
Upvotes: 1