Rebecca Ijekah

Reputation: 449

Creating new dataframes using groupby

I read "How to create multiple dataframes from pandas groupby object", but I still do not understand how to create a separate dataframe for each person after I create my grouped_persons group with groupby.

What should I change in this code? I think this is part of my problem: 'df_'+ name +'1'

grouped_persons = df.groupby('Person')
for name, group in grouped_persons
    'df_'+ name +'1' = df.loc[(df.Person == name) & (df.ExpNum == 1)]

  File "", line 2
    for name, group in grouped_persons
                                      ^
SyntaxError: invalid syntax

Upvotes: 6

Views: 33097

Answers (3)

jpp

Reputation: 164683

Use a dictionary for a variable number of variables.

One straightforward solution is to use tuple keys representing ('Person', 'ExpNum') combinations. Iterating over a groupby object yields (key, sub-dataframe) pairs, so you can pass the groupby object to tuple and then pass the result to dict.

Data from @KayWittig.

import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

df_dict = dict(tuple(df.groupby(['Person', 'ExpNum'])))

print(df_dict)

{('Claes', 0):   Person  ExpNum  Data
               3  Claes       0     1,
 ('Claes', 1):   Person  ExpNum  Data
               2  Claes       1     3,
 ('Emma', 1):   Person  ExpNum  Data
               4   Emma       1     1
               5   Emma       1     2,
 ('Tim', 0):   Person  ExpNum  Data
               1    Tim       0     2,
 ('Tim', 1):   Person  ExpNum  Data
               0    Tim       1     2}

Upvotes: 2

Kay Wittig

Reputation: 558

Suppose your DataFrame looks like this

import pandas as pd

df = pd.DataFrame([['Tim', 1, 2],
                   ['Tim', 0, 2],
                   ['Claes', 1, 3],
                   ['Claes', 0, 1],
                   ['Emma', 1, 1],
                   ['Emma', 1, 2]], columns=['Person', 'ExpNum', 'Data'])

giving

>>> df
  Person  ExpNum  Data
0    Tim       1     2
1    Tim       0     2
2  Claes       1     3
3  Claes       0     1
4   Emma       1     1
5   Emma       1     2

You can then get each group's dataframe directly from the pandas groupby object

grouped_persons = df.groupby('Person')

by calling get_group:

>>> grouped_persons.get_group('Emma')
  Person  ExpNum  Data
4   Emma       1     1
5   Emma       1     2

and there is no need to store those separately.

Note: Pandas version used was '0.23.1' but this feature might be available in some earlier versions as well.

Edit: If you are interested in those entries with ExpNum == 1 only, I suggest applying this before the groupby, e.g.

grouped_persons_1 = df[df['ExpNum'] == 1].groupby('Person')
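For illustration, here is a minimal sketch of how the filtered groupby object could be used, assuming the sample data from this answer. The variable names names and tim_1 are only placeholders:

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2],
                   ['Tim', 0, 2],
                   ['Claes', 1, 3],
                   ['Claes', 0, 1],
                   ['Emma', 1, 1],
                   ['Emma', 1, 2]], columns=['Person', 'ExpNum', 'Data'])

# Filter first, then group, so every group only contains ExpNum == 1 rows.
grouped_persons_1 = df[df['ExpNum'] == 1].groupby('Person')

# The group names are the keys of the .groups mapping.
names = list(grouped_persons_1.groups)

# Iterating yields (name, sub-dataframe) pairs.
for name, group in grouped_persons_1:
    print(name, len(group))

# Fetch a single person's rows on demand.
tim_1 = grouped_persons_1.get_group('Tim')
```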

Upvotes: 4

Kavitha Madhavaraj

Reputation: 592

You can store it in a dictionary like this. I have corrected some syntax errors in your code as well.

    grouped_persons = df.groupby('Person')
    multi_df = {}
    for name, group in grouped_persons:
        multi_df['df_'+ name +'1'] = df[(df.Person == name) & (df.ExpNum == 1)]

Now you can get a stored dataframe back with multi_df['df_' + name + '1'], for example multi_df['df_myname1'] for a person named myname.
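A possible variant, sketched here with sample data that is only an assumption about what df looks like: since the loop already provides each person's rows as group, the ExpNum filter can be applied to group directly instead of re-filtering the whole df.

```python
import pandas as pd

# Assumed sample data; the real df from the question is not shown.
df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

multi_df = {}
for name, group in df.groupby('Person'):
    # group already holds only this person's rows, so filter it by ExpNum.
    multi_df['df_' + name + '1'] = group[group['ExpNum'] == 1]

# Retrieve one person's ExpNum == 1 rows by the generated key.
emma_frame = multi_df['df_Emma1']
```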

Upvotes: 0
