its.kcl

Reputation: 103

How can I subset a dataframe and put the subsets in a list?

I'm looking for a more automated way to subset this dataframe by rank and put the subsets in a list. If there happen to be 150 ranks, I can't subset each one individually.

ID    |  GROUP   |  RANK
1     |    A     |    1
2     |    B     |    2
3     |    C     |    3
2     |    A     |    1
2     |    E     |    2
2     |    G     |    3

How can I subset the dataframe by RANK and then put every subset in a list (not using group by)? I know how to subset them individually, but I'm not sure how to do this when there are more ranks.

Output:

ranks = [df1,df2,df3....and so on]

Upvotes: 0

Views: 153

Answers (1)

rafaelc

Reputation: 59274

Just use groupby directly in a list comprehension:

>>> [sub_df for rank, sub_df in df.groupby('RANK')]

This produces a list of dataframes, one per distinct rank, each holding the rows with that rank.

You can also do a dict comprehension:

>>> dic = {rank: sub_df for rank, sub_df in df.groupby('RANK')}

so that dic[1] gives you the sub-dataframe for rank == 1.
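
As a concrete check, here is a minimal sketch that rebuilds the sample table from the question and applies both comprehensions. The names `ranks`, `by_rank`, and `sub_df` are chosen for illustration only.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 2, 2, 2],
    "GROUP": ["A", "B", "C", "A", "E", "G"],
    "RANK": [1, 2, 3, 1, 2, 3],
})

# One sub-dataframe per distinct rank; groupby sorts the keys by default,
# so the list is in ascending rank order.
ranks = [sub_df for _, sub_df in df.groupby("RANK")]

# The same subsets, keyed by the rank value.
by_rank = {rank: sub_df for rank, sub_df in df.groupby("RANK")}

print(len(ranks))                    # 3
print(by_rank[1]["GROUP"].tolist())  # ['A', 'A']
```

This scales to any number of ranks, since the number of subsets is determined by the distinct values in the RANK column.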


In more detail, pd.DataFrame.groupby is a method that returns a DataFrameGroupBy object. A DataFrameGroupBy object is iterable, which means you can loop over it with a for loop. Iterating yields tuples with two values: the first is whatever you used to group (in this case, an integer rank), and the second is the sub-dataframe for that key.
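
The iteration described here can be written out as a plain for loop. The sketch below, again assuming the sample data from the question, also compares each sub-dataframe with the equivalent boolean-mask subset; `seen` and `mask_df` are illustrative names.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 2, 2, 2],
    "GROUP": ["A", "B", "C", "A", "E", "G"],
    "RANK": [1, 2, 3, 1, 2, 3],
})

seen = []
for rank, sub_df in df.groupby("RANK"):
    # rank is the group key; sub_df holds the rows with that rank,
    # with their original index labels preserved.
    mask_df = df[df["RANK"] == rank]  # manual subset for the same rank
    assert sub_df.equals(mask_df)
    seen.append((int(rank), len(sub_df)))

print(seen)  # [(1, 2), (2, 2), (3, 2)]
```

The mask comparison shows that each element produced by groupby is the same subset you would get by filtering on one rank at a time.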

Upvotes: 1
