cool_beans

Reputation: 141

Create multiple new DataFrames based on an existing DataFrame's column in Python

I have a pandas data frame, df, that has 4 columns and a lot of rows.

I want to create 5 different data frames based on the value of one of the columns of the data frame. The column I am referring to is called color.

color has 5 unique values: red, blue, green, yellow, orange.

What I want is for each of the 5 new data frames to contain all rows that have one of the values in color. For instance, df_blue should have all the rows and columns of the original data frame where the value in the color column is blue.

The code I have is the following:

# create 5 new data frames
df_red = []
df_blue= []
df_green= []
df_yellow= []
df_orange= []
for i in range(len(df)):
    if df['color'] == "blue"
       df_blue.append(df)

# i would do if-else statements to satisfy all 5 colors

I feel I am missing some logic...any suggestions or comments?

Thanks!

Upvotes: 0

Views: 871

Answers (2)

cool_beans

Reputation: 141

I ended up doing this for each of the colors.

  blue_data = data[data.color == 'blue']
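
To avoid repeating that line once per color, the same boolean-mask filter could be applied in a loop over the unique values. This is only a minimal sketch: it assumes a DataFrame named `data` with a `color` column, and the sample rows, the `value` column, and the `frames` dictionary name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the real DataFrame.
data = pd.DataFrame({
    "color": ["red", "blue", "green", "yellow", "orange", "blue"],
    "value": [1, 2, 3, 4, 5, 6],
})

# Build one filtered DataFrame per unique color using a boolean mask.
frames = {c: data[data["color"] == c] for c in data["color"].unique()}

# Look up a single color's rows by key instead of a separate variable.
blue_data = frames["blue"]
```

Keeping the results in a dictionary means the code does not need to change if a sixth color shows up in the data.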

Upvotes: -1

DYZ

Reputation: 57033

You can use groupby for this. The following code fragment creates a sample DataFrame and converts it into a dictionary where the colors are keys and the matching DataFrames are values:

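import pandas as pd

# Sample DataFrame with a 'color' column
df = pd.DataFrame({'color': ['red','blue','red','green','blue'],
                   'foo': [1,2,3,4,5]})
# Map each color to the sub-DataFrame containing its rows
colors = {color: dfc for color,dfc in df.groupby('color')}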
#{'blue':   color  foo
#         1  blue    2
#         4  blue    5, 
# 'green':    color  foo
#          3  green    4, 
# 'red':   color  foo
#        0   red    1
#        2   red    3}
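
Once the dictionary exists, each per-color DataFrame can be pulled out by key. The sketch below rebuilds the same sample data so it runs on its own. The names `df_blue`, `df_yellow`, and `df_blue_reset` are illustrative, and the empty-frame fallback for a color that is absent from the data is just one possible choice.

```python
import pandas as pd

df = pd.DataFrame({'color': ['red', 'blue', 'red', 'green', 'blue'],
                   'foo': [1, 2, 3, 4, 5]})
colors = {color: dfc for color, dfc in df.groupby('color')}

# Rows whose color is 'blue'; the original index labels are kept.
df_blue = colors['blue']

# 'yellow' does not occur in this sample, so fall back to an empty
# DataFrame that has the same columns as df.
df_yellow = colors.get('yellow', df.iloc[0:0])

# Optionally renumber the rows from 0 in the extracted frame.
df_blue_reset = df_blue.reset_index(drop=True)
```

If only one color is needed, `df.groupby('color').get_group('blue')` returns the same rows without building the whole dictionary.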

Upvotes: 3
