Reputation: 289
I need help writing a for loop that iterates over the unique values of a dataframe column. For example, I have the following df:
col1 col2 col3
aaa 10 1
bbb 15 2
aaa 12 1
bbb 16 3
ccc 20 3
ccc 50 1
ddd 18 2
I had to apply some manipulation to the dataset for each unique value of col3. To do that, I sliced out the rows of df with col3 == 1:
df1 = df[df['col3']==1]
#added all processing here in df1#
Now I need to do the same slicing for col3==2 ... col3==10, applying the same manipulation as for col3==1. For example, I have to do:
df2 = df[df['col3']==2]
#add the same processing here in df2#
df3 = df[df['col3']==3]
#add the same processing here in df3#
Then I need to append them to a list and combine them at the end. I couldn't figure out how to write a for loop that goes through the unique values of col3 so that I don't have to create ten dfs manually. I tried groupby followed by the manipulation, but it didn't work. I'd appreciate any help on this. Thanks.
Upvotes: 1
Views: 4994
Reputation: 138
A simple solution: iterate over the unique values of the column and use loc to select the rows with each value, like this:
dfs = []
for i in df["col3"].unique():
    # select the rows whose col3 equals the current unique value
    df_i = df.loc[df["col3"] == i, :]
    dfs.append(df_i.copy())
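To cover the "same processing, then combine" part of the question, here is a minimal sketch of the full loop. The `process` function is a hypothetical stand-in for the real manipulation (it only adds the per-slice mean of col2), and the sample frame is rebuilt from the question's table.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "col1": ["aaa", "bbb", "aaa", "bbb", "ccc", "ccc", "ddd"],
    "col2": [10, 15, 12, 16, 20, 50, 18],
    "col3": [1, 2, 1, 3, 3, 1, 2],
})


def process(part):
    # Hypothetical placeholder for the real per-slice manipulation:
    # it adds the mean of col2 within the slice as a new column.
    part = part.copy()
    part["col2_mean"] = part["col2"].mean()
    return part


dfs = []
for value in df["col3"].unique():
    df_i = df.loc[df["col3"] == value, :]
    dfs.append(process(df_i))

# Combine the processed slices into one frame.
result = pd.concat(dfs)
```

The rows in `result` come out grouped by col3 value; calling `result.sort_index()` should restore the original row order if that matters.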
Upvotes: 8
Reputation: 1731
This should do it, but it will be slow for large dataframes.
rows1, rows2, rows3 = [], [], []
for _, v in df.iterrows():
    if v['col3'] == 1:
        # add your code
        rows1.append(v)
    elif v['col3'] == 2:
        # add your code
        rows2.append(v)
    elif v['col3'] == 3:
        # add your code
        rows3.append(v)
# DataFrame.append was removed in pandas 2.0, so build each frame from its collected rows
df1 = pd.DataFrame(rows1, columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(rows2, columns=['col1', 'col2', 'col3'])
df3 = pd.DataFrame(rows3, columns=['col1', 'col2', 'col3'])
You can then use pd.concat() to rebuild a single df.
Output of df1
col1 col2 col3
0 aaa 10 1
2 aaa 12 1
5 ccc 50 1
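If the row-by-row loop turns out to be too slow, one possible alternative is to let groupby do the splitting in a single pass. This is a sketch rather than a drop-in replacement: the `parts` dictionary and the `rebuilt` name are illustrative, and the sample frame is again taken from the question.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "col1": ["aaa", "bbb", "aaa", "bbb", "ccc", "ccc", "ddd"],
    "col2": [10, 15, 12, 16, 20, 50, 18],
    "col3": [1, 2, 1, 3, 3, 1, 2],
})

# groupby splits the frame once instead of visiting every row;
# each group keeps the original index labels of its rows.
parts = {value: group for value, group in df.groupby("col3")}

df1 = parts[1]  # rows where col3 == 1

# Put the pieces back together and restore the original row order.
rebuilt = pd.concat(list(parts.values())).sort_index()
```

Keeping the pieces in a dictionary keyed by the col3 value avoids hard-coding one variable per value, which matters once col3 goes up to 10.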
Upvotes: 0