Reputation: 47
I would like to add a new column that flags rows in a dataframe. The flag should follow this logic: each ID appears in two rows, and both rows should get the same flag, built from the Leg and Structure values of those two rows.
I think an example makes this clearer. This is the original dataframe:
df = pd.DataFrame(data=[[2, 'fix','receive'], [2, 'float','pay'], [3, 'fix','receive'], [3, 'fix','pay'], [7, 'float','pay'], [7, 'float','receive']], columns=["ID", "Structure","Leg"])
And this should be the result after applying the above logic and creating a new column that flags each row:
df["Flag"] = ["Receive fix, pay float", "Receive fix, pay float", "Receive fix, pay fix","Receive fix, pay fix","Receive float, pay float","Receive float, pay float"]
So my main question is how to loop through the dataframe to find the two rows with the same ID, and then use the information from those two rows to create the same flag for each of them. Thanks a lot for your ideas.
I don't know if this goes in the right direction, but this is my attempt. The remaining problem is how to get the data from the second row that has the same ID.
df["Flag"] = "???"
for index, row in df.iterrows():
    if row["Leg"] == "receive":
        df.at[index, "Flag"] = row["Leg"] + " " + row["Structure"] + ", pay ?"
Upvotes: 1
Views: 412
Reputation: 862511
First sort by both columns with DataFrame.sort_values, then create a helper column, and finally use GroupBy.transform with join to build the new column:
df = df.sort_values(['ID','Leg'], ascending=[True, False])
df['new'] = df["Leg"] + " " + df["Structure"]
df["Flag"] = df.groupby('ID')['new'].transform(', '.join)
print (df)
   ID Structure      Leg                      Flag            new
0   2       fix  receive    receive fix, pay float    receive fix
1   2     float      pay    receive fix, pay float      pay float
2   3       fix  receive      receive fix, pay fix    receive fix
3   3       fix      pay      receive fix, pay fix        pay fix
5   7     float  receive  receive float, pay float  receive float
4   7     float      pay  receive float, pay float      pay float
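Note that the sort changes the row order (index 5 now comes before 4) and that the helper column stays in the frame. If neither is wanted, a possible cleanup step is sketched here. It repeats the steps above on the question's sample data; the final `drop` and `sort_index` calls are my addition, not part of the original solution.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[2, 'fix', 'receive'], [2, 'float', 'pay'],
          [3, 'fix', 'receive'], [3, 'fix', 'pay'],
          [7, 'float', 'pay'], [7, 'float', 'receive']],
    columns=["ID", "Structure", "Leg"],
)

# Same steps as above: sort so "receive" precedes "pay" within each ID
# (descending order on Leg), build the helper column, join it per ID.
df = df.sort_values(['ID', 'Leg'], ascending=[True, False])
df['new'] = df["Leg"] + " " + df["Structure"]
df["Flag"] = df.groupby('ID')['new'].transform(', '.join)

# Optional cleanup (an addition): drop the helper column and
# restore the original row order via the index.
df = df.drop(columns='new').sort_index()
print(df)
```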
Solution with a helper Series:
df = df.sort_values(['ID','Leg'], ascending=[True, False])
s = df["Leg"] + " " + df["Structure"]
df["Flag"] = s.groupby(df['ID']).transform(', '.join)
print (df)
   ID Structure      Leg                      Flag
0   2       fix  receive    receive fix, pay float
1   2     float      pay    receive fix, pay float
2   3       fix  receive      receive fix, pay fix
3   3       fix      pay      receive fix, pay fix
5   7     float  receive  receive float, pay float
4   7     float      pay  receive float, pay float
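Both variants above produce a lowercase "receive ..." and rely on "receive" sorting after "pay". As an alternative sketch (a different technique from the one above, using `DataFrame.pivot` and `Series.map`), the flag can be built per ID without sorting, in the capitalized form shown in the question. It assumes every ID has exactly one "receive" row and one "pay" row; with duplicate legs per ID, `pivot` raises an error. The names `wide` and `flag_by_id` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[2, 'fix', 'receive'], [2, 'float', 'pay'],
          [3, 'fix', 'receive'], [3, 'fix', 'pay'],
          [7, 'float', 'pay'], [7, 'float', 'receive']],
    columns=["ID", "Structure", "Leg"],
)

# One row per ID, one column per leg ("pay", "receive"), holding the Structure.
# Assumption: each ID has exactly one row per leg.
wide = df.pivot(index="ID", columns="Leg", values="Structure")

# Build the flag text per ID in the capitalized form from the question.
flag_by_id = "Receive " + wide["receive"] + ", pay " + wide["pay"]

# Map the per-ID flag back onto every row; the row order of df is unchanged.
df["Flag"] = df["ID"].map(flag_by_id)
print(df)
```

Because the flag is looked up by ID rather than assigned positionally, the original row order and index are kept as they are.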
Upvotes: 1