neptunhiker

Reputation: 47

Flagging Pandas row based on same and other row

I would like to add a new column that flags rows in a dataframe. The flag should follow this logic:

  1. Rows with the same ID belong together and should get the same flag
  2. The flag is built from four cells (the Leg and Structure values of both rows): "receive" followed by "float" or "fix", then ", pay" followed by "float" or "fix"

An example might make this clearer. This is the original data frame:

import pandas as pd
df = pd.DataFrame(data=[[2, 'fix','receive'], [2, 'float','pay'], [3, 'fix','receive'], [3, 'fix','pay'], [7, 'float','pay'], [7, 'float','receive']], columns=["ID", "Structure","Leg"])

And this should be the result after applying the above logic and creating a new column that flags each row:

df["Flag"] = ["Receive fix, pay float", "Receive fix, pay float", "Receive fix, pay fix","Receive fix, pay fix","Receive float, pay float","Receive float, pay float"]

So my main question is how I can loop through the dataframe to find the two rows with the same ID and then use information from both rows to create the same flag for each of them. Thanks a lot for your ideas.

I don't know if this goes in the right direction, but here is my attempt. The remaining problem is how to get the data from the second row with the same ID.

df["Flag"] = "???"
for index, row in df.iterrows():
    if row["Leg"] == "receive":
        df.at[index, "Flag"] = row["Leg"] + " " + row["Structure"] + ", pay ?"

Upvotes: 1

Views: 412

Answers (1)

jezrael

Reputation: 862511

First sort by both columns with DataFrame.sort_values (descending on Leg, so that "receive" comes before "pay" within each ID), then create a helper column, and finally use GroupBy.transform with join to build the new column:

df = df.sort_values(['ID','Leg'], ascending=[True, False])
df['new'] = df["Leg"] + " " + df["Structure"]
df["Flag"] = df.groupby('ID')['new'].transform(', '.join)
print(df)
   ID Structure      Leg                      Flag            new
0   2       fix  receive    receive fix, pay float    receive fix
1   2     float      pay    receive fix, pay float      pay float
2   3       fix  receive      receive fix, pay fix    receive fix
3   3       fix      pay      receive fix, pay fix        pay fix
5   7     float  receive  receive float, pay float  receive float
4   7     float      pay  receive float, pay float      pay float

Solution with a helper Series instead of a helper column:

df = df.sort_values(['ID','Leg'], ascending=[True, False])
s = df["Leg"] + " " + df["Structure"]
df["Flag"] = s.groupby(df['ID']).transform(', '.join)
print(df)
   ID Structure      Leg                      Flag
0   2       fix  receive    receive fix, pay float
1   2     float      pay    receive fix, pay float
2   3       fix  receive      receive fix, pay fix
3   3       fix      pay      receive fix, pay fix
5   7     float  receive  receive float, pay float
4   7     float      pay  receive float, pay float
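
The flags above are lowercase and the frame ends up re-sorted, whereas the question asks for flags such as "Receive fix, pay float" on the rows in their original order. A minimal sketch of one way to get there, assuming each ID has exactly one receive row and one pay row: sort a copy (the name ordered is only illustrative), join the strings per ID, capitalize the result with Series.str.capitalize, and assign it back to the unsorted frame so that pandas aligns it by index.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[2, 'fix', 'receive'], [2, 'float', 'pay'],
          [3, 'fix', 'receive'], [3, 'fix', 'pay'],
          [7, 'float', 'pay'], [7, 'float', 'receive']],
    columns=["ID", "Structure", "Leg"],
)

# Sort a copy so that "receive" precedes "pay" within each ID
# (descending on Leg); df itself keeps its original row order.
# "ordered" is an illustrative name, not part of the original answer.
ordered = df.sort_values(['ID', 'Leg'], ascending=[True, False])

# Helper Series such as "receive fix" or "pay float", one per row.
s = ordered["Leg"] + " " + ordered["Structure"]

# Join the two strings of each ID and broadcast the result to both rows.
# str.capitalize upper-cases the first character and lower-cases the rest,
# which is fine here because the inputs are assumed to be lowercase.
flag = s.groupby(ordered['ID']).transform(', '.join).str.capitalize()

# Assignment aligns on the index, so rows of df stay where they were.
df["Flag"] = flag
print(df)
```

Because the assignment aligns on the index, rows 4 and 5 (ID 7, where "pay" comes first in the input) still receive "Receive float, pay float" without being reordered.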

Upvotes: 1
