mannaroth

Reputation: 1543

Select element from list in Pandas Series based on another column

I have a Pandas DataFrame of the following form:

  Name         Dates        Trigger
  John       [d1,d2,d3]     1
  Mike       [d4]           NaN
  Li         [d1,d4,d5]     2

Each entry in the Dates column is a Python list whose elements are datetime objects (e.g. '2019-08-15').

My final goal is to obtain, for each row, a list of the differences (in days) between the date at the index position given by Trigger and every date in that row's list, resulting in a new column like:

       Date_diff
   [d2-d1,d2-d2,d2-d3]
   [NaN]/d4
   [d5-d1,d5-d4,d5-d5]

Whatever I've tried, I haven't managed to pick the correct element from the list based on the last column. Any suggestions?

Upvotes: 2

Views: 1253

Answers (2)

mannaroth

Reputation: 1543

After handling the NaNs in Trigger, the following selects the element at the Trigger index from each row's list:

df.apply(lambda row: row.Dates[row.Trigger], axis=1)
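A minimal sketch of how this could be extended to the full Date_diff column. It assumes that "handling the NaNs" means skipping rows with a missing Trigger and casting the rest to int (a column that contains NaN is stored as float, and a float cannot index a list). The sample dates and the helper name `diff_from_trigger` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data: d1..d5 from the question replaced by concrete dates.
d1, d2, d3, d4, d5 = pd.to_datetime(
    ["2019-08-15", "2019-08-17", "2019-08-20", "2019-08-25", "2019-09-01"]
)
df = pd.DataFrame(
    {
        "Name": ["John", "Mike", "Li"],
        "Dates": [[d1, d2, d3], [d4], [d1, d4, d5]],
        "Trigger": [1, float("nan"), 2],
    }
)


def diff_from_trigger(row):
    """Return day differences between the Trigger date and every date in the row."""
    # Assumption: a missing Trigger yields [NaN]; adapt to whatever you need for Mike.
    if pd.isna(row["Trigger"]):
        return [float("nan")]
    # Trigger is a float because of the NaN, so cast it before indexing the list.
    ref = row["Dates"][int(row["Trigger"])]
    return [(ref - d).days for d in row["Dates"]]


df["Date_diff"] = df.apply(diff_from_trigger, axis=1)
print(df[["Name", "Date_diff"]])
```

Each difference is computed as the Trigger date minus the list element, which matches the d2-d1, d2-d2, d2-d3 pattern in the question.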

Upvotes: 2

Lavish Saluja

Reputation: 231

From what I understand, you want to use Trigger as the index of the list element from which the other elements are subtracted. It's still not clear to me what result you expect for the row corresponding to Mike.

  1. Create a list, list1, from the third column of your data frame (Trigger).
  2. Create a list, list2, from the second column of your data frame (Dates).
  3. Create an empty list, list3, which will become your Date_diff column.
  4. Iterate over list1 and list2 together. For each row, let i be the Trigger value and dates the matching entry of list2; loop over dates with an index j, collect dates[i] - dates[j] for every j, and append the resulting list to list3. Handle the rows where i is NaN separately.
  5. Insert list3 into your data frame as a new column named Date_diff.
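The steps above could be sketched roughly as follows. The sample dates are invented, the variable names are chosen for readability rather than taken from the answer, and returning [NaN] for a missing Trigger is only one possible way to handle Mike's row.

```python
import pandas as pd

# Hypothetical sample data: d1..d5 from the question replaced by concrete dates.
d1, d2, d3, d4, d5 = pd.to_datetime(
    ["2019-08-15", "2019-08-17", "2019-08-20", "2019-08-25", "2019-09-01"]
)
df = pd.DataFrame(
    {
        "Name": ["John", "Mike", "Li"],
        "Dates": [[d1, d2, d3], [d4], [d1, d4, d5]],
        "Trigger": [1, float("nan"), 2],
    }
)

triggers = df["Trigger"].tolist()  # step 1 (list1)
dates = df["Dates"].tolist()       # step 2 (list2)
date_diff = []                     # step 3 (list3)

# Step 4: walk both lists in parallel, one row at a time.
for trigger, row_dates in zip(triggers, dates):
    if pd.isna(trigger):
        # Assumption: rows without a Trigger get a single NaN.
        date_diff.append([float("nan")])
        continue
    ref = row_dates[int(trigger)]  # Trigger is stored as float, so cast to int
    date_diff.append([(ref - d).days for d in row_dates])

# Step 5: attach the result as a new column.
df["Date_diff"] = date_diff
print(df[["Name", "Date_diff"]])
```

An explicit loop like this is easy to step through while debugging; for larger frames a row-wise `apply` or a list comprehension over `zip` does the same work with less code.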

Hope it helps :)

Upvotes: 0
