Vijayendra

Reputation: 1949

How to find delta in pandas data-frame rows with specific conditions

I need to compute a delta column (shown below), but the tricky part is the conditions listed after the table. How can I do this in pandas?

speaker | video | frame | time |delta(expected)
--------|-------|-------|------|----------------
one     |1      | 0     |10    |0
one     |1      | 1     |15    |5
one     |2      | 0     |12    |0
one     |2      | 1     |16    |4
two     |2      | 0     |19    |0
two     |2      | 1     |22    |3
two     |2      | 2     |16    |-6

CONDITIONS: Delta is the difference in time between consecutive frames of the same speaker in the same video. In other words, delta should not be computed across rows for different speakers or different videos. For the first row of each speaker/video group the value should be zero, as shown in the delta(expected) column.
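To make the conditions concrete, here is a minimal sketch that rebuilds the sample table and spells the rule out as a plain loop. The helper name `reference_delta` is hypothetical (it is not a pandas function), and the loop assumes that rows of the same speaker/video group are adjacent, as they are in the table above.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "speaker": ["one", "one", "one", "one", "two", "two", "two"],
    "video": [1, 1, 2, 2, 2, 2, 2],
    "frame": [0, 1, 0, 1, 0, 1, 2],
    "time": [10, 15, 12, 16, 19, 22, 16],
})
expected = [0, 5, 0, 4, 0, 3, -6]


def reference_delta(frame):
    """Hypothetical helper: delta is 0 whenever the (speaker, video) pair changes."""
    deltas = []
    prev_key, prev_time = None, None
    for speaker, video, time in zip(frame["speaker"], frame["video"], frame["time"]):
        key = (speaker, video)
        # Only subtract when the previous row belongs to the same group.
        deltas.append(int(time - prev_time) if key == prev_key else 0)
        prev_key, prev_time = key, time
    return deltas


computed = reference_delta(df)
```

This is only a readable baseline for checking a vectorized solution against; it is not meant to be fast.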

Upvotes: 2

Views: 2127

Answers (2)

piRSquared

Reputation: 294318

Option 1
This assumes df is sorted by ['speaker', 'video'], so that the rows of each group are adjacent. If it is not, sort it first.

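import numpy as np

delta = np.where(
    # True for every row that is not the first of its (speaker, video) group
    df.duplicated(['speaker', 'video']).values,
    # row-to-row difference in time, padded with a leading 0
    np.append(0, np.diff(df.time.values)), 0
)

df.assign(delta=delta)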

  speaker  video  frame  time  delta(expected)  delta
0     one      1      0    10                0      0
1     one      1      1    15                5      5
2     one      2      0    12                0      0
3     one      2      1    16                4      4
4     two      2      0    19                0      0
5     two      2      1    22                3      3
6     two      2      2    16               -6     -6
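If the frame does not arrive sorted, Option 1 needs a sort first. The following is a rough sketch under some assumptions: the shuffled sample rows are made up for illustration from the question's data, and sorting by `frame` as well as `speaker` and `video` is my assumption about the intended order within a group (the answer itself only mentions `speaker` and `video`).

```python
import numpy as np
import pandas as pd

# The question's rows in a deliberately shuffled order (illustrative only).
df = pd.DataFrame({
    "speaker": ["two", "one", "one", "two", "one", "two", "one"],
    "video": [2, 1, 2, 2, 1, 2, 2],
    "frame": [1, 0, 1, 0, 1, 2, 0],
    "time": [22, 10, 16, 19, 15, 16, 12],
})

# Make each (speaker, video) group contiguous and ordered by frame.
df_sorted = df.sort_values(["speaker", "video", "frame"]).reset_index(drop=True)

delta = np.where(
    # False on the first row of each group, True on the rest
    df_sorted.duplicated(["speaker", "video"]).to_numpy(),
    # plain row-to-row difference, padded with a leading 0
    np.append(0, np.diff(df_sorted["time"].to_numpy())),
    0,
)

result = df_sorted.assign(delta=delta)
```

Note that `reset_index(drop=True)` discards the original row labels; keep them if the result has to be aligned back to the unsorted frame.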

Option 2

# Per (speaker, video) group: difference in time, with 0 for the group's first row
df.assign(
    delta=df.groupby(['speaker', 'video']).time.transform(
        lambda x: np.append(0, np.diff(x.values))
    )
)

  speaker  video  frame  time  delta(expected)  delta
0     one      1      0    10                0      0
1     one      1      1    15                5      5
2     one      2      0    12                0      0
3     one      2      1    16                4      4
4     two      2      0    19                0      0
5     two      2      1    22                3      3
6     two      2      2    16               -6     -6

Upvotes: 3

Scott Boston

Reputation: 153460

Let's use groupby, diff, and fillna:

df['delta'] = df.groupby(['speaker','video'])['time'].diff().fillna(0)

Output:

    speaker  video  frame  time  delta(expected)  delta
0  one           1      0    10                0    0.0
1  one           1      1    15                5    5.0
2  one           2      0    12                0    0.0
3  one           2      1    16                4    4.0
4  two           2      0    19                0    0.0
5  two           2      1    22                3    3.0
6  two           2      2    16               -6   -6.0
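The delta column above comes out as float because `diff` produces NaN for the first row of each group before `fillna(0)` replaces it. If integer output matching delta(expected) is wanted, one possible tweak is to cast afterwards. This is a sketch, assuming `time` holds whole numbers so the cast loses nothing:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "speaker": ["one", "one", "one", "one", "two", "two", "two"],
    "video": [1, 1, 2, 2, 2, 2, 2],
    "frame": [0, 1, 0, 1, 0, 1, 2],
    "time": [10, 15, 12, 16, 19, 22, 16],
})

# Same groupby/diff/fillna chain, followed by a cast back to integers.
df["delta"] = (
    df.groupby(["speaker", "video"])["time"]
    .diff()
    .fillna(0)
    .astype(int)
)
```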

Upvotes: 3
