Reputation: 1949
I need to compute the delta column shown below. The tricky part is the conditions listed underneath the table. How can I do this in pandas?
speaker | video | frame | time | delta(expected)
--------|-------|-------|------|----------------
one     | 1     | 0     | 10   | 0
one     | 1     | 1     | 15   | 5
one     | 2     | 0     | 12   | 0
one     | 2     | 1     | 16   | 4
two     | 2     | 0     | 19   | 0
two     | 2     | 1     | 22   | 3
two     | 2     | 2     | 16   | -6
CONDITIONS: Delta is the difference in time between consecutive frames of the same speaker in the same video. In other words, delta should not be computed across rows that belong to different speakers or different videos. In those cases (the first row of each speaker/video pair) the value should be zero, as shown in the delta(expected) column.
Upvotes: 2
Views: 2127
Reputation: 294318
Option 1
This assumes df is sorted by ['speaker', 'video'], so that the rows of each pair are contiguous. If it is not, sort it first.
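import numpy as np

# Rows that repeat an earlier (speaker, video) pair get the time difference
# from the previous row; the first row of each pair gets 0.
delta = np.where(
    df.duplicated(['speaker', 'video']).values,
    np.append(0, np.diff(df.time.values)), 0
)
df.assign(delta=delta)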
  speaker  video  frame  time  delta(expected)  delta
0     one      1      0    10                0      0
1     one      1      1    15                5      5
2     one      2      0    12                0      0
3     one      2      1    16                4      4
4     two      2      0    19                0      0
5     two      2      1    22                3      3
6     two      2      2    16               -6     -6
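For reference, here is a minimal sketch of Option 1 with the sort step included, assuming the rows arrive in arbitrary order. The sample frame is the question's data reshuffled, and the names df_sorted and result are only illustrative.

```python
import numpy as np
import pandas as pd

# The question's rows, deliberately out of order (assumed input for this sketch).
df = pd.DataFrame({
    "speaker": ["two", "one", "one", "two", "one", "one", "two"],
    "video":   [2, 1, 2, 2, 1, 2, 2],
    "frame":   [0, 0, 0, 1, 1, 1, 2],
    "time":    [19, 10, 12, 22, 15, 16, 16],
})

# Sort so each (speaker, video) pair is contiguous and its frames are in order.
df_sorted = df.sort_values(["speaker", "video", "frame"]).reset_index(drop=True)

# True for every row except the first one of each (speaker, video) pair.
is_repeat = df_sorted.duplicated(["speaker", "video"]).to_numpy()

# Row-to-row time differences, with a leading 0 so the lengths match.
raw_diff = np.append(0, np.diff(df_sorted["time"].to_numpy()))

# Keep the difference only inside a pair; the first row of a pair gets 0.
result = df_sorted.assign(delta=np.where(is_repeat, raw_diff, 0))
print(result)
```

Including frame in the sort keys is a choice made here so that the differences follow frame order; the original answer only mentions sorting by speaker and video.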
Option 2
# Within each (speaker, video) group, prepend 0 to the consecutive time differences.
df.assign(
    delta=df.groupby(['speaker', 'video']).time.transform(
        lambda x: np.append(0, np.diff(x.values))
    )
)
  speaker  video  frame  time  delta(expected)  delta
0     one      1      0    10                0      0
1     one      1      1    15                5      5
2     one      2      0    12                0      0
3     one      2      1    16                4      4
4     two      2      0    19                0      0
5     two      2      1    22                3      3
6     two      2      2    16               -6     -6
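Unlike Option 1, the groupby version should not need the groups to be contiguous, because transform returns values aligned to the original index. A small sketch, assuming the question's rows are interleaved but still in frame order within each pair (the name out is only illustrative):

```python
import numpy as np
import pandas as pd

# The question's rows with the (speaker, video) pairs interleaved (assumed input).
df = pd.DataFrame({
    "speaker": ["two", "one", "one", "two", "one", "one", "two"],
    "video":   [2, 1, 2, 2, 1, 2, 2],
    "frame":   [0, 0, 0, 1, 1, 1, 2],
    "time":    [19, 10, 12, 22, 15, 16, 16],
})

# Per group: 0 for the first row, then the consecutive time differences.
out = df.assign(
    delta=df.groupby(["speaker", "video"])["time"].transform(
        lambda x: np.append(0, np.diff(x.to_numpy()))
    )
)
print(out)
```

The differences still follow the order in which rows appear inside each group, so if frames can be out of order, sort by frame first.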
Upvotes: 3
Reputation: 153460
Let's use groupby, diff, and fillna:
df['delta'] = df.groupby(['speaker','video'])['time'].diff().fillna(0)
Output:
  speaker  video  frame  time  delta(expected)  delta
0     one      1      0    10                0    0.0
1     one      1      1    15                5    5.0
2     one      2      0    12                0    0.0
3     one      2      1    16                4    4.0
4     two      2      0    19                0    0.0
5     two      2      1    22                3    3.0
6     two      2      2    16               -6   -6.0
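The delta column comes out as float because diff produces NaN for the first row of each group before fillna replaces it. If integer output is wanted, one possible tweak (an addition here, not part of the answer above) is to cast after filling. The frame below is rebuilt from the question's table so the sketch is self-contained:

```python
import pandas as pd

# The question's data, rebuilt here so the example runs on its own.
df = pd.DataFrame({
    "speaker": ["one", "one", "one", "one", "two", "two", "two"],
    "video":   [1, 1, 2, 2, 2, 2, 2],
    "frame":   [0, 1, 0, 1, 0, 1, 2],
    "time":    [10, 15, 12, 16, 19, 22, 16],
})

# diff() yields NaN on the first row of each (speaker, video) group;
# fillna(0) replaces those NaNs, and astype(int) restores an integer dtype.
df["delta"] = (
    df.groupby(["speaker", "video"])["time"]
      .diff()
      .fillna(0)
      .astype(int)
)
print(df)
```

The cast is only safe when time itself holds whole numbers; with fractional times it would truncate the differences.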
Upvotes: 3