Reputation: 387
I have two data frame df1 and df2
df1 has following data (N Rows)
Time(s) sv-01 sv-02 sv-03 Val1 val2 val3
1339.4 1 4 12 1.6 0.6 1.3
1340.4 1 12 4 -0.5 0.5 1.4
1341.4 1 6 8 0.4 5 1.6
1342.4 2 5 14 1.2 3.9 11
...... ..... .... ... ..
df2 has following data which has more rows than df1
Time(msec) channel svid value-1 value-2 valu-03
1000 1 2 0 5 1
1000 2 5 1 4 2
1000 3 2 3 4 7
..... .....................................
1339400 1 1 1.6 0.4 5.3
1339400 2 12 0.5 1.8 -4.4
1339400 3 4 -0.20 1.6 -7.9
1340400 1 1 0.3 0.3 1.5
1340400 2 6 2.3 -4.3 1.0
1340400 3 4 2.0 1.1 -0.45
1341400 1 1 2 2.1 0
1341400 2 8 3.4 -0.3 1
1341400 3 6 0 4.1 2.3
.... .... .. ... ... ...
What I am trying to achieve is
1.first multiplying Time(s) column by 1000 so that it matches with df2 millisecond column.
2.In df1 sv 01,02 and 03 are in independent column but those sv are present in same column under svid.
So goal is when time of df1(after changing) is matching with time of df2 copy next three consecutive lines i.e copy all matched lines of that time instant.
Basically I want to iterate the time of df1 in df2 time column and if there is a match copy three next rows and copy to a new df.
I have seen examples using pandas merge function but in my case both have different header.
Thanks.
Upvotes: 0
Views: 126
Reputation: 863301
I think you need double boolean indexing
- first df2
with isin
, for multiple is used mul
:
And then count values per groups by cumcount
and filter first 3:
df = df2[df2['Time(msec)'].isin(df1['Time(s)'].mul(1000))]
df = df[df.groupby('Time(msec)').cumcount() < 3]
print (df)
Time(msec) channel svid value-1 value-2 valu-03
3 1339400 1 1 1.6 0.4 5.30
4 1339400 2 12 0.5 1.8 -4.40
5 1339400 3 4 -0.2 1.6 -7.90
6 1340400 1 1 0.3 0.3 1.50
7 1340400 2 6 2.3 -4.3 1.00
8 1340400 3 4 2.0 1.1 -0.45
9 1341400 1 1 2.0 2.1 0.00
10 1341400 2 8 3.4 -0.3 1.00
11 1341400 3 6 0.0 4.1 2.30
Detail:
print (df.groupby('Time(msec)').cumcount())
3 0
4 1
5 2
6 0
7 1
8 2
9 0
10 1
11 2
dtype: int64
Upvotes: 1