Reputation: 69
I have a problem with counting the amount of rows between two indices from another Dataframe. Let me explain this in an example:
The index of DF2 is the reference vector and I want to count the amount rows/entries in DF1 that are between the couple indices.
DF1 DF2
data data
index index
3 1 2 1
9 1 11 1
15 0 33 1
21 0 34 1
23 0
30 1
34 0
Now I want to count all rows that lie between a couple of indices in DF 2. The reference vector is the index vector of DF2: [2, 11, 33, 34]
Between index 2 and 11 of DF2 is index: 3 and 9 of DF1 -> result 2
Between index 11 and 33 of DF2 is index: 15, 21, 23, 30 of DF1 -> result 4
Between index 33 and 34 of DF2 is index: 34 of DF1 -> result 1
Therefore the result vector should be: [2, 4, 1]
It is really struggling, so I hope you can help me.
Upvotes: 0
Views: 88
Reputation: 148890
If would first build a dataframe giving the min and max indexes from df2
:
limits = pd.DataFrame({'mn': np.roll(df2.index, 1), 'mx': df2.index}).iloc[1:]
It gives:
mn mx
1 2 11
2 11 33
3 33 34
It is then easy to use a comprehension to get the expected list:
result = [len(df1[(i[0]<=df1.index)&(df1.index<=i[1])]) for i in limits.values]
and obtain as expected:
[2, 4, 1]
Upvotes: 1