Baum_jammer
Baum_jammer

Reputation: 69

Counting rows between a set of indices in Dataframes

I have a problem with counting the amount of rows between two indices from another Dataframe. Let me explain this in an example:

The index of DF2 is the reference vector and I want to count the amount rows/entries in DF1 that are between the couple indices.

DF1                     DF2
          data                  data    
index                   index                  
   3      1               2       1    
   9      1              11       1    
  15      0              33       1      
  21      0              34       1
  23      0
  30      1
  34      0

Now I want to count all rows that lie between a couple of indices in DF 2. The reference vector is the index vector of DF2: [2, 11, 33, 34]

Between index  2 and 11 of DF2 is index: 3 and 9 of DF1        -> result 2

Between index 11 and 33 of DF2 is index: 15, 21, 23, 30 of DF1 -> result 4

Between index 33 and 34 of DF2 is index: 34 of DF1             -> result 1

Therefore the result vector should be: [2, 4, 1]

It is really struggling, so I hope you can help me.

Upvotes: 0

Views: 88

Answers (1)

Serge Ballesta
Serge Ballesta

Reputation: 148890

If would first build a dataframe giving the min and max indexes from df2:

limits = pd.DataFrame({'mn': np.roll(df2.index, 1), 'mx': df2.index}).iloc[1:]

It gives:

   mn  mx
1   2  11
2  11  33
3  33  34

It is then easy to use a comprehension to get the expected list:

result = [len(df1[(i[0]<=df1.index)&(df1.index<=i[1])]) for i in limits.values]

and obtain as expected:

[2, 4, 1]

Upvotes: 1

Related Questions