rafvasq

Reputation: 1510

Split up column based on range of values

I want to find the indexes where a new range of 100 values begins.

In the case below, the first row is 0, so the next range starts at the first value above 100 (index 7). At index 7 the value is 104, so the next range starts at the first value above 204 (index 15). At index 15 the value is 205, so the next range would start at the first value above 305 (none in this data).

Therefore the output would be [0, 7, 15].

0           0
1           0
2           4
3           10
4           30
5           65
6           92
7           104
8           108
9           109
10          123
11          132
12          153
13          160
14          190
15          205
16          207
17          210
18          240
19          254
20          254
21          254
22          263
23          273
24          280
25          293

Upvotes: 1

Views: 304

Answers (2)

Divakar

Reputation: 221624

For sorted data, we can use searchsorted with bin edges at 0, 100, 200, ... to get the first position in each bin:

In [98]: df.head()
Out[98]: 
    A
0   0
1   0
2   4
3  10
4  30

In [143]: df.A.searchsorted(np.arange(0,df.A.iloc[-1],100))
Out[143]: array([ 0,  7, 15])

If you need the labels from the dataframe/series index rather than positions, index df.index with the result (here `_` is IPython's previous output):

In [101]: df.index[_]
Out[101]: Int64Index([0, 7, 15], dtype='int64')
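
The fixed bin edges above reproduce the expected output for this data, but the question describes a running threshold: each new range starts at the first value more than 100 above the value where the previous range began. A minimal sketch of that variant, assuming the column is sorted and named `A` as above. The helper name `range_starts` and its `width` parameter are hypothetical, not part of pandas or NumPy:

```python
import numpy as np
import pandas as pd

def range_starts(values, width=100):
    """Positions where a new range begins, chaining each threshold off the
    value found at the previous start (hypothetical helper, sorted input)."""
    values = np.asarray(values)
    starts = []
    pos = 0
    while pos < len(values):
        starts.append(pos)
        # side="right" gives the first position whose value is strictly
        # greater than values[pos] + width
        pos = int(np.searchsorted(values, values[pos] + width, side="right"))
    return starts

df = pd.DataFrame({"A": [0, 0, 4, 10, 30, 65, 92, 104, 108, 109, 123, 132, 153,
                         160, 190, 205, 207, 210, 240, 254, 254, 254, 263, 273,
                         280, 293]})
result = range_starts(df["A"].to_numpy())
print(result)  # [0, 7, 15]
```

The two approaches can disagree: for the values 0, 50, 150, 210, 260 the chained version returns positions 0, 2, 4, while fixed edges at 0, 100, 200 would give 0, 2, 3.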

Upvotes: 1

YOLO

Reputation: 21739

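You can use zfill to pad the values to three digits and group on the first digit. This assumes every value is below 1000, with column a holding the row position and column b the values: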

# convert number to string
df['grp'] = df['b'].astype(str).str.zfill(3).str[0]
print(df)

     a    b grp
0    0    0   0
1    1    0   0
2    2    4   0
3    3   10   0
4    4   30   0
5    5   65   0
6    6   92   0
7    7  104   1
8    8  108   1
9    9  109   1
10  10  123   1
11  11  132   1
12  12  153   1
13  13  160   1
14  14  190   1
15  15  205   2

# get first row from each group
ix = df.groupby('grp').first()['a'].to_numpy()
print(ix)

[ 0  7 15]
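
The first-digit trick stops working once a value reaches 1000. A sketch of the same grouping with integer floor division instead, assuming the same `a` (row position) and `b` (values) columns as above and non-negative values:

```python
import pandas as pd

b = [0, 0, 4, 10, 30, 65, 92, 104, 108, 109, 123, 132, 153,
     160, 190, 205, 207, 210, 240, 254, 254, 254, 263, 273, 280, 293]
df = pd.DataFrame({"a": range(len(b)), "b": b})

# bucket number: 0 for 0-99, 1 for 100-199, 2 for 200-299, ...
df["grp"] = df["b"] // 100

# row position of the first row in each bucket
ix = df.groupby("grp")["a"].first().to_numpy()
print(ix)  # [ 0  7 15]
```

Like the zfill version, this uses fixed 100-wide buckets, so it matches the question's running-threshold description only when the two happen to coincide, as they do for this data.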

Upvotes: 3
