Take chunks of data from pandas df based on repeated values ranges in a column

Question

I want to select data from the pandas dataframe based on the repeated range of values in a particular column. For example, I've df below.

range_col col1 col2
  1        a   b
  2        c   d
  3        e   f
  1        m   n
  2        o   p
  3        r   s
  1        e   f
  2        j   k
  3        l   m

In above df, I would like to iterate thru chunk of data by selecting ranges 1 2 3 from range_col. So first chunk will be

range_col col1 col2
  1        a   b
  2        c   d
  3        e   f

Second will be

range_col col1 col2
  1        m   n
  2        o   p
  3        r   s

etc...

How can I iterate on df like this?

Dani Mesejo · Accepted Answer

You could do:

for _, group in df.groupby(df['range_col'] - df.index, sort=False):
    print(group)
    print()

Output

    range_col col1 col2
0          1    a    b
1          2    c    d
2          3    e    f

   range_col col1 col2
3          1    m    n
4          2    o    p
5          3    r    s

   range_col col1 col2
6          1    e    f
7          2    j    k
8          3    l    m

This is based on an old recipe from itertools, see here.

Take chunks of data from pandas df based on repeated values ranges in a column

Answers (2)

Related Questions