ambigus9

Reputation: 1569

Group column using variance range in pandas

Hi, I'm trying to group the rows of a column by values that are close to each other. For example:

column1 column2
322      a
326      b
323      c
323      d
323      e
324      f
325      g
498      h
498      i
495      j
496      k

I want to group column1 so that values within +-3 of each other end up in the same group.

Result:

column1     column2
323 (+-3)   a,b,c,d,e,f,g
495 (+-3)   h,i,j,k

Upvotes: 1

Views: 100

Answers (1)

BENY

Reputation: 323396

Sort the values with sort_values, then use diff and cumsum to build the group key: a new group starts whenever the gap to the previous sorted value is greater than 3.

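# Sort so that close values become neighbours.
df = df.sort_values('column1')
# A gap greater than 3 between neighbours starts a new group; cumsum numbers the groups.
key = df['column1'].diff().gt(3).cumsum()
# Restore the original row order, then aggregate each group (the key aligns on the index).
df.sort_index().groupby(key).agg({'column1': 'first', 'column2': ','.join})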
         column1        column2
column1                        
0            322  a,b,c,d,e,f,g
1            498        h,i,j,k
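
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same sort/diff/cumsum idea. The helper name group_by_gap, its tolerance parameter, and the low/high output columns are my own illustrative choices, not part of the original answer; reporting the range of each group instead of its first value is also just one possible choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column1": [322, 326, 323, 323, 323, 324, 325, 498, 498, 495, 496],
    "column2": list("abcdefghijk"),
})


def group_by_gap(frame, tolerance=3):
    """Group rows whose sorted column1 values are at most `tolerance` apart.

    Hypothetical helper for illustration only.
    """
    # Sort the values so that close values sit next to each other.
    sorted_values = frame["column1"].sort_values()
    # True wherever the gap to the previous value exceeds the tolerance;
    # the cumulative sum turns those break points into group numbers.
    group_id = sorted_values.diff().gt(tolerance).cumsum()
    # assign() aligns group_id on the index, so each row keeps its own group
    # and the original row order is preserved.
    labelled = frame.assign(group=group_id)
    return labelled.groupby("group").agg(
        low=("column1", "min"),
        high=("column1", "max"),
        column2=("column2", ",".join),
    )


result = group_by_gap(df)
print(result)
```

One thing to keep in mind: this groups by the gap between neighbouring sorted values, so groups can chain. A run such as 1, 4, 7, 10 would land in a single group even though its ends are more than 3 apart. If you need every member to be within +-3 of a fixed centre, a different approach is required.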

Upvotes: 1
