Reputation: 1569
Hi, I'm trying to group a column by values that are close to each other. For example:
column1 column2
322 a
326 b
323 c
323 d
323 e
324 f
325 g
498 h
498 i
495 j
496 k
I want to group column1 so that values within +-3 of each other end up in the same group.
Expected result:
column1 , column2
323 (+-3) a,b,c,d,e,f,g
495 (+-3) h,i,j,k
Upvotes: 1
Views: 100
Reputation: 323396
Sort the values with sort_values, then use diff and cumsum to create the group key: a new group starts wherever the gap between consecutive sorted values is greater than 3.
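# Sort by column1 so that diff() compares neighbouring values
df = df.sort_values('column1')
# The key increases by 1 at every gap > 3; groupby aligns it with the rows on the index
(df.sort_index()
   .groupby(df.column1.diff().gt(3).cumsum())
   .agg({'column1': 'first', 'column2': ','.join}))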
         column1        column2
column1
0            322  a,b,c,d,e,f,g
1            498        h,i,j,k
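For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same sort / diff / cumsum idea. The DataFrame is rebuilt from the sample in the question; the names sorted_df, group_key and result, and the "group" label on the key, are my own choices for illustration and not anything pandas requires.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column1": [322, 326, 323, 323, 323, 324, 325, 498, 498, 495, 496],
    "column2": list("abcdefghijk"),
})

# Sort so that neighbouring rows hold neighbouring values.
sorted_df = df.sort_values("column1")

# diff()   -> gap to the previous sorted value (NaN for the first row)
# gt(3)    -> True where that gap exceeds 3, i.e. where a new group starts
# cumsum() -> running count of group starts, used as the group id
# "group" is just an illustrative name for the key.
group_key = sorted_df["column1"].diff().gt(3).cumsum().rename("group")

# groupby aligns group_key with df on the index, so rows are
# aggregated in their original order.
result = df.groupby(group_key).agg({"column1": "first", "column2": ",".join})
print(result)
```

One thing to keep in mind: this groups by the gap between consecutive sorted values, not by distance from a fixed centre. A run such as 1, 4, 7, 10 would end up in a single group because every step is no larger than 3, even though the ends are 9 apart. If that chaining matters for your data, you would need a different rule for where a group starts.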
Upvotes: 1