Peter Chen
Peter Chen

Reputation: 1484

Create columns based on bins

I have a data:

# dt
Column1     
      1
      2
      3
      4
      5
      6
      7
      8
      9

I want to create a new column by bins' average of min and max.

# dt
Column1    Column2
      1          2
      2          2
      3          2
      4          5
      5          5
      6          5
      7          8
      8          8
      9          8

pd.qcut(dt['Column1'], 3)

So column2 = (min of bin + max of bin)/2.

Upvotes: 1

Views: 60

Answers (1)

jezrael
jezrael

Reputation: 863801

Use GroupBy.transform with lambda function for return Series with same size like original DataFrame:

dt['Column2'] = (dt.groupby(pd.qcut(dt['Column1'], 3))['Column1']
                   .transform(lambda x: x.max() + x.min()) / 2)

Or use double transform with add and div:

g = dt.groupby(pd.qcut(dt['Column1'], 3))
dt['Column2'] = g['Column1'].transform('max').add(g['Column1'].transform('min')).div(2)
print (dt)
   Column1  Column2
0        1      2.0
1        2      2.0
2        3      2.0
3        4      5.0
4        5      5.0
5        6      5.0
6        7      8.0
7        8      8.0
8        9      8.0

EDIT:

cols = ['Column1']
for col in cols:
    dt[f'New {col}'] = (dt.groupby(pd.qcut(dt[col], 3))[col]
                       .transform(lambda x: x.max() + x.min()) / 2)
print (dt)
   Column1  New Column1
0        1          2.0
1        2          2.0
2        3          2.0
3        4          5.0
4        5          5.0
5        6          5.0
6        7          8.0
7        8          8.0
8        9          8.0

Upvotes: 1

Related Questions