Tamil Selvan

Reputation: 1749

How to assign a label in pandas.cut() when a value does not fall within any bin boundaries

I have a dataframe with continuous numerical values that I want to convert into ordinal values for use as a categorical feature. When a value does not fall within any of the bin boundaries, pd.cut() returns NaN for it. I want to assign a new label to those values instead.

My dataframe:

          a
0       200
1  10000000
2     60000
3      5000
4         2
5    700000

Here is what I tried:

import pandas as pd

df = pd.DataFrame({'a': [200, 10000000, 60000, 5000, 2, 700000]})
# Bin edges; intervals are right-closed, e.g. (0, 100], (100, 1000], ...
bins = [0, 100, 1000, 10000, 50000, 100000, 1000000]
labels = [1, 2, 3, 4, 5, 6]
binned_out = pd.cut(df['a'], bins=bins, labels=labels)

binned_out output:

0      2
1    NaN
2      5
3      3
4      1
5      6
Name: a, dtype: category
Categories (6, int64): [1 < 2 < 3 < 4 < 5 < 6]

Expected output, with the NaN values returned as 0:

0      2
1      0
2      5
3      3
4      1
5      6

Upvotes: 2

Views: 1584

Answers (1)

jezrael

Reputation: 862691

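Use cat.add_categories to register the new label as a category, then Series.fillna to assign it. The value passed to fillna must already be one of the categories, which is why the category is added first: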

binned_out = pd.cut(df['a'], bins=bins, labels=labels).cat.add_categories([0]).fillna(0)
print(binned_out)
0    2
1    0
2    5
3    3
4    1
5    6
Name: a, dtype: category
Categories (7, int64): [1 < 2 < 3 < 4 < 5 < 6 < 0]
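
Note that add_categories appends 0 at the end, so in the ordering above 0 ranks higher than 6. If the fallback label should sort below the other labels, one possible variation (a sketch, not part of the original answer) is to declare the full category list with 0 first using cat.set_categories before filling. The variable name binned is introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [200, 10000000, 60000, 5000, 2, 700000]})
bins = [0, 100, 1000, 10000, 50000, 100000, 1000000]
labels = [1, 2, 3, 4, 5, 6]

# Values outside (0, 1000000] come back as NaN here.
binned = pd.cut(df['a'], bins=bins, labels=labels)

# Redeclare the categories with 0 first so that 0 < 1 < ... < 6.
# Existing values are kept; 0 is added as a new, initially unused category.
binned_out = binned.cat.set_categories([0] + labels, ordered=True).fillna(0)

print(binned_out)
```

With this ordering, comparisons and sorting treat 0 as the lowest category. If plain integers are needed afterwards, binned_out.astype(int) should work once no NaN values remain.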

Upvotes: 3
