Reputation: 181
I have a large DataFrame with the following columns:
The data used as the example here can be found here
import pandas as pd
x = pd.read_csv('example1_csv.')
x.head()
ID Year Y
22445 1991 40.0
29925 1991 43.333332
76165 1991 403.0
223725 1991 65.0
280165 1991 690.5312
I want to change the numbers in the column Y to the categories low, mid, high, where each category is specific to a range of numbers in Y:
Low replaces any number within the range of -3000 to 600 in Y.
Mid replaces any number within the range of 601 to 1500 in Y.
High replaces any number within the range of 1501 to 17000 in Y.
For example, if an ID has a Y value between -3000 and 600, then that ID will have the numeric value in Y replaced with Low.
How does one make these replacements? I have tried several ways but have run into str and int type errors every time. The data file used in this question is in the GitHub link above. Many thanks in advance for the help.
Upvotes: 0
Views: 48
Reputation: 192
This should work too.
x['Y'] = x['Y'].apply(lambda i: 'Low' if i <= 600 else ('Mid' if i <= 1500 else 'High'))
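For anyone who wants to try the apply-based idea without the CSV, here is a minimal sketch built on the five rows shown in the question's x.head() output. The helper name label and the new column name Category are my own choices, not from the question, and I am assuming the boundaries are inclusive at the top of each range (so 600 is Low, and a fractional value such as 600.5 is Mid). Writing to a separate column keeps the original numbers around for checking.

```python
import pandas as pd

# Sample rows copied from the question's x.head() output
x = pd.DataFrame({
    'ID': [22445, 29925, 76165, 223725, 280165],
    'Year': [1991] * 5,
    'Y': [40.0, 43.333332, 403.0, 65.0, 690.5312],
})

def label(value):
    """Hypothetical helper: map a number in Y to Low / Mid / High."""
    if value <= 600:       # assumed: Low covers everything up to and including 600
        return 'Low'
    if value <= 1500:      # assumed: Mid covers (600, 1500]
        return 'Mid'
    return 'High'          # everything above 1500

# 'Category' is an illustrative column name; assign to x['Y'] to overwrite instead
x['Category'] = x['Y'].apply(label)
print(x)
```

If the labels should overwrite the numbers, as the question asks, assign the result back to x['Y'] instead of a new column.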
Upvotes: 1
Reputation: 2417
Use numpy.select. Give it a string default, since the implicit default of 0 mixes an integer into the string labels:
import numpy as np
x.Y = np.select([x.Y.lt(601), x.Y.lt(1501)], ['Low', 'Mid'], default='High')
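Another option, not taken from the answers in this thread, is pandas.cut, which bins a numeric column by a list of edges. The sketch below is only an illustration under a few assumptions: the bin edges are read off the ranges in the question, each bin is closed on the right (so 600 is Low and 1500 is Mid), and the last two Y values are made up purely to show the upper boundaries. The column name Category is hypothetical.

```python
import pandas as pd

# First five values come from the question; 1500.0 and 1501.0 are
# made-up values added only to show how the bin edges behave.
x = pd.DataFrame({'Y': [40.0, 43.333332, 403.0, 65.0, 690.5312, 1500.0, 1501.0]})

bins = [-3000, 600, 1500, 17000]   # edges taken from the ranges in the question
labels = ['Low', 'Mid', 'High']    # one label per interval between edges

# Intervals are right-closed by default: (-3000, 600], (600, 1500], (1500, 17000].
# include_lowest=True also admits the left edge -3000 itself.
x['Category'] = pd.cut(x['Y'], bins=bins, labels=labels, include_lowest=True)
print(x)
```

Values outside -3000 to 17000 come back as NaN rather than raising, and the result is a categorical column, so call .astype(str) on it if plain strings are needed.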
Upvotes: 1