MI MA

Reputation: 181

Pandas Dataframe Changing values in a column

I have a large DataFrame with the following columns:

The data used as the example here can be found here

import pandas as pd

x = pd.read_csv('example1_csv.')
x.head()

ID  Year    Y
22445   1991    40.0
29925   1991    43.333332
76165   1991    403.0
223725  1991    65.0
280165  1991    690.5312

I want to change the numbers in the column Y to the categories low, mid, high, where each category is specific to a range of numbers in Y:

  1. Low replaces any number within the range of -3000 to 600 in Y.

  2. Mid replaces any number within the range of 601 to 1500 in Y.

  3. High replaces any number within the range of 1501 to 17000 in Y.

For example, if an ID has a Y value between -3000 and 600 then that ID will have the numeric value in Y replaced as Low.

How can I make these replacements? I have tried several approaches but ran into str and int type errors every time. The data file used in this question is in the GitHub link above. Many thanks in advance for the help.

Upvotes: 0

Views: 48

Answers (2)

Prakash

Reputation: 192

This should work too.

x['Y'] = x['Y'].apply(lambda i: 'Low' if -3000 <= i <= 600 else ('Mid' if 600 < i <= 1500 else 'High'))
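For readability, the same mapping can be written as a named function and checked on the rows shown in the question. This is only a sketch: the helper name `categorize` is made up here, the sample frame is copied from the `x.head()` output, and it assumes the upper bounds 600 and 1500 are inclusive and that no value falls below -3000.

```python
import pandas as pd

# Sample rows copied from the x.head() output in the question.
x = pd.DataFrame({
    "ID": [22445, 29925, 76165, 223725, 280165],
    "Year": [1991] * 5,
    "Y": [40.0, 43.333332, 403.0, 65.0, 690.5312],
})


def categorize(value):
    """Hypothetical helper: map a numeric Y value to a category label."""
    if value <= 600:       # assumed inclusive upper bound for Low
        return "Low"
    if value <= 1500:      # assumed inclusive upper bound for Mid
        return "Mid"
    return "High"          # everything above 1500


# Build the labels without overwriting the numeric column.
labels = x["Y"].apply(categorize)
print(labels.tolist())
```

Assigning `labels` back with `x['Y'] = labels` would replace the numbers in place, as the question asks.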

Upvotes: 1

Ayoub ZAROU

Reputation: 2417

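Use numpy.select. The conditions are checked in order and the first match wins:

import numpy as np

# A string default keeps the result a string array for values above 17000.
x['Y'] = np.select([x['Y'].lt(601), x['Y'].lt(1501), x['Y'].le(17000)], ['Low', 'Mid', 'High'], default='Unknown')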
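Another option, not used in the answers here, is `pandas.cut`, which bins a numeric column by explicit edges. A minimal sketch, assuming the bin edges -3000, 600, 1500 and 17000 with inclusive upper bounds. The first five values are taken from the question's sample and the last three are added only to show the boundaries.

```python
import pandas as pd

# First five values come from the question; the last three are
# illustrative boundary values added for this sketch.
y = pd.Series([40.0, 43.333332, 403.0, 65.0, 690.5312, 1500.0, 1501.0, 17000.0])

# Intervals are right-closed by default: [-3000, 600], (600, 1500], (1500, 17000].
# include_lowest=True makes the first interval include -3000 itself.
bins = [-3000, 600, 1500, 17000]
cats = pd.cut(y, bins=bins, labels=["Low", "Mid", "High"], include_lowest=True)

print(cats.astype(str).tolist())
```

Values outside the outermost edges become NaN rather than falling into a default category, which makes out-of-range data easy to spot.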

Upvotes: 1
