Chris90

Reputation: 1998

Creating new column based on multiple different values

My current code (below) creates a new column from the values of an existing column. Several of those values represent similar things, such as Car and Van, or Ship, Boat and Submarine, and I want each group classified under a single value in the new column, such as Vehicle or Boat.

Code with Simplified Dataset example:

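def f(row):
    # Map each raw 'Type' value to its bucket, one branch per value.
    if row['Type'] == 'Car':
        val = 'Vehicle'
    elif row['Type'] == 'Van':
        val = 'Vehicle'
    elif row['Type'] == 'Ship':
        val = 'Boat'
    elif row['Type'] == 'Scooter':
        val = 'Bike'
    elif row['Type'] == 'Segway':
        val = 'Bike'
    else:
        val = None  # no match; avoids an UnboundLocalError on return
    return val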

What is the best method, similar to using wildcards, to avoid typing out each value when there are many values (30 or more) that I want to bucket into the same value in the new column?

Thanks

Upvotes: 0

Views: 56

Answers (1)

Henry Yik

Reputation: 22493

One way is to use np.select with isin:

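import numpy as np
import pandas as pd

df = pd.DataFrame({"Type":["Car","Van","Ship","Scooter","Segway"]})

# One boolean condition per bucket; rows matching neither fall back to the default "Boat".
df["new"] = np.select([df["Type"].isin(["Car","Van"]),
                       df["Type"].isin(["Scooter","Segway"])],
                      ["Vehicle","Bike"],"Boat")

print(df)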

      Type      new
0      Car  Vehicle
1      Van  Vehicle
2     Ship     Boat
3  Scooter     Bike
4   Segway     Bike
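
With 30 or more values, another option that may scale better is to keep the groups in a dictionary and use Series.map. This is only a sketch: the groups dictionary, the extra "Plane" row and the "Other" fallback label are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical grouping: each new label lists the raw values it should absorb.
groups = {
    "Vehicle": ["Car", "Van"],
    "Boat": ["Ship", "Submarine"],
    "Bike": ["Scooter", "Segway"],
}

# Invert the groups into a flat lookup: raw value -> new label.
lookup = {value: label for label, values in groups.items() for value in values}

# "Plane" is an illustrative value that belongs to no group.
df = pd.DataFrame({"Type": ["Car", "Van", "Ship", "Scooter", "Segway", "Plane"]})

# map() yields NaN for values missing from the lookup; fillna() supplies a fallback.
df["new"] = df["Type"].map(lookup).fillna("Other")

print(df)
```

Adding a new value then only means appending it to the right list in groups, and unmatched values get an explicit fallback instead of silently landing in one of the real buckets.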

Upvotes: 2
