Reputation: 1998
I have the code below, which creates a new column from the values of an existing column. Several different values represent similar things (for example Car and Van, or Ship, Boat and Submarine), and I want each group classified under a single value in the new column, such as Vehicle or Boat.
Code with Simplified Dataset example:
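def f(row):
    # Map each raw value in the 'Type' column to its broader category
    if row['Type'] == 'Car':
        val = 'Vehicle'
    elif row['Type'] == 'Van':
        val = 'Vehicle'
    elif row['Type'] == 'Ship':
        val = 'Boat'
    elif row['Type'] == 'Scooter':
        val = 'Bike'
    elif row['Type'] == 'Segway':
        val = 'Bike'
    else:
        val = None  # no matching category
    return val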
What is the best method, similar to using wildcards, to avoid typing out each value when there are many values (30 or more) that I want to bucket into the same value in the new column?
Thanks
Upvotes: 0
Views: 56
Reputation: 22493
One way is to use np.select with isin:
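import numpy as np
import pandas as pd

df = pd.DataFrame({"Type": ["Car", "Van", "Ship", "Scooter", "Segway"]})
# Each condition pairs with the choice at the same position; "Boat" is the default
df["new"] = np.select([df["Type"].isin(["Car", "Van"]),
                       df["Type"].isin(["Scooter", "Segway"])],
                      ["Vehicle", "Bike"], "Boat")
print(df)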
      Type      new
0      Car  Vehicle
1      Van  Vehicle
2     Ship     Boat
3  Scooter     Bike
4   Segway     Bike
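With 30 or more values, another option that may be easier to maintain is a dictionary lookup with Series.map. The sketch below is only an illustration: the groups dictionary, the extra "Tram" row and the "Other" fallback label are assumptions added for the example, not part of the original data.

```python
import pandas as pd

# Hypothetical grouping: each new label lists the raw values it should absorb.
groups = {
    "Vehicle": ["Car", "Van"],
    "Boat": ["Ship", "Submarine"],
    "Bike": ["Scooter", "Segway"],
}

# Invert the grouping into a flat lookup: raw value -> new label.
lookup = {value: label for label, values in groups.items() for value in values}

# "Tram" is an assumed extra row that shows what happens to an unmapped value.
df = pd.DataFrame({"Type": ["Car", "Van", "Ship", "Scooter", "Segway", "Tram"]})

# map() yields NaN for values missing from the lookup; fillna() supplies
# the assumed fallback label "Other".
df["new"] = df["Type"].map(lookup).fillna("Other")
print(df)
```

Adding a new value then only means appending it to the right list in groups, and values that are not listed fall through to the fallback instead of silently landing in one of the real categories.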
Upvotes: 2