user8652079

Reputation: 55

populate a dataframe column based on a list

I have a dataframe

vehicle_make    vehicle_model   vehicle_year
Toyota  Corolla 2016
Hyundai Sonata  2016
Cadillac    DTS 2006
Toyota  Prius   2014
Kia Optima  2015

I want to add a new column, 'vehicle_make_category', that is populated based on two lists I have:

luxury=['Bentley',
'Maserati',
'Hummer',
'Porsche',
'Lexus']

non_luxury=['Saab',
'Mazda',
'Dodge',
'Volkswagen',
'Kia',
'Chevrolet',
'Hyundai',
'Ford',
'Nissan',
'Honda',
'Toyota'
]

How can I accomplish this? I have tried using

df['vehicle_make_category']=np.where(df['vehicle_make']=i for i in luxury, 'luxury')

but it doesn't work.

Upvotes: 2

Views: 3639

Answers (5)

harpan

Reputation: 8631

You can create a lookup_df from the lists for non_luxury and luxury.

lookup_df = pd.DataFrame({
    'vehicle_make': luxury + non_luxury,
    'vehicle_make_category': (["luxury"] * len(luxury)) + (["non_luxury"] * len(non_luxury))
})

Then left join it onto your original df. Since merge returns a new dataframe, assign the result back.

df = df.merge(lookup_df, how='left', on='vehicle_make')

Output:

    vehicle_make    vehicle_model   vehicle_year    vehicle_make_category
0   Toyota               Corolla      2016             non_luxury
1   Hyundai               Sonata      2016             non_luxury
2   Cadillac               DTS        2006               NaN
3   Toyota                Prius       2014             non_luxury
4   Kia                   Optima      2015             non_luxury
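
For reference, here is a self-contained sketch of the same merge that can be run as-is. The sample frame is rebuilt from the question, and the validate='many_to_one' argument and the name result are additions not in the steps above; validate makes merge raise an error if a make ever appears twice in the lookup, which would otherwise duplicate rows silently.

```python
import pandas as pd

luxury = ['Bentley', 'Maserati', 'Hummer', 'Porsche', 'Lexus']
non_luxury = ['Saab', 'Mazda', 'Dodge', 'Volkswagen', 'Kia', 'Chevrolet',
              'Hyundai', 'Ford', 'Nissan', 'Honda', 'Toyota']

# Sample frame rebuilt from the question
df = pd.DataFrame({
    'vehicle_make': ['Toyota', 'Hyundai', 'Cadillac', 'Toyota', 'Kia'],
    'vehicle_model': ['Corolla', 'Sonata', 'DTS', 'Prius', 'Optima'],
    'vehicle_year': [2016, 2016, 2006, 2014, 2015],
})

# One row per make, labelled with its category
lookup_df = pd.DataFrame({
    'vehicle_make': luxury + non_luxury,
    'vehicle_make_category': ['luxury'] * len(luxury)
                             + ['non_luxury'] * len(non_luxury),
})

# Left join keeps every row of df; makes missing from the lookup get NaN.
# validate='many_to_one' (an addition here) raises if lookup_df has duplicate makes.
result = df.merge(lookup_df, how='left', on='vehicle_make',
                  validate='many_to_one')
print(result)
```

Cadillac is in neither list, so its category comes back as NaN, as in the output above.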

Upvotes: 0

TrentWoodbury

Reputation: 911

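You can create the column with a list comprehension. Note that any make not in luxury is labelled 'non_luxury', including makes that appear in neither list (such as Cadillac):

df['vehicle_make_category'] = [
    'luxury' if make in luxury
    else 'non_luxury'
    for make in df['vehicle_make']
]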
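
If makes that appear in neither list should be flagged rather than folded into one of the two categories, one alternative (a different technique from the comprehension above) is a dict lookup with Series.map. This is only a sketch: the name category_by_make and the 'no_category' label are made up for illustration.

```python
import pandas as pd

luxury = ['Bentley', 'Maserati', 'Hummer', 'Porsche', 'Lexus']
non_luxury = ['Saab', 'Mazda', 'Dodge', 'Volkswagen', 'Kia', 'Chevrolet',
              'Hyundai', 'Ford', 'Nissan', 'Honda', 'Toyota']

# Only the make column is needed for this sketch
df = pd.DataFrame({
    'vehicle_make': ['Toyota', 'Hyundai', 'Cadillac', 'Toyota', 'Kia'],
})

# Hypothetical helper dict: make -> category, built from both lists
category_by_make = {make: 'luxury' for make in luxury}
category_by_make.update({make: 'non_luxury' for make in non_luxury})

# map() yields NaN for makes missing from the dict;
# fillna() replaces those with an explicit (made-up) label
df['vehicle_make_category'] = (
    df['vehicle_make'].map(category_by_make).fillna('no_category')
)
print(df)
```

The dict lookup also avoids scanning the lists once per row, which may matter on larger frames.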

Upvotes: 0

DJK

Reputation: 9264

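Use isin, and give np.where a third argument that supplies the value for rows where the condition is false. Note that this labels every make outside luxury as 'non-luxury', including Cadillac, which is in neither list: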

df['vehicle_make_category'] = np.where(df.vehicle_make.isin(luxury),'luxury','non-luxury')

  vehicle_make vehicle_model  vehicle_year vehicle_make_category
0       Toyota       Corolla          2016            non-luxury
1      Hyundai        Sonata          2016            non-luxury
2     Cadillac           DTS          2006            non-luxury
3       Toyota         Prius          2014            non-luxury
4          Kia        Optima          2015            non-luxury

Using np.select, we can pass a list of conditions and a matching list of values; rows that satisfy none of the conditions get the default:

conditions = [df.vehicle_make.isin(luxury),df.vehicle_make.isin(non_luxury)]

df['vehicle_make_category'] = np.select(conditions,['luxury','non-luxury'],default='no-category')

  vehicle_make vehicle_model  vehicle_year vehicle_make_category
0       Toyota       Corolla          2016            non-luxury
1      Hyundai        Sonata          2016            non-luxury
2     Cadillac           DTS          2006           no-category
3       Toyota         Prius          2014            non-luxury
4          Kia        Optima          2015            non-luxury

Upvotes: 1

rafaelc

Reputation: 59274

Simply

df["vehicle_make_category"] = None
df.loc[df["vehicle_make"].isin(luxury), "vehicle_make_category"] = "luxury"
df.loc[df["vehicle_make"].isin(non_luxury), "vehicle_make_category"] = "non_luxury"
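
As a runnable check of the three lines above (the sample frame is rebuilt from the question, and the name unmatched is introduced here for illustration), makes that match neither list keep the initial None, so they can be listed afterwards:

```python
import pandas as pd

luxury = ['Bentley', 'Maserati', 'Hummer', 'Porsche', 'Lexus']
non_luxury = ['Saab', 'Mazda', 'Dodge', 'Volkswagen', 'Kia', 'Chevrolet',
              'Hyundai', 'Ford', 'Nissan', 'Honda', 'Toyota']

# Only the make column is needed for this sketch
df = pd.DataFrame({
    'vehicle_make': ['Toyota', 'Hyundai', 'Cadillac', 'Toyota', 'Kia'],
})

df["vehicle_make_category"] = None
df.loc[df["vehicle_make"].isin(luxury), "vehicle_make_category"] = "luxury"
df.loc[df["vehicle_make"].isin(non_luxury), "vehicle_make_category"] = "non_luxury"

# Rows that matched neither list still hold None, which isna() reports as missing
unmatched = df.loc[df["vehicle_make_category"].isna(), "vehicle_make"].unique()
print(list(unmatched))
```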

Upvotes: 3

Mohammad Athar

Reputation: 1980

You can use df.join.

You'll have to make a new dataframe identifying luxury/non-luxury.

veh = ['toyota', 'hyundai', 'cadillac']
yr = [2016, 2016, 2016]

# recreating your lux/non layout
n_lux = [veh[0], veh[1]]
lux = [veh[2]]

# then making a new column
b = ['non' if v in n_lux else 'lux' for v in veh]

# A stands in for the original dataframe, B holds the new category column
A = pd.DataFrame({'vehicle_make': veh, 'vehicle_year': yr})
B = pd.DataFrame({'vehicle_make_category': b})
A.join(B)  # joins on the shared default index

Upvotes: 0
