Mainland

Reputation: 4564

Create a new column based on the values in an existing column

I have a data frame with a `variable` column. Some of its values share a size and others have a unique size. I want to create a new column based on the `variable` column.

df = 
      variable
0     A1  
1     A2
2     B1
3     B2
4     C
5     A1
6     D 
7     A1  
8     A2
9     B1
#I want to create a new column `size` indicating the size of the variable. 
# A1, A2 = 20
# B1, B2 = 10
# C = 5, D = 2

My approach1

df['size'] = ""
df.loc[df['variable'].isin(['A1','A2']),'size']=20
df.loc[df['variable'].isin(['B1','B2']),'size']=10
df.loc[df['variable'].isin(['C']),'size']=5
df.loc[df['variable'].isin(['D']),'size']=2

My approach2

size_list = [['A1',20],['A2',20],['B1',10],['B2',10],['C',5],['D',2]]
for itm in size_list:
   df.loc[df['variable'].isin([itm[0]]),'size']=itm[1]

The first approach takes four lines and is vectorized. The second takes only two lines but uses a for loop. Which approach should I prefer? Is there a better one?

Upvotes: 1

Views: 130

Answers (1)

jezrael

Reputation: 862511

Use Series.map with a dictionary created from your list:

size_list = [['A1',20],['A2',20],['B1',10],['B2',10],['C',5],['D',2]]

df['size'] = df['variable'].map(dict(size_list))
print(df)
  variable  size
0       A1    20
1       A2    20
2       B1    10
3       B2    10
4        C     5
5       A1    20
6        D     2
7       A1    20
8       A2    20
9       B1    10
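
If the sizes are easier to maintain grouped as in the question (A1 and A2 = 20, and so on), the lookup dictionary can be built from those groups before calling Series.map. The following is a minimal sketch, not part of the original answer: the names `groups`, `size_map`, and `extra`, and the unmapped value 'E', are made up for illustration. It also shows that a value missing from the dictionary comes back as NaN.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    {"variable": ["A1", "A2", "B1", "B2", "C", "A1", "D", "A1", "A2", "B1"]}
)

# Hypothetical grouping: size -> variables that share that size.
groups = {20: ["A1", "A2"], 10: ["B1", "B2"], 5: ["C"], 2: ["D"]}

# Invert the grouping into a flat variable -> size lookup.
size_map = {name: size for size, names in groups.items() for name in names}

# One vectorized lookup, no explicit loop over rows.
df["size"] = df["variable"].map(size_map)

# A value that is not a key in the dictionary ('E' here, made up) maps to NaN.
extra = pd.Series(["A1", "E"]).map(size_map)
```

If unmapped values are possible in real data, calling `.fillna(...)` with a default after `.map(...)` is one way to handle them.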

Upvotes: 1
