Reputation: 7245

Assign values with a condition in a pandas dataframe?

I have a pandas dataframe that looks like the following

df
     time   case1   case2   case3
0     5     house   bank     atm
1     3     bank    house  pharmacy
2     10    bank    bank     atm
3     20    house  pharmacy  house

For each case column, I want to add a column that holds the average and standard deviation of the category in that row.

I know that

p_house = [20, 10]
p_bank =  [5, 1]
p_atm  =  [3, 1]
p_pharmacy = [10, 5]

I would like to have a pandas dataframe like the following

df
   time  case1     case2     case3        p1        p2        p3
0     5  house      bank       atm  [20, 10]    [5, 1]    [3, 1]
1     3   bank     house  pharmacy    [5, 1]  [20, 10]   [10, 5]
2    10   bank      bank       atm    [5, 1]    [5, 1]    [3, 1]
3    20  house  pharmacy     house  [20, 10]   [10, 5]  [20, 10]

Upvotes: 2

Answers (6)

Mark Wang

Reputation: 2757

Use applymap with dict.get to apply the dictionary lookup across the whole dataframe:

dict_map = {'house' :[20, 10],
            'bank':[5, 1], 
            'atm':[3, 1],
            'pharmacy':[10, 5]}

df.join(df.filter(like='case')
          .applymap(dict_map.get)
          .rename(columns=lambda x:x.replace('case', 'p')))
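
DataFrame.applymap is deprecated in recent pandas releases (DataFrame.map replaces it from 2.1 on), so here is a sketch of the same lookup that avoids it by calling Series.map on each case column. The frame is rebuilt from the question; the names mapped and result are only for illustration.

```python
import pandas as pd

# Example frame copied from the question.
df = pd.DataFrame({
    "time": [5, 3, 10, 20],
    "case1": ["house", "bank", "bank", "house"],
    "case2": ["bank", "house", "bank", "pharmacy"],
    "case3": ["atm", "pharmacy", "atm", "house"],
})

dict_map = {"house": [20, 10],
            "bank": [5, 1],
            "atm": [3, 1],
            "pharmacy": [10, 5]}

# Map each case column separately; Series.map accepts a dict directly.
mapped = (df.filter(like="case")
            .apply(lambda col: col.map(dict_map))
            .rename(columns=lambda x: x.replace("case", "p")))

# Attach the new p1..p3 columns to the original frame.
result = df.join(mapped)
print(result)
```

This should behave the same on older and newer pandas versions, since it uses only DataFrame.apply and Series.map.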

Upvotes: 1

Serge Ballesta

Reputation: 148880

You could stack, merge with an auxiliary dataframe and unstack:

prob = pd.DataFrame({'case': ['house', 'bank', 'atm', 'pharmacy'],
             'prob':[[20,10], [5,1], [3,1], [10,5]]})
df.set_index('time', append=True).stack().reset_index().merge(
    prob, left_on=0, right_on='case').drop(columns=0).set_index(
    ['level_0', 'time', 'level_2']).unstack()

gives:

               case                          prob                    
level_2       case1     case2     case3     case1     case2     case3
level_0 time                                                         
0       5     house      bank       atm  [20, 10]    [5, 1]    [3, 1]
1       3      bank     house  pharmacy    [5, 1]  [20, 10]   [10, 5]
2       10     bank      bank       atm    [5, 1]    [5, 1]    [3, 1]
3       20    house  pharmacy     house  [20, 10]   [10, 5]  [20, 10]
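
If the flat p1..p3 layout from the question is wanted, the prob block of this result can be renamed and joined back onto the original frame. A minimal sketch, assuming the example frame from the question; the names wide, flat and result are only for illustration.

```python
import pandas as pd

# Example frame copied from the question.
df = pd.DataFrame({
    "time": [5, 3, 10, 20],
    "case1": ["house", "bank", "bank", "house"],
    "case2": ["bank", "house", "bank", "pharmacy"],
    "case3": ["atm", "pharmacy", "atm", "house"],
})

prob = pd.DataFrame({"case": ["house", "bank", "atm", "pharmacy"],
                     "prob": [[20, 10], [5, 1], [3, 1], [10, 5]]})

# Same stack / merge / unstack chain as in the answer.
wide = (df.set_index("time", append=True).stack().reset_index()
          .merge(prob, left_on=0, right_on="case").drop(columns=0)
          .set_index(["level_0", "time", "level_2"]).unstack())

# Keep only the 'prob' block, rename case1 -> p1 etc., and drop the
# 'time' index level so the rows line up with df's original index.
flat = (wide["prob"]
        .rename(columns=lambda c: c.replace("case", "p"))
        .reset_index(level="time", drop=True))

result = df.join(flat)
print(result)
```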

Upvotes: 1

anky

Reputation: 75080

You can create a dictionary and map the values:

p_house = [20, 10]
p_bank =  [5, 1]
p_atm  =  [3, 1]
p_pharmacy = [10, 5]

d = {'house':p_house,'bank':p_bank,'atm':p_atm,'pharmacy':p_pharmacy}
df.assign(**df.stack().map(d).dropna().unstack().add_prefix('p_'))

   time  case1     case2     case3   p_case1   p_case2   p_case3
0     5  house      bank       atm  [20, 10]    [5, 1]    [3, 1]
1     3   bank     house  pharmacy    [5, 1]  [20, 10]   [10, 5]
2    10   bank      bank       atm    [5, 1]    [5, 1]    [3, 1]
3    20  house  pharmacy     house  [20, 10]   [10, 5]  [20, 10]

Upvotes: 2

Eric Truett

Reputation: 3010

You can use map on a series to add the values. I think this is simpler than the example above.

vals = {'house': [20, 10], 'bank':  [5, 1], 'atm': [3, 1], 'pharmacy': [10, 5]}

for idx, col in enumerate(df.columns[1:]):
    df[f'p{idx+1}'] = df[col].map(vals)
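
One thing to watch with this approach: Series.map returns a missing value for any entry that is not a key in the dictionary. A small sketch of that behaviour, where 'school' is a made-up category used only to show the gap:

```python
import pandas as pd

vals = {"house": [20, 10], "bank": [5, 1], "atm": [3, 1], "pharmacy": [10, 5]}

# 'school' is a hypothetical category with no entry in vals.
s = pd.Series(["house", "school", "atm"])
mapped = s.map(vals)
print(mapped)

# Collect the categories that had no entry in the dictionary.
missing = s[mapped.isna()].tolist()
print(missing)
```

Checking for missing values after the loop is a cheap way to catch misspelled or unexpected categories.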

Upvotes: 2

Umar.H

Reputation: 23099

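IIUC, you can build a dictionary, then stack, map and unstack, and join the result back on the time index. This relies on the time values being unique, since unstack cannot reshape duplicate index entries.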

map_dict = {'house': [20, 10],
            'bank': [5, 1],
            'atm': [3, 1],
            'pharmacy': [10, 5]}

mapped = (df.set_index('time').stack().map(map_dict).unstack()
            .rename(columns=lambda x: x.replace('case', 'p')))
df1 = df.set_index('time').join(mapped, how='outer')


      case1     case2     case3        p1        p2        p3
time                                                         
5     house      bank       atm  [20, 10]    [5, 1]    [3, 1]
3      bank     house  pharmacy    [5, 1]  [20, 10]   [10, 5]
10     bank      bank       atm    [5, 1]    [5, 1]    [3, 1]
20    house  pharmacy     house  [20, 10]   [10, 5]  [20, 10]

Upvotes: 2

yatu

Reputation: 88226

You're probably better off defining a dictionary and using the actual values as lookup keys:

from operator import itemgetter

d = {'house':[20, 10], 'bank':[5, 1], 'atm':[3,1], 'pharmacy':[10,5]}

# Look up each row's cases, then transpose so each tuple holds one output column
cols = list(zip(*(itemgetter(*row)(d) for row in df.loc[:, 'case1':].values)))

for ix, col in enumerate(['p1', 'p2', 'p3']):
    df[col] = list(cols[ix])

print(df)

   time  case1     case2     case3        p1        p2        p3
0     5  house      bank       atm  [20, 10]    [5, 1]    [3, 1]
1     3   bank     house  pharmacy    [5, 1]  [20, 10]   [10, 5]
2    10   bank      bank       atm    [5, 1]    [5, 1]    [3, 1]
3    20  house  pharmacy     house  [20, 10]   [10, 5]  [20, 10]

Upvotes: 2
