shrey tyagi
shrey tyagi

Reputation: 154

Applying a key value dictionary pair to multiple column in a dataframe

I need to apply changes into current dataframe in 2 columns with a dictionary key to one column and value to another column for each value coming from dict. I am not able to find any way to do that for eg.

dict is - test = {'a':32, 'b':21, 'c':92}

current dataframe looks forms like

date         env   result         cost

2021-03-01   dev      gcp.dev.a         30
2021-03-01   prd      gcp.prd.d         35
2021-03-01   dev      gcp.dev.j         98
2021-03-01   sandbox  gcp.sandbox.b     94

after adding changes from the dictionary on result and cost column dataframe should look like below -

date         env        result          cost

2021-03-01   dev      gcp.dev.a           30
2021-03-01   prd      gcp.prd.d           35
2021-03-01   dev      gcp.dev.j           98
2021-03-01   sandbox  gcp.sandbox.b       94
2021-03-01   dev      gcp.dev.a           32 
2021-03-01   prd      gcp.prd.b           21  
2021-03-01   prd     gcp.prd.c          92

here gcp is a fixed value, that is being added and dev and prd is being coming from env column.

above you notice last 3 rows are being added by taking test dictionary key values each being added in result and cost. i.e relevant row should be added in both the columns for each key value pair

Upvotes: 3

Views: 1121

Answers (3)

Anurag Dabas
Anurag Dabas

Reputation: 24324

You can also do this by using pd.DateFrame.from_dict() method,append() method and ffill() method:

test = {'a':32, 'b':21, 'c':92}

newdf=pd.DataFrame(test.values(),index=test.keys(),columns=['cost']).reset_index()
    #OR(use any one of them to create dataframe named newdf)
newdf=pd.DataFrame.from_dict(test,orient='index',columns=['cost']).reset_index().rename(columns={'index':'result'})

Finally:

newdf=df.append(newdf,ignore_index=True).ffill()

Now If you print newdf you will get your desired output:

    date         result     cost
0   2021-03-01      a       30
1   2021-03-01      d       35
2   2021-03-01      j       98
3   2021-03-01      b       94
4   2021-03-01      a       32
5   2021-03-01      b       21
6   2021-03-01      c       92

Upvotes: 3

sammywemmy
sammywemmy

Reputation: 28709

Set result and cost as index:

temp = df.set_index(['result', 'cost']) 

Build MultiIndex from dictionary:

test_index = pd.MultiIndex.from_tuples(test.items(), names = ['result', 'cost'])

Reindex temp with the union of temp's index and test_index:

(temp
 .reindex(temp.index.union(test_index, sort = False))
 .ffill() 
 .reset_index()
 .reindex(columns = df.columns)
 )
 
         date result  cost
0  2021-03-01      a    30
1  2021-03-01      d    35
2  2021-03-01      j    98
3  2021-03-01      b    94
4  2021-03-01      a    32
5  2021-03-01      b    21
6  2021-03-01      c    92

Upvotes: 0

jezrael
jezrael

Reputation: 863226

Create new DataFrame by cosntructor and add it to original by concat, last forward filling date values:

test  = {'a':32, 'b':21, 'c':92}

df1 = pd.DataFrame(list(test.items()), columns=['result','cost'])

df = pd.concat([df, df1], ignore_index=True)
df['date'] = df['date'].ffill()
print (df)
         date result  cost
0  2021-03-01      a    30
1  2021-03-01      d    35
2  2021-03-01      j    98
3  2021-03-01      b    94
4  2021-03-01      a    32
5  2021-03-01      b    21
6  2021-03-01      c    92

Loops solution is possible, but slow, so not recommended:

for k, v in test.items():
    df.loc[len(df), ['result','cost']] = (k, v)

df['date'] = df['date'].ffill()

Upvotes: 1

Related Questions