Pandas updating certain values in large database but not matching dataframe size

Question

I have a large database that is containing a certain number per customer per type of item. Each day there will be a lot of updates to a certain type of item of a certain customer. The database will look as following:

import pandas as pd

df = pd.DataFrame({'customer' :  ['customer1', 'customer2'], 'item1': [12, 13], 'item2' : [3, 28],'item3': [2, 1]})
df2 = pd.DataFrame({'customer' :  ['customer1', 'customer2'], 'item?': ['item1', 'item1'], 'quantity' : [2, 5]})

    customer  item1  item2  item3
0  customer1     12      3      2
1  customer2     13     28      1

    customer  item?  quantity
0  customer1  item1         2
1  customer2  item1         5

The dataframe needs to be updated by df2, where the customer is a string and the item is also a string.

I am expecting the following dataframe:

    customer  item1  item2  item3
0  customer1     14      3      2
1  customer2     18     28      1

So essentially it is df1+df1, but this could sometimes be minus as well.

I have tried the following:

customerlist = df1['customer'].tolist()
for i in customerlist:
    df1.loc[df1.customer == customerlist[i]]

But I am already running into problems. Does someone have a function or whatever that works?

Pandas updating certain values in large database but not matching dataframe size

Answers (1)

Related Questions