Reputation: 15
I have a large database that is containing a certain number per customer per type of item. Each day there will be a lot of updates to a certain type of item of a certain customer. The database will look as following:
import pandas as pd
df = pd.DataFrame({'customer' : ['customer1', 'customer2'], 'item1': [12, 13], 'item2' : [3, 28],'item3': [2, 1]})
df2 = pd.DataFrame({'customer' : ['customer1', 'customer2'], 'item?': ['item1', 'item1'], 'quantity' : [2, 5]})
customer item1 item2 item3
0 customer1 12 3 2
1 customer2 13 28 1
customer item? quantity
0 customer1 item1 2
1 customer2 item1 5
The dataframe needs to be updated by df2, where the customer is a string and the item is also a string.
I am expecting the following dataframe:
customer item1 item2 item3
0 customer1 14 3 2
1 customer2 18 28 1
So essentially it is df1+df1, but this could sometimes be minus as well.
I have tried the following:
customerlist = df1['customer'].tolist()
for i in customerlist:
df1.loc[df1.customer == customerlist[i]]
But I am already running into problems. Does someone have a function or whatever that works?
Upvotes: 0
Views: 48
Reputation: 13831
We can use 'customer' as the index
in both dataframes to make sure we align them correctly. Then all we do is add a reshaped version of df2
onto df
based on alignment on index (both rows and columns), and when the item or the customer is a mismatch we use the values from df
:
df.set_index('customer').add(
pd.pivot_table(
df2,index='customer',columns='item?',values='quantity')
).fillna(df.set_index('customer')).astype(int)
prints:
item1 item2 item3
customer
customer1 14 3 2
customer2 18 28 1
I am not sure what you mean by "sometimes be minus as well". If you mean that you want to subtract the two dataframes then use sub
instead of add
. If you mean there might be a minus in the values, that shouldn't affect the code because +
and -
will be a -
Upvotes: 2