Conditional Statements for 2 Pandas Dataframes

I've been having troubles solving the following problem:

I have 2 dataframes in pandas, the first dataframe (df1) is about the fruits I have bought over an entire year, the next dataframe (df2) is about the pricing of the fruits. Now the problem consists that some of the prices for the fruits in the df1 are incorrect. Here's the example:

df1 = pd.DataFrame({'Fruits Bought Over a Year' : ['Apple', 'Apple', 'Apple', 'Apple', 'Orange', 'Watermelon',
                                      'Orange', 'Orange', 'Watermelon', 'Apple', 'Grapes', 
                                      'Watermelon', 'Grapes'],
       'Price': [5, 5, 4, 4.5, 3, 7, 2.5, 2, 6, 4, 1, 6, 2]})


df2 = pd.DataFrame({'Fruits' : ['Apple', 'Orange', 'Watermelon', 'Grapes'],
       'Fruits Price': [5, 3, 7, 2]})

What I want to do is to create a formula that can evaluate that if (for example) the apple price in df1 is different than the price in df2, then tell me the difference between the real price in df2 and the incorrect price in df1 and put it in as a new column in df1.

Example: df1 apple costed 6 and the real price in df2 the apple costs only 5, now the difference would be 1, and it will appear as a new column called 'price difference' in df1.

Upvotes: 0

Views: 42

Answers (1)

Sachin Kharude
Sachin Kharude

Reputation: 314

You can try the below solution to get the results:

  1. set the index of the df2 as fruits column
df2 = df2.set_index('Fruits')
  1. convert df2 to the dictionary
di = df2.to_dict('index')
  1. use the apply method to take the difference using the above dictionary.
df1['price difference'] = df1.apply(lambda row: abs(df1['Price'][row.name] - di[df1['Fruits Bought Over a Year'][row.name]]['Fruits Price']), axis=1)

In the 3rd step, we are taking a difference using the dictionary we created in step 2.

Upvotes: 0

Related Questions