Taking a cross product of Pandas Dataframes

Question

I am attempting to take string data from one dataframe and substitute it with numerical values and create a cross product of the results like the following below.

I read the data into dataframes, example input coming below:

import pandas as pd
shoppers = pd.DataFrame({'name': ['bill', 'bob', 'james', 'jill', 'henry'], 
                         'item': ['apple','apple','orange','grapes','orange']})

stores = pd.DataFrame({'apple' : [0.25, 0.20, 0.18], 
                       'orange': [0.30, 0.4, 0.35],
                       'grapes': [1.0, 0.9, 1.1], 
                       'store':  ['kroger', 'publix', 'walmart']})

Here's the resulting shoppers dataframe:

         item
name         
bill    apple
bob     apple
james  orange
jill   grapes
henry  orange

And here's the resulting stores dataframe:

         apple  orange  grapes
store                         
kroger    0.25    0.30     1.0
publix    0.20    0.40     0.9
walmart   0.18    0.35     1.1

And the desired result is the price each person would pay at each store. For example:

I'm really struggling to find the right way to make such a transformation in Pandas efficiently. I could easily loop over shoppers and stores and build each row in a brute-force manner, but there must be a more efficient way to do this with the pandas API. Thanks for any suggestions.

Quang Hoang · Accepted Answer

Here's a solution, not cross, but dot:

pd.crosstab(shoppers.index, shoppers['item']).dot(stores.T)

Output:

       kroger  publix  walmart
row_0                         
bill     0.25     0.2     0.18
bob      0.25     0.2     0.18
henry    0.30     0.4     0.35
james    0.30     0.4     0.35
jill     1.00     0.9     1.10

Taking a cross product of Pandas Dataframes

Answers (1)

Related Questions