Praveen Bushipaka
Praveen Bushipaka

Reputation: 493

comparing a dataframe column with another data frame

I have two datasets

df1 = pd.DataFrame ({"skuid" :("45","22","32","33"), "country": ("A","B","C","A")})
df2 = pd.DataFrame ({"skuid" :("45","32","40","21"),"salesprice" :(10,0,0,30),"regularprice" : (9,10,0,2)}) 

I want to find how many rows does df2 have in common with df1 when country is A (only sum).

I want the output as 1, because skuid 45 is in both datasets and country is A.

I did by subsetting by country and using isin() like

df3 = df1.loc[df1['country']=='A']
df3['skuid'].isin(df2['skuid']).value_counts()

but I want to know whether I can do in single line.

Here what I tried to do in one line code

df1.loc['skuid'].isin(df2.skuid[df1.country.eq('A')].unique().sum()):,])

I know my mistake I'm comparing with df1 with df2 of a country that doesn't exist. So, is there any way where I can do it in one or two lines, without subsetting each country

Thanks in advance

Upvotes: 1

Views: 59

Answers (2)

Mehdi Golzadeh
Mehdi Golzadeh

Reputation: 2583

If I correctly understood you need this:

df3=df1[lambda x: (df1.skuid.isin(df2['skuid'])) & (x['country'] =='A') ].count()

Upvotes: 0

Quang Hoang
Quang Hoang

Reputation: 150745

Let's try:

df1.loc[df1['country']=='A', 'skuid'].isin(df2['skuid']).sum()
# out: 1

Or

(df1['skuid'].isin(df2['skuid']) & df1['country'].eq('A')).sum()

You can also do for all countries with groupby():

df1['skuid'].isin(df2['skuid']).groupby(df1['country']).sum()

Output:

country
A    1
B    0
C    1
Name: skuid, dtype: int64

Upvotes: 1

Related Questions