pyds_learner

Reputation: 519

Compare two column values and create two more columns based on the comparison

I have a pandas DataFrame. For each row, I need to check whether the value in one column appears among the values of another column, and create two more columns based on the result.

My DataFrame looks like this:

Dept-x  Country Age Category    Dept-y
Math    India   Young   Good    Math,Social,English,Science,French
Math    India   Adult   Good    Math,Social,English,Science,French
Social  Aus     Young   Average Science,Math,Social, English, French
Science Pak     Young   Good    Math,Social,English,Practical,French
Science Pak     Adult   Average Math,Social,Science,French,English
Science Pak     Adult   Good    Science,Math,Social, English, French

Expected DataFrame:

Top 3   Top all
1             1
1             1
0             1
0             0
1             1
1             1

So, in addition to the existing columns, the expected DataFrame needs two new columns: Top 3 and Top all.

If the value of Dept-x is among the first 3 values of Dept-y, then both Top 3 and Top all should be 1.

If the value of Dept-x is not among the first 3 values but does exist in Dept-y, then Top 3 should be 0 and Top all should be 1.

If the value of Dept-x does not appear in Dept-y at all, then both Top 3 and Top all should be 0.

I would appreciate it if someone could help me achieve this.

Upvotes: 1

Views: 69

Answers (2)

Daniel Scott

Reputation: 985

I would try something like this:

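df['Top 3'] = 0
df['Top all'] = 0
# Split Dept-y into a list per row and strip stray spaces around each name
depts = [[s.strip() for s in y.split(',')] for y in df['Dept-y']]
# Row-wise membership tests (`in` cannot be applied to a whole Series at once)
df.loc[[x in y for x, y in zip(df['Dept-x'], depts)], 'Top all'] = 1
df.loc[[x in y[:3] for x, y in zip(df['Dept-x'], depts)], ['Top 3', 'Top all']] = 1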

Upvotes: -1

Ricky Kim

Reputation: 2022

You could use list comprehension like this:

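# Split Dept-y on commas and strip spaces so entries like ", English" still match
df['Top 3'] = [1 if x in [s.strip() for s in y.split(',')][:3] else 0 for x, y in zip(df['Dept-x'], df['Dept-y'])]
# Test against the split list, not the raw string, to avoid substring matches
df['Top all'] = [1 if x in [s.strip() for s in y.split(',')] else 0 for x, y in zip(df['Dept-x'], df['Dept-y'])]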
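
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same row-wise check. The helper name `membership_flags` and its `top_n` parameter are my own and purely illustrative. The first two rows are copied from the question; the third row is a hypothetical one I added to show the "in Dept-y but not in the first 3" case.

```python
import pandas as pd

# Rows 1 and 2 come from the question; row 3 is hypothetical (illustration only)
df = pd.DataFrame({
    'Dept-x': ['Math', 'Science', 'French'],
    'Dept-y': [
        'Math,Social,English,Science,French',
        'Math,Social,English,Practical,French',
        'Math,Social,English,Science,French',
    ],
})

def membership_flags(dept, dept_list, top_n=3):
    """Return (in_top_n, in_all) as 0/1 flags for a single row."""
    # Strip whitespace so entries written as ", English" still match
    names = [name.strip() for name in dept_list.split(',')]
    return int(dept in names[:top_n]), int(dept in names)

flags = [membership_flags(x, y) for x, y in zip(df['Dept-x'], df['Dept-y'])]
df['Top 3'] = [f[0] for f in flags]
df['Top all'] = [f[1] for f in flags]

print(df[['Dept-x', 'Top 3', 'Top all']])
```

Doing the split once per row and reusing the list for both flags avoids parsing Dept-y twice, and comparing against list elements rather than the raw string means a name is only matched when it is a whole entry.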

Upvotes: 3
