tiru

Reputation: 363

How to divide values and place them in the next columns according to conditions in pandas


print (df)
      Q Name Region username
15        RF  India  Karthik
12  INTERNET    NaN     Paul
9   INTERNET  India      Raj
10  INTERNET  India      Ram
11  INTERNET  China      Xin
13     TOOLS  china     Zang
14     TOOLS  china     chin

Above is the dataframe. The desired output, for each Q Name, is: if all members are India, put Yes in ALLINDIA; if the members contain at least one India, put Yes in AtleastoneINDIA; if the members contain no India at all, put Yes in non_INDia.


Upvotes: 0

Views: 20

Answers (1)

jezrael

Reputation: 863176

Compare with eq (==) to get a boolean mask, aggregate it per Q Name with agg using the functions all and any, then create a new column by inverting any across each row, and finally rename the columns with add_suffix:

df1 = (df['Region'].eq('India')
                   .groupby(df['Q Name'])
                   .agg(['all','any'])
                   .assign(non= lambda x: ~x.any(axis=1))
                   .add_suffix('_india'))
print (df1)
          all_india  any_india  non_india
Q Name                                   
INTERNET      False       True      False
RF             True       True      False
TOOLS         False      False       True
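
For anyone who wants to run this, here is a minimal self-contained sketch. The DataFrame constructor is an assumption rebuilt from the printed frame in the question (including the missing Region for Paul and the lowercase china rows); the name is_india is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the printed frame in the question (an assumption).
df = pd.DataFrame(
    {
        "Q Name": ["RF", "INTERNET", "INTERNET", "INTERNET",
                   "INTERNET", "TOOLS", "TOOLS"],
        "Region": ["India", np.nan, "India", "India",
                   "China", "china", "china"],
        "username": ["Karthik", "Paul", "Raj", "Ram", "Xin", "Zang", "chin"],
    },
    index=[15, 12, 9, 10, 11, 13, 14],
)

# Boolean mask: True where Region is exactly 'India'; NaN compares as False.
is_india = df["Region"].eq("India")

df1 = (
    is_india.groupby(df["Q Name"])          # group the mask by Q Name
    .agg(["all", "any"])                    # all India / at least one India
    .assign(non=lambda x: ~x.any(axis=1))   # no India in the group
    .add_suffix("_india")
)
print(df1)
```

Because the comparison is an exact string match, a value such as india in lowercase would not count as India; normalize the case first if that matters for your data.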

A small modification is also possible, so that any is set to False for groups where all is already True:

df1 = (df['Region'].eq('India')
                   .groupby(df['Q Name'])
                   .agg(['all','any'])
                   .assign(non= lambda x: ~x.any(axis=1),
                           any = lambda x: x['any'] & ~x['all'])
                   .add_suffix('_india'))
print (df1)
          all_india  any_india  non_india
Q Name                                   
INTERNET      False       True      False
RF             True      False      False
TOOLS         False      False       True

Finally, use numpy.where to turn the True values into yes and the False values into empty strings:

df2 = pd.DataFrame(np.where(df1, 'yes', ''), 
                   index=df1.index, 
                   columns=df1.columns)
print (df2)
         all_india any_india non_india
Q Name                                
INTERNET                 yes          
RF             yes                    
TOOLS                              yes
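
If the flags should sit next to the original rows rather than in a per-group summary, one option is to join the result back on Q Name. This is only a sketch: the expected output image is not available, so the row-level layout is an assumption, and the names flags, labels, and out are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the printed frame in the question (an assumption).
df = pd.DataFrame(
    {
        "Q Name": ["RF", "INTERNET", "INTERNET", "INTERNET",
                   "INTERNET", "TOOLS", "TOOLS"],
        "Region": ["India", np.nan, "India", "India",
                   "China", "china", "china"],
        "username": ["Karthik", "Paul", "Raj", "Ram", "Xin", "Zang", "chin"],
    },
    index=[15, 12, 9, 10, 11, 13, 14],
)

# Per-group boolean flags, with 'any' switched off where 'all' is True.
flags = (
    df["Region"].eq("India")
    .groupby(df["Q Name"])
    .agg(["all", "any"])
    .assign(non=lambda x: ~x.any(axis=1),
            any=lambda x: x["any"] & ~x["all"])
    .add_suffix("_india")
)

# Convert True/False to 'yes'/'' and keep the group labels as the index.
labels = pd.DataFrame(np.where(flags, "yes", ""),
                      index=flags.index,
                      columns=flags.columns)

# Attach the group-level labels to every original row via the Q Name column.
out = df.join(labels, on="Q Name")
print(out)
```

Every row of a group receives the same three values, since the conditions are evaluated per Q Name rather than per row.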

Upvotes: 2
