Reputation: 101
I have the following df
SUITE DET
0 HISTO HisRep2
1 HISTO HisRep3
2 NaN store
3 NaN OnHold
4 PMFA Nemato
5 bactmate Nemato
6 bacto1 abbac4
7 abbadabba BhyCPC
8 HAEMFF pmbac3
9 INCLIN Trico
10 BOCOSP isol1
I want to create a third column. The third column will hold a mix of data from the first two columns, using 3 criteria.
I have created code for the 3 criteria individually, but I want to do them simultaneously. My 3 bits are
df['SUITEDET'] = df[df['SUITE'].isnull()]['DET']
which makes df
SUITE DET SUITEDET
0 HISTO HisRep2 NaN
1 HISTO HisRep3 NaN
2 NaN store store
3 NaN OnHold OnHold
4 PMFA Nemato NaN
5 bactmate Nemato NaN
6 bacto1 abbac4 NaN
7 abbadabba BhyCPC NaN
8 HAEMFF pmbac3 NaN
9 INCLIN Trico NaN
10 BOCOSP isol1 NaN
usedet = ['pmbac1','pmbac2','pmbac3','pmbac4','pmbac5','pmbac6','bact1','bact2','bact3','bact4','bact5','bact6','bactAdd1','bactAdd2','bactAdd3','abbac1','abbac2','abbac3','abbac4','abbac5','abbac6','BrachSmear','Brachy','Brachy1','Brachy10','Brachy2','Brachy3','Brachy4','Brachy5','Brachy6','Brachy7','Brachy8','Brachy9','BhyC','isol','salm1','salm2','salm3','salm4','pmavb1','pmavb2','pmavb3','pmavb4','pmavb5','pmavb6','pmavsh','pmavsp','cult','zncult','pmfoBK','pmfood','Coccn','Coccid','CoccidGoat','Strngy','StoidE','Stoide','Nemato','NematE','TrichE','Tricd','Tricm','Trico','Trics','Tricu','Monspp','Fecn','BhyC21','BhyCID','BhyCPC','BhyCPCTrig','BhyClt','isol1','isol1C','isol1F','isol1M','isol1S','isol2','isol2C','isol2F','isol2M','isol2S','isol3','isol3C','isol3F','isol3M','isol3S','isolA','isolB','isolC']
df['SUITEDET'] = df[df['DET'].isin(usedet)]['SUITE']
which creates df
SUITE DET SUITEDET
0 HISTO HisRep2 NaN
1 HISTO HisRep3 NaN
2 NaN store NaN
3 NaN Nemato NaN
4 PMFA BodyWt PMFA
5 bactmate Nemato bactmate
6 bacto1 abbac4 bacto1
7 abbadabba BhyCPC abbadabba
8 HAEMFF pmbac3 HAEMFF
9 INCLIN Trico INCLIN
10 BOCOSP isol1 BOCOSP
usesuite = ["HAEMFF","HAEM4F","INCLIN","BOCOSP","OVCOSP","WECOCF","WECOCA","WECOCP","ECOPCR","BLKWEC","BLKWF","BLKFLK","FLKIND","WECFLK","BHYCULT","BHY21","INCH","LEPTMS","GSH-PX","HISTO","CHLEIAS","CHLEIAG","BVD","BVDANT","BVDAB","CLA","IBRMS","SINFAB","JMILKI","JMILKB","BVDMS","BVDIND","BVDPCR","SBVIMB","SBVIMI","RUMENE","FATLIV","DOWNER","BMMETP","FAMIN","OMETAP","BOVPRO","OPRODP","COPBLK","COPRO","RESPCR"]
df['SUITEDET'] = df[df['SUITE'].isin(usesuite)]['SUITE']
which makes df
SUITE DET SUITEDET
0 HISTO HisRep2 HISTO
1 HISTO HisRep3 HISTO
2 NaN store NaN
3 NaN OnHold NaN
4 PMFA Nemato NaN
5 bactmate Nemato NaN
6 bacto1 abbac4 NaN
7 abbadabba BhyCPC NaN
8 HAEMFF pmbac3 HAEMFF
9 INCLIN Trico INCLIN
10 BOCOSP isol1 BOCOSP
what I want to get out at the end is a df like this, where the third column is completed using the by using the three criteria to select the correct data from the first two columns
SUITE DET SUITEDET
0 HISTO HisRep2 HISTO
1 HISTO HisRep3 HISTO
2 NaN store store
3 NaN OnHold OnHold
4 PMFA Nemato PMFA
5 bactmate Nemato bactmate
6 bacto1 abbac4 bacto1
7 abbadabba BhyCPC abbadabba
8 HAEMFF pmbac3 HAEMFF
9 INCLIN Trico INCLIN
10 BOCOSP isol1 BOCOSP
Should I run the three criteria and create 3dfs and then try and merge them or is there a pythonic way to run the 3 bits together. I don't even need a specific answer if it's a hassle, just an idea of something to go look up. Thanks.
Upvotes: 0
Views: 61
Reputation: 4229
Try:
df['SUITEDET'] = df[df['SUITE'].isnull()]['DET']
condition = df['DET'].isin(usedet) | df['SUITE'].isin(usesuite)
df.loc[condition, 'SUITEDET'] = df['SUITE']
OUTPUT:
SUITE DET SUITEDET
0 HISTO HisRep2 HISTO
1 HISTO HisRep3 HISTO
2 NaN store store
3 NaN OnHold OnHold
4 PMFA Nemato PMFA
5 bactmate Nemato bactmate
6 bacto1 abbac4 bacto1
7 abbadabba BhyCPC abbadabba
8 HAEMFF pmbac3 HAEMFF
9 INCLIN Trico INCLIN
10 BOCOSP isol1 BOCOSP
Solution 2:
condition = df['DET'].isin(usedet) | df['SUITE'].isin(usesuite)
df['SUITEDET'] = np.where(condition, df['SUITE'], np.where(df['SUITE'].isnull(), df['DET'], np.nan))
Upvotes: 2