datascientist
datascientist

Reputation: 65

Filling a dataframe column with values from another column, based on values from a third column

I have a pandas dataframe like the following. I created the last 3 columns based on unique values from the column RefIDPrefix.

RefIDPrefix RefIDNumber GO PMID Reactome
GO 12345
PMID 23456
Reactome 34567
GO 45678
GO 56789
PMID 67890

I want to fill the last 3 columns like the following. Basically, based on the value in RefIDPrefix, I want to take the value in RefIDNumber and put it in the correct column corresponding to RefIDPrefix.

RefIDPrefix RefIDNumber GO PMID Reactome
GO 12345 12345
PMID 23456 23456
Reactome 34567 34567
GO 45678 45678
GO 56789 56789
PMID 67890 67890

I have been trying to do this for a while but haven't been able to figure out how to go about it. Any help would be appreciated!

Upvotes: 2

Views: 187

Answers (1)

SeaBean
SeaBean

Reputation: 23227

You can use df.pivot() to build the columns from RefIDPrefix and .join() it back to the original df

df.join(df.pivot(columns='RefIDPrefix', values='RefIDNumber').fillna(''))

Output:

  RefIDPrefix  RefIDNumber       GO     PMID Reactome
0          GO        12345  12345.0                  
1        PMID        23456           23456.0         
2    Reactome        34567                    34567.0
3          GO        45678  45678.0                  
4          GO        56789  56789.0                  
5        PMID        67890           67890.0         

Edit

For the display format of the numbers in new columns (currently displayed like float number with decimal point), if your RefIDNumber column is actually in string, the numbers in new column will also be in string and have no decimal point (like integers).

However, if RefIDNumber is in numeric format (most probably in positive numbers for ID numbers), we can retain the numbers as integer by fine-tuning the .fillna() part, as follows:

df.join(df.pivot(columns='RefIDPrefix', values='RefIDNumber').fillna(-1, downcast='infer').replace(-1, ''))

Output:

  RefIDPrefix  RefIDNumber     GO   PMID Reactome
0          GO        12345  12345                
1        PMID        23456         23456         
2    Reactome        34567                  34567
3          GO        45678  45678                
4          GO        56789  56789                
5        PMID        67890         67890         

Upvotes: 3

Related Questions