Sandra Young

Reputation: 95

How to add column information to a Pandas dataframe referring to other rows

I want to add a column to a dataframe whose values come from other rows of the same dataframe. The dataframe holds pairs of scientific names going up the ranking hierarchy (species-genus pairs, genus-family pairs and so on). I need the taxonomic rank (taxrank) of both the target and the source of each pair in the same row. I already have the target's taxrank on each row; for the source, I need to find the row where that source name appears as a target, take its taxrank, and add it as a new TAXRANK_source column. Below is an example of what I have so far:

      TAXRANK_target         target               source
45139          order  Salmoniformes  Protacanthopterygii
45140         family     Salmonidae        Salmoniformes
45201          genus          Salmo           Salmonidae
45202        species         labrax                Salmo
45203        species         carpio                Salmo
45204        species         trutta                Salmo
45205        species        letnica                Salmo
45206        species     marmoratus                Salmo
45207        species        fibreni                Salmo

And what I want it to look like:

      TAXRANK_target         target               source TAXRANK_source
45139          order  Salmoniformes  Protacanthopterygii            NaN
45140         family     Salmonidae        Salmoniformes          order
45201          genus          Salmo           Salmonidae         family
45202        species         labrax                Salmo          genus
45203        species         carpio                Salmo          genus
45204        species         trutta                Salmo          genus
45205        species        letnica                Salmo          genus
45206        species     marmoratus                Salmo          genus
45207        species        fibreni                Salmo          genus
45208        species  obtusirostris                Salmo          genus

What I cannot figure out is how to look up a value in one row and use it to fill a column in a different row.

Upvotes: 1

Views: 59

Answers (1)

jezrael

Reputation: 862691

Use Series.map with a lookup Series created by DataFrame.set_index (target names as the index, their taxranks as the values):

# if values in the target column are not duplicated
s = df.set_index('target')['TAXRANK_target']
# if duplicates are possible, keep only the first value per target
# s = df.drop_duplicates('target').set_index('target')['TAXRANK_target']
df['TAXRANK_source'] = df['source'].map(s)
print(df)
      TAXRANK_target         target               source TAXRANK_source
45139          order  Salmoniformes  Protacanthopterygii            NaN
45140         family     Salmonidae        Salmoniformes          order
45201          genus          Salmo           Salmonidae         family
45202        species         labrax                Salmo          genus
45203        species         carpio                Salmo          genus
45204        species         trutta                Salmo          genus
45205        species        letnica                Salmo          genus
45206        species     marmoratus                Salmo          genus
45207        species        fibreni                Salmo          genus
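
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same approach. The small dataframe is rebuilt by hand from a few rows of the question, and the variable name rank_by_name is only illustrative. It uses the drop_duplicates variant, on the assumption that a target name could appear more than once; Series.map generally needs the lookup Series to have a unique index.

```python
import pandas as pd

# A few rows copied from the question, with the same index labels.
df = pd.DataFrame(
    {
        "TAXRANK_target": ["order", "family", "genus", "species", "species"],
        "target": ["Salmoniformes", "Salmonidae", "Salmo", "labrax", "carpio"],
        "source": ["Protacanthopterygii", "Salmoniformes", "Salmonidae", "Salmo", "Salmo"],
    },
    index=[45139, 45140, 45201, 45202, 45203],
)

# Lookup Series: index = target name, value = that target's rank.
# drop_duplicates keeps the first row per target so the index is unique.
rank_by_name = df.drop_duplicates("target").set_index("target")["TAXRANK_target"]

# For each row, look up the rank of its source name.
# Sources that never appear as a target (here the top row) become NaN.
df["TAXRANK_source"] = df["source"].map(rank_by_name)
print(df)
```

Rows whose source is not present anywhere in the target column end up as NaN, which matches the first row of the desired output in the question.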

Upvotes: 1
