Reputation: 53
I have a dataframe such as:
   col1  col2  col3  ID
A  23    AZ    ER1   ID1
B  12    ZE    EZ1   ID2
C  13    RE    RE1   ID3
I parsed the ID column to retrieve some information for each ID. Here is the result of the code:
    for i in dataframe['ID']:
        name = function(i, ranks=True)
        print(name)
{'species': 'rabbit', 'genus': 'unis', 'subfamily': 'logomorphidae', 'family': 'lego', 'no rank': 'info, nothing', 'superkingdom': 'eucoryote'}
{'species': 'dog', 'genus': 'Rana', 'subfamily': 'Alphair', 'family': 'doggidae', 'no rank': 'dsDNA , no stage', 'superkingdom': 'eucaryote'}
{'species': 'duck', 'subfamily': 'duckinae', 'family': 'duckidae'}
...
As you can see, the function returns a dictionary. For IDs 1 and 2 I get six pieces of information (species, genus, subfamily, family, no rank, superkingdom); for ID 3 I only get three.
Instead of just printing the dictionary contents, I would like to add them directly to the dataframe and get:
   col1  col2  col3  ID   species  genus  subfamily      family    no rank           superkingdom
A  23    AZ    ER1   ID1  rabbit   unis   logomorphidae  lego      info, nothing     eucaryote
B  12    ZE    EZ1   ID2  dog      Rana   Alphair        doggidae  dsDNA , no stage  eucaryote
C  13    RE    RE1   ID3  duck     None   duckinae       duckidae  None              None
Do you have any idea how to do this with pandas? Thanks for your help.
Upvotes: 0
Views: 38
Reputation: 59549
Store your output in a dict of dicts, which makes it easy to create a DataFrame and join it back.
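    import pandas as pd

    # Collect one dictionary per ID: {ID: {rank: name, ...}}
    d = {}
    for i in dataframe['ID']:
        d[i] = taxid.lineage_name(i, ranks=True)
    # Outer keys become the row index; ranks missing for an ID become NaN
    dataframe.merge(pd.DataFrame.from_dict(d, orient='index'), left_on='ID', right_index=True)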
   col1  col2  col3  ID   species  genus  subfamily      family    no rank           superkingdom
A  23    AZ    ER1   ID1  rabbit   unis   logomorphidae  lego      info, nothing     eucoryote
B  12    ZE    EZ1   ID2  dog      Rana   Alphair        doggidae  dsDNA , no stage  eucaryote
C  13    RE    RE1   ID3  duck     NaN    duckinae       duckidae  NaN               NaN
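For anyone who wants to try this without the real lookup, here is a minimal self-contained sketch of the same dict-of-dicts idea. It assumes a stand-in function, `fake_lineage`, which is hypothetical and only returns the sample dictionaries from the question; swap in your own function. The sample frame is rebuilt by hand from the question, and the variable names `lineages` and `result` are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the real lookup function: it simply returns
# the sample dictionaries shown in the question.
_FAKE_LINEAGES = {
    "ID1": {"species": "rabbit", "genus": "unis", "subfamily": "logomorphidae",
            "family": "lego", "no rank": "info, nothing",
            "superkingdom": "eucoryote"},
    "ID2": {"species": "dog", "genus": "Rana", "subfamily": "Alphair",
            "family": "doggidae", "no rank": "dsDNA , no stage",
            "superkingdom": "eucaryote"},
    "ID3": {"species": "duck", "subfamily": "duckinae", "family": "duckidae"},
}


def fake_lineage(taxon_id, ranks=True):
    """Return a canned rank -> name dictionary for the given ID."""
    return _FAKE_LINEAGES[taxon_id]


# Sample frame rebuilt from the question.
dataframe = pd.DataFrame(
    {
        "col1": [23, 12, 13],
        "col2": ["AZ", "ZE", "RE"],
        "col3": ["ER1", "EZ1", "RE1"],
        "ID": ["ID1", "ID2", "ID3"],
    },
    index=["A", "B", "C"],
)

# Dict of dicts keyed by ID; unique() avoids repeated lookups for duplicate IDs.
d = {i: fake_lineage(i, ranks=True) for i in dataframe["ID"].unique()}

# Outer keys become row labels, inner keys become columns.
# Keys missing from a given dictionary are filled with NaN.
lineages = pd.DataFrame.from_dict(d, orient="index")

# how="left" keeps every original row, even if a lookup produced nothing.
result = dataframe.merge(lineages, left_on="ID", right_index=True, how="left")
print(result)
```

Using `how="left"` here is a design choice rather than part of the answer above: the default inner merge would drop any row whose ID had no entry in `d`.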
Upvotes: 1