Reputation: 103
I have two dataframes of different lengths and with different columns, but they share one column containing the same identifying data. They look like this:
observations DF:
index | scientific_name | park_name | observations |
---|---|---|---|
0 | name1 | park1 | 10 |
1 | name2 | park2 | 12 |
species DF:
index | scientific_name | common_names | category |
---|---|---|---|
0 | name1 | name1,name2 | Mammal |
1 | name2 | name1,name2 | Vascular plant |
I am trying to create a new column in the observations DF called 'category', filled with data based on the scientific_name values shared between both tables. I've tried pd.merge, but it doesn't fill the category column the way I want, and neither does concat. When I tried a list comprehension, it raised a ValueError too. Any thoughts?
I tried using a list comprehension like so:
observations['category'] = [el for el in species['category'] if observations['scientific_name'] == species['scientific_name']]
This results in an error.
Upvotes: 1
Views: 47
Reputation: 1770
If you only want to add the "category" column from species to observations, based on the shared column "scientific_name", this should work:
observations = pd.merge(observations, species[['scientific_name', 'category']])
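One thing to watch: pd.merge defaults to an inner join, so any row in observations whose scientific_name has no match in species is dropped. If that is not what you want, a left join keeps every observation and leaves the category as NaN where there is no match. Below is a minimal sketch of that, plus a map-based alternative. The sample data (including the extra "name3" row) is made up for illustration and only modeled on the tables in the question.

```python
import pandas as pd

# Hypothetical sample data modeled on the tables in the question.
# "name3" is an invented row with no match in species.
observations = pd.DataFrame({
    "scientific_name": ["name1", "name2", "name1", "name3"],
    "park_name": ["park1", "park2", "park2", "park1"],
    "observations": [10, 12, 7, 3],
})
species = pd.DataFrame({
    "scientific_name": ["name1", "name2"],
    "common_names": ["name1,name2", "name1,name2"],
    "category": ["Mammal", "Vascular plant"],
})

# Left merge: keep every observation row, even when no species matches.
merged = observations.merge(
    species[["scientific_name", "category"]],
    on="scientific_name",
    how="left",
)

# Alternative: build a name -> category lookup and map it onto the column.
# drop_duplicates guards against repeated names in species, which would
# otherwise make the lookup index non-unique.
lookup = (
    species.drop_duplicates("scientific_name")
    .set_index("scientific_name")["category"]
)
observations["category"] = observations["scientific_name"].map(lookup)
```

If species contains the same scientific_name more than once, the merge will repeat the matching observation rows, so it may be worth de-duplicating species first, as the map version does.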
Upvotes: 1