Cansu_Mac
Cansu_Mac

Reputation: 3

Combining two pandas dataframes based on column AND row VALUES

First, I haven't found this asked before - probably because I'm not using the right words to ask it. So if it has been asked, please send me in that direction.

How can I combine two pandas data frames based on column AND row. My main dataframe has a column 'years' and a column 'county' among others. Ideally, I want to add another column 'percent' from the second data frame below.

For example, I have this image of my first df:

main df

and I have another data frame with the same 'year' column and every other column name is a string value in the original "main" dataframe's 'county' column:

second df

How can I combine these two data frames in a way that adds another column to the 'main df'? It would be helpful to first put the second data frame in the format where there are three columns: 'year', 'county', and 'percent'. If anyone can help me with this part, I can merge it.

Upvotes: 0

Views: 70

Answers (1)

benji
benji

Reputation: 386

I think what you will want to do is transform the second dataframe to have a row for each year/county combination and then you can use a left join to combine the two. I believe the ```melt`` method will do this transformation. Try this:

melted_second_df = second_df.melt(id_vars=["year"], var_name="county", value_name="percent")
combined_df = first_df.merge(
    right=melted_second_df,
    on=["year", "county"],
    how="left"
)

Upvotes: 1

Related Questions