dper

Reputation: 904

Combine two dataframes in pandas

I have two dataframes:

df:

portfolio  symbol  id  var1  var2  var3 

df1:

symbol  sector  market  count 

I want to add the columns sector and market from df1 to df. df1 has unique values for symbol, so it is smaller than df, which is the original dataframe.

I tried:

pd.merge(df,df1,on='symbol',how='outer')

But the output has more rows than I expected. Can anyone help me understand what I am missing here?

Thanks

Upvotes: 1

Views: 78

Answers (3)

dper

Reputation: 904

My apologies, I didn't realise that an outer join also creates rows for values in the second dataframe that are not present in the first. That is why I was getting extra rows. To remove them, I added df7 = df.dropna(subset=['symbol'])
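To make the extra rows visible, here is a minimal sketch with made-up data (the tickers, sector names and the variable names outer, extra and left are illustrative assumptions, not taken from the question). It uses the indicator option of pd.merge to flag rows that exist only in df1, and shows a left join as one possible way to keep exactly the rows of df:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "portfolio": ["A", "A", "B", "B"],
    "symbol": ["AAPL", "MSFT", "AAPL", "XOM"],
    "id": [1, 2, 3, 4],
})
df1 = pd.DataFrame({
    "symbol": ["AAPL", "MSFT", "XOM", "TSLA"],  # TSLA never appears in df
    "sector": ["Tech", "Tech", "Energy", "Auto"],
    "market": ["NASDAQ", "NASDAQ", "NYSE", "NASDAQ"],
    "count": [2, 1, 1, 0],
})

# Outer join: keeps every symbol from both frames, so TSLA adds a row.
outer = pd.merge(df, df1, on="symbol", how="outer", indicator=True)

# Rows that came only from df1 are marked "right_only" in the _merge column.
extra = outer[outer["_merge"] == "right_only"]

# Left join: keeps the rows of df and only looks up sector and market.
left = pd.merge(df, df1[["symbol", "sector", "market"]], on="symbol", how="left")

print(len(df), len(outer), len(left))
```

With this data the outer result has one more row than df (the TSLA row), while the left join has the same number of rows as df.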

Upvotes: 1

Dnorious

Reputation: 55

An outer join keeps every symbol from both dataframes, so the result has all the rows of df plus one extra row for each symbol that appears only in df1. If you only want rows whose symbol appears in both dataframes, you should use an inner join.

Upvotes: 1

NYC Coder

Reputation: 7594

Have you tried doing an inner join?

df.merge(df1, on='symbol', how='inner')
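One thing to watch for, sketched here with made-up data (the tickers and the placeholder symbol ZZZ are assumptions for illustration): an inner join drops any row of df whose symbol has no match in df1, whereas a left join keeps it with missing values. The validate argument is an optional guard that raises an error if symbol is not unique in df1.

```python
import pandas as pd

# Hypothetical sample data; "ZZZ" has no match in df1.
df = pd.DataFrame({
    "portfolio": ["A", "A", "B"],
    "symbol": ["AAPL", "MSFT", "ZZZ"],
    "id": [1, 2, 3],
})
df1 = pd.DataFrame({
    "symbol": ["AAPL", "MSFT", "TSLA"],
    "sector": ["Tech", "Tech", "Auto"],
    "market": ["NASDAQ", "NASDAQ", "NASDAQ"],
})

# Inner join: only symbols present in both frames survive.
inner = df.merge(df1, on="symbol", how="inner", validate="many_to_one")

# Left join: every row of df survives; unmatched rows get NaN.
left = df.merge(df1, on="symbol", how="left", validate="many_to_one")

print(len(inner), len(left))
```

If every symbol in df also appears in df1, the two joins return the same rows, so the choice only matters when df contains unmatched symbols.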

Upvotes: 2
