Kayla

Reputation: 59

How to avoid repeating values when merging data frames?

I have two data frames:

df1:
  Year      Name1  Value1
1 1967     Fallow 47.2730
2 1967 Cultivated 52.5090
3 1967  Grassland 57.5399
4 1967  Shrubland 61.3711
5 1967   Woodland 62.1911
6 1960   Fallow-w 42.2146

and df2:
  Year      Name2  Value2
1 1967     Fallow 47.2718
2 1967 Cultivated 52.4988
3 1967  Grassland 56.8066
4 1967  Shrubland 59.3636
5 1967   Woodland 56.3803
6 1967   Fallow-w 42.1898

I want to merge the two data frames; they are the same except for the values. I did this:

df_all = merge(df1, df2, by = 'Year') 

but it just repeats the row values from one data frame, and the result looks like this:

  Year    Name1 Value1      Name2 Value2
1 1967  Fallow  47.273     Fallow 47.2718
2 1967  Fallow  47.273 Cultivated 52.4988
3 1967  Fallow  47.273  Grassland 56.8066
4 1967  Fallow  47.273  Shrubland 59.3636
5 1967  Fallow  47.273   Woodland 56.3803
6 1967  Fallow  47.273   Fallow-w 42.1898

What am I doing wrong?

Upvotes: 0

Views: 728

Answers (2)

alb_alb

Reputation: 58

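If both data frames are exactly the same except for the values, you can try:

df_all = merge(df1, df2, by = c("Year", "Name"))

Note that this only works if the name column has the same name in both data frames (Name above). With Name1 and Name2 as in the question, either rename them to a common name first or pass by.x and by.y to merge.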
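For the column names as they appear in the question (Name1 and Name2), a minimal base R sketch could look like the following. The data frames are rebuilt by hand from the printed tables, so the exact column types are an assumption.

```r
# Rebuild the two data frames from the question (typed in from the printout).
landcover <- c("Fallow", "Cultivated", "Grassland",
               "Shrubland", "Woodland", "Fallow-w")

df1 <- data.frame(
  Year   = c(1967, 1967, 1967, 1967, 1967, 1960),
  Name1  = landcover,
  Value1 = c(47.2730, 52.5090, 57.5399, 61.3711, 62.1911, 42.2146)
)

df2 <- data.frame(
  Year   = rep(1967, 6),
  Name2  = landcover,
  Value2 = c(47.2718, 52.4988, 56.8066, 59.3636, 56.3803, 42.1898)
)

# Match on Year AND on the name column, which is called Name1 in df1
# and Name2 in df2. The key columns keep the names used in df1.
df_all <- merge(df1, df2,
                by.x = c("Year", "Name1"),
                by.y = c("Year", "Name2"))

print(df_all)
```

As printed in the question, the Fallow-w row in df1 has Year 1960 while df2 has 1967, so this inner merge returns only the five rows that match on both keys. Adding all.x = TRUE would keep the unmatched df1 row with NA in Value2.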

Upvotes: 1

L. South

Reputation: 151

If I understand your question correctly, I think you need a join, for example left_join from the dplyr package:

left_join(df1, df2, by = c('Year', 'Name1' = 'Name2'))

This matches rows by both Year and name. Your merge joins only on Year, not on the name, so every df1 row for a given year is paired with every df2 row for that year, which is why the values are repeated.
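To make the difference concrete, here is a small sketch that compares the row counts of the Year-only merge with the two-key join. It assumes the dplyr package is installed, and the data frames are retyped from the question's printout.

```r
library(dplyr)

# Data frames retyped from the question.
landcover <- c("Fallow", "Cultivated", "Grassland",
               "Shrubland", "Woodland", "Fallow-w")

df1 <- data.frame(
  Year   = c(1967, 1967, 1967, 1967, 1967, 1960),
  Name1  = landcover,
  Value1 = c(47.2730, 52.5090, 57.5399, 61.3711, 62.1911, 42.2146)
)

df2 <- data.frame(
  Year   = rep(1967, 6),
  Name2  = landcover,
  Value2 = c(47.2718, 52.4988, 56.8066, 59.3636, 56.3803, 42.1898)
)

# Joining on Year alone pairs every 1967 row of df1 with every
# 1967 row of df2, so the result has far more rows than either input.
year_only <- merge(df1, df2, by = "Year")
nrow(year_only)

# Joining on Year and name keeps one row per df1 row; df1 rows with
# no partner in df2 are kept with NA in Value2.
both_keys <- left_join(df1, df2, by = c("Year", "Name1" = "Name2"))
nrow(both_keys)
print(both_keys)
```

With the data as printed, the Year-only merge yields 5 x 6 = 30 rows, while the left join returns the six rows of df1, with NA in Value2 for the 1960 row because df2 has no 1960 entry.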

Upvotes: 1
