Duplicate Columns when using the merge statement

Question

When I try to merge some codes to the code descriptions I get 2 duplicate columns. I start out with this: Table Name: Test

ID     State
1      5
2      2
3      5

and want to merge it with this: Table Name: statecode

StateID State
5       Mass
2       NY

to make a table like this:

ID    State
1     Mass 
2     NY
3     Mass

However, I get a table like this:

ID    State    State
1     5        Mass
2     2        NY
3     5        Mass

I used the merge command like this:

test = merge(x = test, y = statecode, by.x = "State", by.y = "StateID", all.x = T)

Is there a better function other than merge to use in this case? Maybe one to just replace the state code with the state name?

Thank you very much for the help!

Carlos Cinelli · Accepted Answer

You do have to say which column you want to drop, but you can express it concisely using dplyr, for example.

Generating sample data based on yours (but correcting the column names):

test <- read.table(text =
"ID    StateID
1      5
2      2
3      5", header = TRUE)

statecode <- read.table(text =
"
StateID       State
5          Mass
2          NY", header = TRUE)

Using dplyr:

library(dplyr)
test %>% left_join(statecode, by = "StateID") %>% select(-StateID)
  ID State
1  1  Mass
2  2    NY
3  3  Mass

Duplicate Columns when using the merge statement

Answers (2)

Related Questions