Reputation: 887
I'm trying to merge data from two different data frames.
Here are my two data frames.
I have a set of client data in x with an Initials column that I inserted manually, and another data frame called y with only the ID and Initials.
x has 2959 observations and y has 978 observations, so I don't have initials for all of the clients in x, but the ones I do have are in y. Some of the initials in y are NA as well.
I want to create a new data frame that has all 2959 observations, with the initials filled in for the clients whose initials are in y. Clients who are not in y should still appear in the final result, just with NA for their initials.
x
ID Name Initials AGE
123 Mike NA 18
124 John NA 20
125 Lily NA 21
126 Jasper NA 24
127 Toby NA 27
128 Will NA 19
129 Oscar NA 32
y
ID Initials
123 MC
126 TR
127 WO
129 NA
Here is my desired output
ID Name Initials AGE
123 Mike MC 18
124 John NA 20
125 Lily NA 21
126 Jasper TR 24
127 Toby WO 27
128 Will NA 19
129 Oscar NA 32
I tried this, but the output only has 878 observations.
merge_data <- merge(x, y, by = "ID")
Upvotes: 1
Views: 33
Reputation: 887901
We can use left_join in dplyr:
library(dplyr)
left_join(x %>% select(-Initials), y, by = 'ID')
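As a minimal sketch on the sample data from the question (the two data frames are retyped by hand here, and the `relocate()` step is an addition that assumes dplyr 1.0.0 or later), the join could look like this:

```r
library(dplyr)

# Sample data retyped from the question
x <- data.frame(
  ID = 123:129,
  Name = c("Mike", "John", "Lily", "Jasper", "Toby", "Will", "Oscar"),
  Initials = NA_character_,
  AGE = c(18, 20, 21, 24, 27, 19, 32)
)
y <- data.frame(
  ID = c(123L, 126L, 127L, 129L),
  Initials = c("MC", "TR", "WO", NA)
)

result <- x %>%
  select(-Initials) %>%               # drop the empty placeholder column
  left_join(y, by = "ID") %>%         # keep every row of x, matched or not
  relocate(Initials, .after = Name)   # put Initials back in its original position

nrow(result)  # same as nrow(x), because the IDs in y are unique
```

Note that the row count only stays equal to `nrow(x)` when each ID appears at most once in y; duplicated IDs in y would duplicate the matching rows of x.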
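In base R, merge returns an inner join by default, which keeps only the IDs present in both data frames. If we need a left join, specify all.x = TRUE. As in the dplyr version, drop the empty Initials column from x first, otherwise the result contains both Initials.x and Initials.y.
merge(x[setdiff(names(x), "Initials")], y, all.x = TRUE, by = 'ID')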
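To illustrate the difference between the two join types, here is a small base R sketch on the sample data from the question (retyped by hand). The `match()` lookup at the end is an alternative that is not part of the answer above; it fills the existing column in place and leaves the column order of x untouched.

```r
# Sample data retyped from the question
x <- data.frame(
  ID = 123:129,
  Name = c("Mike", "John", "Lily", "Jasper", "Toby", "Will", "Oscar"),
  Initials = NA_character_,
  AGE = c(18, 20, 21, 24, 27, 19, 32)
)
y <- data.frame(
  ID = c(123L, 126L, 127L, 129L),
  Initials = c("MC", "TR", "WO", NA)
)

# x without its empty Initials column, so merge() does not create
# Initials.x and Initials.y
x_no_init <- x[setdiff(names(x), "Initials")]

inner <- merge(x_no_init, y, by = "ID")                # only IDs found in both
left  <- merge(x_no_init, y, by = "ID", all.x = TRUE)  # every row of x

nrow(inner)  # 4
nrow(left)   # 7

# Alternative (an assumption, not from the answer): look up each ID of x in y
filled <- x
filled$Initials <- y$Initials[match(filled$ID, y$ID)]
```

With `merge()`, the Initials column ends up last; the `match()` version keeps the columns in the order ID, Name, Initials, AGE.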
Upvotes: 1