xxx33xxx

Reputation: 75

Merge two data sets by character/factor values; keep smaller data set

I have a data set A with a column of character values (factors), and each value appears multiple times. I also have a cleaned duplicate of that set (A') with fewer variables and observations. I am trying to merge them so that only the rows (records) of the smaller set A' are kept.

I already tried a right join, but I run into problems because I'm operating on character values.

Info<-c("x","x","x", "y","y","y","z","z","z")
More_info<-c("A", "A","A", "B", "B", "B", "C", "C","C")
Group_A<-cbind(Info, More_info)


vec1<-c("A","B","C")
vec2<-c("one","two","three")
Group_B<-cbind(vec1, vec2)
names(Group_B)<-c("More_Info", "Extra_Info")
x<-right_join(Group_A, Group_B, by= "More_Info")

What I get is: Error in UseMethod("right_join") : no applicable method for 'right_join' applied to an object of class "c('matrix', 'character')"

What I need:

-x-
Info More_Info
A    X
B    Y
C    Z

Upvotes: 1

Views: 174

Answers (1)

Alp Aribal

Reputation: 370

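You can use merge from base R. cbind on character vectors returns a character matrix, not a data frame, which is why right_join has no method for it. merge converts both arguments to data frames before joining.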

Info <- c("x","x","x", "y","y","y","z","z","z")
More_info <- c("A", "A","A", "B", "B", "B", "C", "C","C")
Group_A <- cbind(Info, More_info)

vec1 <- c("A","B","C")
vec2 <- c("one","two","three")
Group_B <- cbind(vec1, vec2)

# Use colnames to change column names, not names
# Also the 'i' in 'More_Info' should be lower case
colnames(Group_B) <- c("More_info", "Extra_Info")

# Keep only the unique rows of A, then keep every row of B (a right join)
merge(unique(Group_A), Group_B, by = "More_info", all.x = FALSE, all.y = TRUE)
#>   More_info Info Extra_Info
#> 1         A    x        one
#> 2         B    y        two
#> 3         C    z      three
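
If you would rather avoid the matrix step altogether, one option is to build the inputs with data.frame instead of cbind. The following is only a sketch under that assumption, reusing the example values from the question. The names result and result_dplyr are placeholders of mine, and the dplyr call runs only if that package happens to be installed.

```r
# Build data frames directly; cbind on character vectors would give a matrix
Group_A <- data.frame(
  Info      = c("x", "x", "x", "y", "y", "y", "z", "z", "z"),
  More_info = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  stringsAsFactors = FALSE
)

Group_B <- data.frame(
  More_info  = c("A", "B", "C"),
  Extra_Info = c("one", "two", "three"),
  stringsAsFactors = FALSE
)

# Drop duplicated rows of A, then keep every row of B (right join in base R)
result <- merge(unique(Group_A), Group_B, by = "More_info", all.y = TRUE)

# Equivalent join with dplyr, guarded in case the package is not installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  result_dplyr <- dplyr::right_join(unique(Group_A), Group_B, by = "More_info")
}
```

With data frames on both sides, right_join should dispatch without the UseMethod error, because that error came from passing it a matrix.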

Upvotes: 1
