yPennylane

Reputation: 772

R: Fill column of a data frame with values of differently sized data frame

Consider the following data frames:

df <- data.frame(x = c("A", "A", "A", "B", "C", "C"),
                 y = c("abl", "rtg", "jaf", "rlt", "thk", "lpv"))

z <- c(rep("abl", 4), rep("rtg", 2), rep("jaf", 1), rep("zfw", 3), "thk")
dat <- data.frame(z = z, group = rep(NA, length(z)))

I want to fill dat$group with the value of df$x from the row of df where df$y matches dat$z. The final data frame should look like this:

 z group
abl     A
abl     A
abl     A
abl     A
rtg     A
rtg     A
jaf     A
zfw    NA
zfw    NA
zfw    NA
thk     C

I just can't figure out how to do this.

The code that I tried so far:

dat$group[which(dat$z == df$y)] <- df$x[which(df$y == dat$z)]
dat$group[which(dat$z %in% df$y)] <- df$x[which(df$y %in% dat$z)]

It throws an error and does not produce the desired result. How can I get the final data frame?

Upvotes: 0

Views: 36

Answers (2)

divibisan

Reputation: 12165

What you're trying to do is a join operation:

dplyr::left_join(dat, df, by = c('z' = 'y'))

     z group    x
1  abl    NA    A
2  abl    NA    A
3  abl    NA    A
4  abl    NA    A
5  rtg    NA    A
6  rtg    NA    A
7  jaf    NA    A
8  zfw    NA <NA>
9  zfw    NA <NA>
10 zfw    NA <NA>
11 thk    NA    C

The linked duplicate covers several different strategies, but I think it is helpful to know the appropriate term for this kind of operation.
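The join above keeps the empty group column and adds x beside it. To end up with the exact layout from the question, one option is to drop the placeholder column before joining and rename x afterwards. This is only a sketch: it assumes dplyr is installed, and the name result is made up for illustration.

```r
library(dplyr)  # assumption: dplyr is installed

df <- data.frame(x = c("A", "A", "A", "B", "C", "C"),
                 y = c("abl", "rtg", "jaf", "rlt", "thk", "lpv"))
z <- c(rep("abl", 4), rep("rtg", 2), rep("jaf", 1), rep("zfw", 3), "thk")
dat <- data.frame(z = z, group = rep(NA, length(z)))

# "result" is a hypothetical name, not part of the question
result <- dat %>%
  select(-group) %>%                     # drop the all-NA placeholder column
  left_join(df, by = c("z" = "y")) %>%   # look up x for every value of z
  rename(group = x)                      # call the looked-up column "group"

result
```

Because every value in df$y is unique here, the join returns exactly one row per row of dat. If df$y contained duplicates, the matching rows of dat would be repeated.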

Upvotes: 1

Chris Ruehlemann

Reputation: 21432

A simple base R solution is to use match:

dat$group <- df$x[match(dat$z, df$y)]
dat
     z group
1  abl     A
2  abl     A
3  abl     A
4  abl     A
5  rtg     A
6  rtg     A
7  jaf     A
8  zfw  <NA>
9  zfw  <NA>
10 zfw  <NA>
11 thk     C
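To see why this works, it may help to split the one-liner into two steps. match returns, for each element of dat$z, the position of its first match in df$y, or NA if there is none. By contrast, == compares the two vectors position by position and recycles the shorter one, so it does not perform a lookup. The name idx below is only for illustration.

```r
df <- data.frame(x = c("A", "A", "A", "B", "C", "C"),
                 y = c("abl", "rtg", "jaf", "rlt", "thk", "lpv"))
z <- c(rep("abl", 4), rep("rtg", 2), rep("jaf", 1), rep("zfw", 3), "thk")
dat <- data.frame(z = z, group = rep(NA, length(z)))

# Step 1: for each value of dat$z, find its row number in df (NA if absent)
idx <- match(dat$z, df$y)  # "idx" is a hypothetical helper name
idx
# 1 1 1 1 2 2 3 NA NA NA 5

# Step 2: use those row numbers to pull the corresponding x values;
# indexing with NA yields NA, which covers the unmatched "zfw" rows
dat$group <- df$x[idx]
```

Note that match only returns the first match, so if df$y contained duplicates, only the first matching row of df would be used.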

Upvotes: 1
