Reputation: 93
I've read some threads about the apply functions, but I'm still struggling to use them. I want to generate a dummy variable in a data frame that takes the value 1 if the observation's combination of two variable values also exists in an observation of another data frame.
Creation of the two data frames:
df1 <- data.frame(c("A","C","E","F"),
c(17,24,5,8))
names(df1)[1] <- "Apple"
names(df1)[2] <- "Orange"
df1$Apple <- as.character(df1$Apple)
df1$Banana <- 0
df2 <- data.frame(c("Q","A","C","E"),
c(8,303,24,17))
names(df2)[1] <- "Tomato"
names(df2)[2] <- "Cucumber"
df2$Tomato <- as.character(df2$Tomato)
The only observation that exists in both data frames is ("C", 24), which is in row 2 of df1 and row 3 of df2. I can extract this information with a for loop that, for each row of df1, subsets df2 to the rows whose first variable matches and then checks whether the value of the second variable occurs in that subset:
for(idx in 1:4){
df3 <- subset(df2, Tomato == df1$Apple[idx])
df1$Banana[idx] <- df1$Orange[idx] %in% df3$Cucumber
}
which leads to the desired result:
> df1
  Apple Orange Banana
1     A     17      0
2     C     24      1
3     E      5      0
4     F      8      0
However, I'm not able to achieve the same result with the apply function:
Banana <- function(){
df3 <- subset(df2, Tomato == df1$Apple)
df1$Orange %in% df3$Cucumber
}
apply(X = df1, MARGIN = 1, FUN = Banana)
Instead I get the following error message:
Error in FUN(newX[, i], ...) : unused argument (newX[, i])
Does anyone know what I'm doing wrong here and how to use the function correctly?
Upvotes: 3
Views: 277
Reputation: 388817
One way using apply is to iterate over df1 row-wise, check whether any row of df2 has Tomato equal to the first value and Cucumber equal to the second value, and assign an integer value accordingly.
df1$Banana <- as.integer(apply(df1, 1, function(x)
any(x[1] == df2$Tomato & x[2] == df2$Cucumber)))
df1
#   Apple Orange Banana
#1     A     17      0
#2     C     24      1
#3     E      5      0
#4     F      8      0
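As for the error in the question: apply calls FUN with each row as its first argument, and Banana was defined without any parameters, hence the "unused argument" message. Below is a minimal sketch of a version that accepts the row. The names banana_row, banana_pair and banana_alt are made up for illustration. Note that apply converts the data frame to a matrix first, so with a character column present the numbers arrive as text (possibly padded, e.g. " 5"). The sketch converts them back with as.numeric, and the mapply variant avoids the conversion altogether.

```r
# Data from the question
df1 <- data.frame(Apple = c("A", "C", "E", "F"),
                  Orange = c(17, 24, 5, 8),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Tomato = c("Q", "A", "C", "E"),
                  Cucumber = c(8, 303, 24, 17),
                  stringsAsFactors = FALSE)

# apply() calls FUN(<row>, ...), so the function needs a parameter for the row.
# 'obs' is a named character vector because apply() works on a matrix.
banana_row <- function(obs) {
  df3 <- subset(df2, Tomato == obs[["Apple"]])
  # Convert the text back to a number before comparing with Cucumber
  as.numeric(obs[["Orange"]]) %in% df3$Cucumber
}
df1$Banana <- as.integer(apply(df1[c("Apple", "Orange")], 1, banana_row))

# Alternative: walk both columns in parallel, so the numbers stay numeric.
banana_pair <- function(apple, orange) {
  any(df2$Tomato == apple & df2$Cucumber == orange)
}
banana_alt <- as.integer(mapply(banana_pair, df1$Apple, df1$Orange))
```

Both df1$Banana and banana_alt should flag only the second row, matching the for-loop result from the question.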
Upvotes: 1