Reputation: 45
I am still taking my first steps with R and have found SO to be a great tool for learning more and finding answers to my questions. For this one, though, I did not manage to find a good solution here.
I have a dataframe that can be simplified to this structure:
set.seed(10)
df <- data.frame(v1 = rep(1:2, times = 3),
                 v2 = c("A", "B", "B", "A", "B", "A"),
                 v3 = sample(1:6),
                 xA_1 = sample(1:6),
                 xA_2 = sample(1:6),
                 xB_1 = sample(1:6),
                 xB_2 = sample(1:6))
df thus looks like this:
> df
v1 v2 v3 xA_1 xA_2 xB_1 xB_2
1 1 A 4 2 1 3 3
2 2 B 2 6 3 5 4
3 1 B 5 3 2 4 5
4 2 A 3 5 4 2 1
5 1 B 1 4 6 6 2
6 2 A 6 1 5 1 6
I now want R to create a fourth variable, which is dependent on the values of v1 and v2. I achieve this by using the following code:
library(data.table)
df <- data.table(df)
df[, v4 := ifelse(v1 == 1 & v2 == "A", v3 * xA_1,
           ifelse(v1 == 1 & v2 == "B", v3 * xB_1,
           ifelse(v1 == 2 & v2 == "A", v3 * xA_2,
           ifelse(v1 == 2 & v2 == "B", v3 * xB_2, v3 * 1))))]
So v4 is created by multiplying v3 by the column whose name contains the row's v2 and v1 values (e.g. for row 1: v1 = 1 and v2 = A, so multiply v3 = 4 by xA_1 = 2, which gives 8).
> df$v4
[1] 8 8 20 12 6 30
Obviously, my ifelse approach is tedious when v1 and v2 in fact have many more different values than they have in this example. So I am looking for an efficient way to tell R: if v1 == y & v2 == z, multiply v3 by column xz_y.
I tried writing a for-loop, writing a function that takes y and z as indices, and using the apply function. However, none of these worked as intended.
I appreciate any ideas!
Upvotes: 3
Views: 71
Reputation: 70256
Here's a base R option:
i <- paste0("x", df$v2, "_", df$v1)
df$v4 <- df$v3 * as.numeric(df[cbind(1:nrow(df), match(i, names(df)))])
For the sample data provided below, it creates a column v4 as:
> df$v4
[1] 25 12 2 6 3 10
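In case the indexing step looks opaque: the key idea is that a two-column matrix of (row, column) positions pulls out exactly one cell per row. Here is a minimal sketch on a small made-up matrix (m, wanted, idx and picked are illustrative names, not part of the question's data):

```r
# Small integer matrix standing in for the x-columns (made-up values)
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("xA_1", "xB_1")))

# Name of the column we want for each of the three rows
wanted <- c("xB_1", "xA_1", "xB_1")

# Two-column index matrix: first column = row number, second = column position
idx <- cbind(seq_len(nrow(m)), match(wanted, colnames(m)))

# Indexing with a matrix returns one element per (row, column) pair
picked <- m[idx]
picked
# [1] 4 2 6
```

On a data.frame with a non-numeric column (such as v2), the same indexing goes through a character matrix, which is why the answer wraps the result in as.numeric().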
Or if you want to include the "else" condition to multiply by 1 in case there's no matching column name:
i <- paste0("x", df$v2, "_", df$v1)
tmp <- as.numeric(df[cbind(1:nrow(df), match(i, names(df)))])
df$v4 <- df$v3 * ifelse(is.na(tmp), 1, tmp)
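To see the fallback in action, here is a small sketch with a row whose constructed column name does not exist. The data frame df2 and the value "C" are made up for illustration:

```r
# Toy data: the second row points at "xC_1", which is not a column of df2
df2 <- data.frame(v1 = c(1L, 1L), v2 = c("A", "C"), v3 = c(4L, 5L),
                  xA_1 = c(2L, 3L))

i <- paste0("x", df2$v2, "_", df2$v1)   # "xA_1" "xC_1"
pos <- match(i, names(df2))             # 4 NA: no match gives NA

# A row of the index matrix containing NA yields NA in the result
tmp <- as.numeric(df2[cbind(seq_len(nrow(df2)), pos)])

# Replace the NA multiplier by 1, so v3 is kept unchanged for that row
df2$v4 <- df2$v3 * ifelse(is.na(tmp), 1, tmp)
df2$v4
# [1] 8 5
```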
Sample data:
df <- structure(list(v1 = c(1L, 2L, 1L, 2L, 1L, 2L), v2 = structure(c(1L,
2L, 2L, 1L, 2L, 1L), .Label = c("A", "B"), class = "factor"),
v3 = c(5L, 4L, 1L, 6L, 3L, 2L), xA_1 = c(5L, 6L, 3L, 1L,
2L, 4L), xA_2 = c(6L, 4L, 2L, 1L, 3L, 5L), xB_1 = c(4L, 6L,
2L, 5L, 1L, 3L), xB_2 = c(5L, 3L, 2L, 4L, 1L, 6L)), .Names = c("v1",
"v2", "v3", "xA_1", "xA_2", "xB_1", "xB_2"), row.names = c(NA,
-6L), class = "data.frame")
Upvotes: 2
Reputation: 49448
This is a standard "wide" table problem - what you want is harder to do as-is, but easy when the data is "melted":
dt = as.data.table(df)
melt(dt, id.vars = c('v1', 'v2', 'v3'))[variable == paste0('x', v2, '_', v1)
][dt, on = c('v1', 'v2', 'v3'), v3 * value]
#[1] 8 8 20 12 6 30
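The chain above can be unpacked step by step. The sketch below is a variation, not the exact code above: it assumes the combination of v1, v2 and v3 might not be unique in real data, so it joins on a helper row id instead (id, long and hit are names introduced here for illustration), and it uses the first three rows of the question's data:

```r
library(data.table)

# First three rows of the question's data
dt <- data.table(v1 = c(1L, 2L, 1L), v2 = c("A", "B", "B"), v3 = c(4L, 2L, 5L),
                 xA_1 = c(2L, 6L, 3L), xA_2 = c(1L, 3L, 2L),
                 xB_1 = c(3L, 5L, 4L), xB_2 = c(3L, 4L, 5L))

# Helper row id so each original row can be matched back unambiguously
dt[, id := .I]

# Wide -> long: one row per (original row, x-column) pair
long <- melt(dt, id.vars = c("id", "v1", "v2", "v3"), variable.factor = FALSE)

# Keep only the x-column whose name matches this row's v2 and v1
hit <- long[variable == paste0("x", v2, "_", v1)]

# Update join: write v4 back into dt by id (i.value is the column from hit)
dt[hit, on = "id", v4 := v3 * i.value]
dt$v4
# [1]  8  8 20
```

Rows without a matching column would simply keep NA in v4 here; whether that or a multiplier of 1 is preferable depends on the use case.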
Upvotes: 2
Reputation: 36
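You can try this:
# Preallocate the result instead of growing it inside the loop
v4 <- numeric(nrow(df))
for (i in seq_len(nrow(df))) {
  # Build the column name for this row, e.g. "xA_1"
  col <- paste0("x", df$v2[i], "_", df$v1[i])
  # df must be a plain data.frame for df[i, col] to return the cell value
  v4[i] <- df$v3[i] * df[i, col]
}
df$v4 <- v4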
Upvotes: 0