Reputation: 893
I have been scratching my head over this. I have two data frames, df:
df <- data.frame(group = 1:3,
age = seq(30, 50, length.out = 3),
income = seq(100, 500, length.out = 3),
assets = seq(500, 800, length.out = 3))
and weights:
weights <- data.frame(age = 5, income = 10)
I would like to multiply the columns of df by the corresponding values in weights, but only where the column names match. I tried something like this:
colwise(function(x) {x * weights[names(x)]})(df)
but that obviously didn't work, as colwise does not keep the column name inside the function. I looked at various mapply solutions (example), but I am unable to come up with an answer.
The resulting data.frame should look like this:
structure(list(group = 1:3, age = c(150, 200, 250), income = c(1000,
3000, 5000), assets = c(500, 650, 800)), .Names = c("group",
"age", "income", "assets"), row.names = c(NA, -3L), class = "data.frame")
group age income assets
1 1 150 1000 500
2 2 200 3000 650
3 3 250 5000 800
Upvotes: 3
Views: 4773
Reputation: 115392
Here is a data.table solution:
library(data.table)
DT <- data.table(df)
W <- data.table(weights)
Use mapply (or Map) to calculate the new columns and add them both at once by reference.
# Parenthesise the LHS so names(W) is evaluated; `with = FALSE` is deprecated alongside `:=`
DT[, (names(W)) := Map("*", .SD, W), .SDcols = names(W)]
Upvotes: 2
Reputation: 174813
sweep() is your friend here, for this particular example. It relies upon the names in df and weights being in the right order, but that can be arranged.
> nams <- names(weights)
> df[, nams] <- sweep(df[, nams], 2, unlist(weights), "*")
> df
group age income assets
1 1 150 1000 500
2 2 200 3000 650
3 3 250 5000 800
If the variable names in weights and df are not in the same order, you can make them so:
> df2 <- data.frame(group = 1:3,
+ age = seq(30, 50, length.out = 3),
+ income = seq(100, 500, length.out = 3),
+ assets = seq(500, 800, length.out = 3))
> nams <- c("age", "income") ## order in df2
> weights2 <- weights[, rev(nams)]
> weights2 ## wrong order compared to df2
income age
1 10 5
> df2[, nams] <- sweep(df2[, nams], 2, unlist(weights2[, nams]), "*")
> df2
group age income assets
1 1 150 1000 500
2 2 200 3000 650
3 3 250 5000 800
In other words, we reorder all objects so that age and income are in the right order.
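The reordering step can be folded into a small helper that indexes both objects by the same vector of shared names, so the order in weights no longer matters. This is only a sketch: scale_by_name and weights_rev are names made up for illustration, not part of the answer above.

```r
# Hypothetical helper: multiply the columns of `data` that also appear in `w`
scale_by_name <- function(data, w) {
  nams <- intersect(names(data), names(w))  # shared names, in the order used by `data`
  # Index both objects by `nams` so each weight lines up with its column
  data[nams] <- sweep(data[nams], 2, unlist(w[nams]), "*")
  data
}

df <- data.frame(group = 1:3,
                 age = seq(30, 50, length.out = 3),
                 income = seq(100, 500, length.out = 3),
                 assets = seq(500, 800, length.out = 3))
weights_rev <- data.frame(income = 10, age = 5)  # deliberately in reverse order
out <- scale_by_name(df, weights_rev)
out
```

Columns without a matching weight (group and assets here) pass through untouched.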
Upvotes: 6
Reputation: 2397
You could also do this in a for loop, using an index built with which() and %in%. The above approach is much more efficient, but this is an alternative.
results <- list()
idx <- which(names(df) %in% names(weights))  # positions of the shared columns in df
for (i in seq_along(idx)) {
idx1 <- idx[i]
idx2 <- match(names(df)[idx1], names(weights))  # the same column, located by name in weights
results[[i]] <- df[[idx1]] * weights[[idx2]]
}
unlist(results)
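If what you want back is the modified data frame rather than one flat vector, the loop can write each product into a copy of df instead. A minimal sketch, assuming the df and weights from the question; out and nm are just illustrative names.

```r
df <- data.frame(group = 1:3,
                 age = seq(30, 50, length.out = 3),
                 income = seq(100, 500, length.out = 3),
                 assets = seq(500, 800, length.out = 3))
weights <- data.frame(age = 5, income = 10)

out <- df  # work on a copy so the original stays intact
for (nm in intersect(names(out), names(weights))) {
  # Look up both sides by name, so column order in either object is irrelevant
  out[[nm]] <- out[[nm]] * weights[[nm]]
}
out
```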
Upvotes: 0
Reputation: 193517
Your data:
df <- data.frame(group = 1:3,
age = seq(30, 50, length.out = 3),
income = seq(100, 500, length.out = 3),
assets = seq(500, 800, length.out = 3))
weights <- data.frame(age = 5, income = 10)
The logic:
# Basic name matching looks like this
names(df[names(df) %in% names(weights)])
# [1] "age" "income"
# Use that in `sapply()`
sapply(names(df[names(df) %in% names(weights)]),
function(x) df[[x]] * weights[[x]])
# age income
# [1,] 150 1000
# [2,] 200 3000
# [3,] 250 5000
The implementation:
# Put it all together, replacing the original data
df[names(df) %in% names(weights)] <- sapply(names(df[names(df) %in% names(weights)]),
function(x) df[[x]] * weights[[x]])
The result:
df
# group age income assets
# 1 1 150 1000 500
# 2 2 200 3000 650
# 3 3 250 5000 800
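Since the question mentions mapply, here is a sketch of the same name-matching idea using Map(), which returns a list of columns rather than the matrix that sapply() builds. It assumes the df and weights from the question; nams and res are illustrative names.

```r
df <- data.frame(group = 1:3,
                 age = seq(30, 50, length.out = 3),
                 income = seq(100, 500, length.out = 3),
                 assets = seq(500, 800, length.out = 3))
weights <- data.frame(age = 5, income = 10)

nams <- intersect(names(df), names(weights))  # columns present in both
res <- df
# Map() walks the two data frames column by column; indexing both by `nams`
# pairs each column with the weight of the same name
res[nams] <- Map("*", df[nams], weights[nams])
res
```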
Upvotes: 3
Reputation: 44614
Someone might have a slick way to do it with plyr, but this is probably the most straightforward way in base R.
shared.names <- intersect(names(df), names(weights))
cols <- sapply(names(df), USE.NAMES=TRUE, simplify=FALSE, FUN=function(name)
if (name %in% shared.names) df[[name]] * weights[[name]] else df[[name]])
data.frame(do.call(cbind, cols))
# group age income assets
# 1 1 150 1000 500
# 2 2 200 3000 650
# 3 3 250 5000 800
Upvotes: 4