Reputation: 55
Let's say I have a vector of Betas (the coefficients of a regression, without the intercept) like this:
> ResultDos$coefficients[-1]
Its class is "numeric".
I also have a data frame that contains those columns plus some more. The column names vary from run to run, which is why I can't hard-code the multiplication.
> head(OutputData,10)
Date OutputTrainData.Dependent
1 2013-01-01 22:00:00 -18
2 2013-01-02 22:00:00 -137
3 2013-01-03 22:00:00 20
4 2013-01-04 22:00:00 48
5 2013-01-07 22:00:00 -36
6 2013-01-08 22:00:00 -17
7 2013-01-09 22:00:00 208
8 2013-01-10 22:00:00 71
9 2013-01-11 22:00:00 39
10 2013-01-14 22:00:00 -76
1 0.4179244
2 0.4179244
3 0.4179244
4 0.4179244
5 0.4179244
6 0.4179244
7 0.4179244
8 0.4179244
9 0.4179244
10 0.4179244
WOE_RetailSalesExAutoMoM.WOE_RetailSalesExAutoMoM WOE_ChangeinHouseholdEmployment
1 0.6000675 -0.8284745
2 0.6000675 -0.8284745
3 0.6000675 -0.8284745
4 0.6000675 0.3242050
5 0.6000675 0.3242050
6 0.6000675 0.3242050
7 0.6000675 0.3242050
8 0.6000675 0.3242050
9 0.6000675 0.3242050
10 0.6000675 0.3242050
WOE_ExistingHomeSalesMoM WOE_PPIExFoodEnergyTradeMoM WOE_PPIFinalDemandYoY WOE_PPIMoM
1 0.6464707 -0.0820543 0.371575 -0.82847453
2 0.6464707 -0.0820543 0.371575 -0.82847453
3 0.6464707 -0.0820543 0.371575 -0.47707664
4 0.6464707 -0.0820543 0.371575 -0.16578655
5 0.6464707 -0.0820543 0.371575 -0.47707664
6 0.6464707 -0.0820543 0.371575 0.09306556
7 0.6464707 -0.0820543 0.371575 0.09306556
8 0.6464707 -0.0820543 0.371575 0.09306556
9 0.6464707 -0.0820543 0.371575 -0.20432022
10 0.6464707 -0.0820543 0.371575 -0.20432022
1 -0.0530457
2 -0.0530457
3 -0.0530457
4 -0.0530457
5 -0.0530457
6 -0.0530457
7 -0.0530457
8 -0.0530457
9 -0.0530457
10 -0.0530457
WOE_ManufacturingSICProduction.WOE_RetailSalesExAutoMoM WOE_PPIMoM.WOE_RetailSalesExAutoMoM
1 -0.4889554 0.64176968
2 -0.4889554 0.64176968
3 -0.4889554 0.36956275
4 -0.4889554 0.12842493
5 -0.4889554 0.36956275
6 -0.4889554 -0.07209233
7 -0.4889554 -0.07209233
8 -0.4889554 -0.07209233
9 -0.4889554 0.15827466
10 -0.4889554 0.15827466
What I would like to do is create a new column "Fits" that multiplies each value in the data frame by the Beta whose name matches the column name, and then sums the products for each row. Can anyone help me?
As a proof of concept, a simpler way to explain it would be something like this:
Vector: (x1 = 10, x2 = 5, x3 = 1)
DF:
Day x3 x2 x1
1 5 3 2
2 2 1 2
3 1 5 3
Expected result:
Day x3 x2 x1 Fits
1 5 3 2 (5*1+3*5+2*10) = 40
2 2 1 2 27
3 1 5 3 56
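The toy example above can be reproduced in a few lines. This is only a sketch: the names `Betas` and `DF` are stand-ins taken from the proof of concept, and it assumes every name in the vector exists as a column of the data frame. Indexing the data frame by `names(Betas)` pulls the columns in the same order as the coefficients, so no separate reordering step is needed.

```r
# Toy data from the proof of concept (columns deliberately in a different order)
Betas <- c(x1 = 10, x2 = 5, x3 = 1)
DF <- data.frame(Day = 1:3,
                 x3 = c(5, 2, 1),
                 x2 = c(3, 1, 5),
                 x1 = c(2, 2, 3))

# DF[names(Betas)] selects x1, x2, x3 in that order, matching the vector;
# the matrix-vector product then gives one weighted sum per row
DF$Fits <- as.vector(as.matrix(DF[names(Betas)]) %*% Betas)

DF$Fits  # 40 27 56
```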
To solve this, I did the following (probably not the best solution, as I'm new to R and coding):
1.- Put the Betas vector in the same order as the data frame columns:
Betas <- ResultDos$coefficients[-1]
Orderlist <- sapply(names(OutputData[-c(1:2)]), function(x) which(x == names(Betas)))
Orderlist <- as.vector(Orderlist)
BetasInOrder <- as.vector(Betas[Orderlist])
2.- Convert the data into a matrix so I can do a matrix multiplication:
m <- as.matrix(OutputData[-c(1:2)])
Fits <- m %*% diag(BetasInOrder)
3.- Sum across the columns of each row and add the intercept:
FitsValue <- rowSums(Fits)
FitsValue <- FitsValue + ResultDos$coefficients[1]
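The three steps above can probably be condensed. The sketch below uses made-up stand-ins (`coefs` for `ResultDos$coefficients`, `dat` for `OutputData`) and assumes, as in the original, that the first two columns are not predictors. It shows that `match()` yields the same ordering as the `sapply`/`which` loop, and that scaling with `diag()` followed by `rowSums()` gives the same numbers as a single matrix-vector product.

```r
# Hypothetical stand-ins for ResultDos$coefficients and OutputData
coefs <- c("(Intercept)" = 2, x1 = 10, x2 = 5, x3 = 1)
dat <- data.frame(Date = 1:3, Dependent = c(0, 0, 0),
                  x3 = c(5, 2, 1), x2 = c(3, 1, 5), x1 = c(2, 2, 3))

m <- as.matrix(dat[-c(1:2)])
betas <- coefs[-1]

# match() returns, for each column of m, the position of the same-named beta
BetasInOrder <- betas[match(colnames(m), names(betas))]

# Original approach: scale each column, then sum across each row
fits_diag <- rowSums(m %*% diag(BetasInOrder)) + coefs[["(Intercept)"]]

# Equivalent: one matrix-vector product
fits_direct <- as.vector(m %*% BetasInOrder) + coefs[["(Intercept)"]]

all.equal(fits_diag, fits_direct)  # TRUE
```

One reason to prefer the direct product: when `diag()` is given a single number it builds an identity matrix of that size instead of a 1 x 1 matrix, so the `diag()` version would misbehave for a model with only one predictor.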
Upvotes: 0
Views: 167
Reputation: 6073
Two options: (1) use the predict command, or (2) do X %*% beta, where you select the correct columns of your data to use in X (using e.g. which). Note the need for cbind because of the intercept in the regression.
# example data
df <- data.frame(
  x1 = runif(100, 0, 10),
  x2 = runif(100, 0, 10),
  x3 = runif(100, 0, 10)
)
df$y <- 2 + 1*df$x1 + 3*df$x3 + rnorm(100, 0, 5)
# run regression of y on x1 and x3 (but not x2)
out <- lm(y ~ x1 + x3, data=df)
# option 1: use predict command
pred1 <- predict(out)
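# option 2: use X %*% beta
# index columns by coefficient name so their order matches coef(out);
# the leading column of 1s multiplies the intercept
X <- cbind(1, df[ , names(coef(out))[-1], drop = FALSE])
pred2 <- as.matrix(X) %*% coef(out)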
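A quick way to check that the two options agree is `all.equal()`. The sketch below is self-contained and uses its own small simulated data set (the seed, sample size, and column order are arbitrary choices, not part of the answer). The columns are deliberately stored in a different order from the coefficients, to illustrate that a logical mask keeps data-frame order while indexing by name follows coefficient order.

```r
set.seed(1)
# Columns stored as x3, x2, x1 on purpose
df <- data.frame(x3 = runif(50), x2 = runif(50), x1 = runif(50))
df$y <- 2 + 1 * df$x1 + 3 * df$x3 + rnorm(50)
out <- lm(y ~ x1 + x3, data = df)
beta <- coef(out)  # order: (Intercept), x1, x3

# Indexing by name: columns come out as x1, x3, matching beta
X <- cbind(1, as.matrix(df[names(beta)[-1]]))
pred_manual <- as.vector(X %*% beta)
all.equal(pred_manual, unname(predict(out)))  # TRUE

# Logical mask: columns come out as x3, x1, so betas hit the wrong columns
X_mask <- cbind(1, as.matrix(df[names(df) %in% names(beta)]))
pred_mask <- as.vector(X_mask %*% beta)
isTRUE(all.equal(pred_mask, pred_manual))  # FALSE
```

`predict()` looks variables up by name, so it is not affected by column order; that makes it the safer choice when the column names change from run to run.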
Upvotes: 2