Abhishek

Reputation: 95

Coefficient extraction from linear regression

I want to extract the coefficient "a" each time I run a linear regression and store it in a table.

The first regression will be

Y = a1X1 + C

This regresses the dependent variable "Y" on the independent variable "X1". The estimated coefficient "a1" should be stored in a table, say "T".

The second regression will be

Y = a2X2 + C

This regresses the dependent variable "Y" on the independent variable "X2". The estimated coefficient "a2" should be stored in the same table T.

The process continues through X6, so that table T ends up with six coefficient values.

Y       X1      X2      X3      X4      X5      X6
0.3236% 0.2561% 0.3302% 0.2800% 0.2886% 0.2363% 0.2755%
0.4547% 0.3860% 0.4673% 0.4626% 0.4407% 0.3966% 0.4460%
0.3820% 0.3193% 0.3882% 0.3910% 0.3333% 0.3307% 0.3485%
0.3951% 0.3190% 0.3991% 0.3506% 0.3594% 0.3230% 0.3692%
0.4460% 0.4047% 0.4566% 0.3841% 0.4125% 0.3561% 0.4319%
0.4525% 0.4163% 0.4629% 0.4142% 0.4000% 0.3871% 0.4357%
0.3680% 0.4011% 0.3759% 0.3890% 0.4193% 0.4802% 0.3490%
0.4304% 0.2657% 0.4224% 0.4619% 0.4936% 0.3776% 0.2827%
0.1360% 0.1866% 0.1351% 0.1694% 0.1853% 0.1316% 0.1649%
0.1317% 0.1335% 0.1276% 0.1682% 0.1960% 0.1318% 0.1356%
0.2713% 0.4491% 0.2891% 0.1901% 0.3513% 0.1816% 0.3869%
0.2404% 0.2389% 0.2371% 0.2217% 0.2162% 0.1827% 0.2571%
0.4934% 0.4529% 0.5047% 0.4766% 0.3890% 0.4124% 0.4610%
0.4083% 0.4513% 0.4128% 0.3612% 0.3974% 0.3759% 0.4667%
0.3033% 0.3063% 0.3058% 0.3342% 0.2688% 0.3286% 0.3019%
0.2976% 0.3226% 0.2967% 0.2697% 0.2626% 0.2860% 0.3172%
0.2505% 0.3238% 0.2554% 0.2682% 0.2495% 0.3014% 0.2931%
0.2077% 0.2491% 0.2019% 0.1866% 0.2063% 0.2065% 0.1928%
0.3669% 0.3316% 0.3703% 0.3034% 0.2806% 0.3556% 0.3310%

The code I have written so far, which strips the % sign:

library(readxl)
Dataset_SR <- read_excel("C:/Users/Abhishek/Desktop/Dataset_SR.xlsx")
View(Dataset_SR)

# Print the structure of the data set
str(Dataset_SR)

# Clean the data of % sign
Dataset_SR[] <- lapply(Dataset_SR, function(x) as.numeric(gsub("%", "", x))) 

View(Dataset_SR)

Upvotes: 0

Views: 70

Answers (1)

akrun

Reputation: 887911

We can loop through the column names of 'Dataset_SR', excluding the "Y" column:

xcols <- setdiff(names(Dataset_SR), "Y")

Or with grep, matching the names that consist of "X" followed by digits

xcols <- grep("^X\\d+$", names(Dataset_SR), value = TRUE)

Or using paste

xcols <- paste0("X", 1:6)
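As a quick sanity check, the three ways of building xcols can be compared on a toy data frame. This is only a sketch: the data frame `toy` and its values are made up, and it assumes the columns are exactly Y and X1 through X6, as in the question.

```r
# Toy data frame with the same column layout as in the question (values are made up)
toy <- as.data.frame(matrix(1:14, nrow = 2,
                            dimnames = list(NULL, c("Y", paste0("X", 1:6)))))

by_setdiff <- setdiff(names(toy), "Y")                     # everything except Y
by_grep    <- grep("^X\\d+$", names(toy), value = TRUE)    # names like X1, X2, ...
by_paste   <- paste0("X", 1:6)                             # names built by hand

# All three give the same character vector for this layout
identical(by_setdiff, by_grep) && identical(by_grep, by_paste)
```

If the data had extra columns that are not predictors, setdiff would pick them up as well, so the grep or paste0 version may be the safer choice in that case.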

Then, with lapply, create a formula for each column with reformulate, fit the model with lm, and extract the coefficients with coef. Each element of the result holds both the intercept and the slope.

lapply(xcols, function(nm) 
      coef(lm(reformulate(nm, "Y"), data = Dataset_SR)))

A reproducible example with the built-in dataset 'mtcars'

lapply(names(mtcars)[8:11],  function(nm)
         coef(lm(reformulate(nm, "mpg"), data = mtcars)))
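If only the slope is needed and the values should end up in a single table, as the question asks, one possible sketch is to pick the slope out by name and collect the results in a data frame. The names `slopes` and `T_coef` are hypothetical. Plain `T` is avoided here because it is also shorthand for TRUE in R.

```r
# Predictors to regress mpg on, one at a time (same columns as the mtcars example)
xcols <- names(mtcars)[8:11]

# Fit one simple regression per predictor and keep only the slope;
# coef() returns the intercept and the slope, so the slope is looked up by name
slopes <- sapply(xcols, function(nm)
  coef(lm(reformulate(nm, "mpg"), data = mtcars))[[nm]])

# Collect the results in one table, one row per predictor
T_coef <- data.frame(variable = xcols, coefficient = unname(slopes))
T_coef
```

For the data in the question, the same pattern should carry over by replacing "mpg" with "Y", mtcars with Dataset_SR, and xcols with the X1 to X6 column names.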

Upvotes: 2
