Jacob Odom

Reputation: 216

Regression loop in R for data frames

rm(list=ls())
myData <-read.csv(file="C:/Users/Documents/myfile.csv",header=TRUE, sep=",") 
for(i in names(myData))
{
    colNum <- grep(i,colnames(myData)) ##looks up the column number for each column name
    if(is.numeric(myData[3,colNum]))  ##if row 3 is numeric, the entire column is 
   {
        ##print(nxeData[,i])        
        fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) #does a regression for each column in my csv file against my independent variable 'etch'
        rsq <- summary(fit)$r.squared   
   }
}

I'm working on a regression loop that fits each column of my data frame against one independent variable column. I have most of the code written, but I am unsure how to print the R squared value for each column against the etch_source_Avg parameter along with the name of that column. Ideally it would look something like:

.765 "variable name 1"

.436 "variable name 2" ...and so on

Upvotes: 3

Views: 1891

Answers (1)

Oscar

Reputation: 855

Here is a quick rewrite of your code that should give you what you are looking for. Looking up a column number for each column is unnecessary: myData is a data.frame, so you can access each column directly by its name.

rm(list=ls())
myData <-read.csv(file="C:/Users/Documents/myfile.csv",header=TRUE, sep=",") 
for(i in names(myData))
{ 
    if(is.numeric(myData[3,i]))  ##if row 3 is numeric, the entire column is 
    {       
       fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) #does a regression for each column in my csv file against my independent variable 'etch'
       rsq <- summary(fit)$r.squared
       writeLines(paste(rsq, i))  # writeLines() appends the newline itself
    }
}
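
If you would rather keep the R squared values than only print them, one option is to collect them in a named vector. This is a minimal sketch, not a drop-in replacement: the data frame below is made up so the example runs on its own, and pressure, temp and label are hypothetical column names. Only etch_source_Avg comes from the question. It also skips the predictor column itself, which the loop above would otherwise regress against itself.

```r
# Hypothetical stand-in for the CSV; only etch_source_Avg is a name from the question.
set.seed(1)
myData <- data.frame(
  etch_source_Avg = rnorm(20),
  pressure        = rnorm(20),
  label           = rep(c("a", "b"), 10),
  stringsAsFactors = FALSE
)
myData$temp <- 2 * myData$etch_source_Avg + rnorm(20, sd = 0.5)

predictor <- "etch_source_Avg"

# Numeric columns, excluding the predictor itself
is_num    <- vapply(myData, is.numeric, logical(1))
responses <- setdiff(names(myData)[is_num], predictor)

# One regression per response column;
# reformulate() builds a formula such as temp ~ etch_source_Avg
rsq <- vapply(responses, function(col) {
  fit <- lm(reformulate(predictor, response = col), data = myData)
  summary(fit)$r.squared
}, numeric(1))

# One line per column: R squared, then the quoted column name
writeLines(sprintf("%.3f \"%s\"", rsq, names(rsq)))
```

Because rsq is a named numeric vector, you can sort it with sort(rsq, decreasing = TRUE) or turn it into a data frame afterwards.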

Hope this helps.

Upvotes: 3
