Ashreet Sangotra

Reputation: 25

Finding Mean of a column in an R Data Set, by using FOR Loops to remove Missing Values

I have a data set with air quality data. The data frame has 153 rows and 5 columns. I want to find the mean of the first column. That column contains missing values, so I want to exclude them when computing the mean. Finally, I want to do this using control structures (for loops and if-else statements).

I have tried writing code as seen below. I have created 'y' instead of the actual Air Quality data set to have a reproducible example.

y <- c(1,2,3,NA,5,6,NA,NA,9,10,11,NA,13,NA,15)
x <- matrix(y,nrow=15)

for(i in 1:15){
   if(is.na(data.frame[i,1]) == FALSE){
   New.Vec <- c(x[i,1])
   }
}
print(mean(New.Vec))

I expected the output to be the mean. Though the error I received is this:

Error: object 'New.Vec' not found

Upvotes: 1

Views: 140

Answers (3)

David Pedack

Reputation: 492

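I can't see your data, but probably something like this. The vector needs to be initialized before the loop, and each non-missing value has to be appended to it rather than overwriting it. It is better to avoid loops in R when you can, though.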

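myDataFrame <- read.csv("hw1_data.csv")

# Start with an empty vector so it exists before the loop appends to it
New.Vec <- c()
# Loop over every row instead of hard-coding 1:153
for(i in seq_len(nrow(myDataFrame))){
   # Keep only the non-missing values of the first column
   if(!is.na(myDataFrame[i,1])){
      New.Vec <- c(New.Vec, myDataFrame[i,1])
   }
}
print(mean(New.Vec))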
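
As a variation on the loop above, here is a minimal sketch that runs on the reproducible 'y' from the question and keeps a running sum and count instead of growing a vector. The names total, count and loop_mean are made up for this illustration.

```r
# Reproducible data from the question
y <- c(1, 2, 3, NA, 5, 6, NA, NA, 9, 10, 11, NA, 13, NA, 15)
x <- matrix(y, nrow = 15)

total <- 0   # running sum of the non-missing values (hypothetical name)
count <- 0   # how many non-missing values were seen (hypothetical name)

for (i in seq_len(nrow(x))) {
  if (!is.na(x[i, 1])) {
    total <- total + x[i, 1]
    count <- count + 1
  }
}

# Guard against a column that is entirely NA
loop_mean <- if (count > 0) total / count else NA_real_
print(loop_mean)
```

For this data the ten non-missing values sum to 75, so the loop should give 7.5.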

Upvotes: 1

Ben G

Reputation: 4338

One line of code; there is no need for a for loop.

mean(data.frame$name_of_the_first_column, na.rm = TRUE)

Setting na.rm = TRUE makes the mean function ignore NAs.
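
A quick sketch of the difference on the question's reproducible 'y'. The data frame name df and the column name first_col are placeholders made up for this example, not names from the real data set.

```r
# Reproducible data from the question, wrapped in a one-column data frame
y <- c(1, 2, 3, NA, 5, 6, NA, NA, 9, 10, 11, NA, 13, NA, 15)
df <- data.frame(first_col = y)   # 'df' and 'first_col' are hypothetical names

# Without na.rm, a single NA makes the whole result NA
with_na <- mean(df$first_col)

# With na.rm = TRUE, the NAs are dropped before averaging
without_na <- mean(df$first_col, na.rm = TRUE)

# The first column can also be selected by position instead of by name
by_position <- mean(df[[1]], na.rm = TRUE)

print(with_na)
print(without_na)
print(by_position)
```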

Upvotes: 3

akrun

Reputation: 887691

Here, we can make use of na.aggregate from zoo

library(zoo)
df1[] <- na.aggregate(df1)

This assumes that 'df1' is a data.frame with all numeric columns and that you want to fill the NA elements with the mean of the corresponding column. By default, na.aggregate uses mean as its FUN argument.
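
If zoo is not installed, roughly the same column-mean fill can be sketched in base R. This is a swapped-in base R approach, not na.aggregate itself, and the small df1 below is made-up example data.

```r
# Made-up example data with all numeric columns
df1 <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 8))

# Replace each NA with the mean of the non-missing values in its column
df1[] <- lapply(df1, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

print(df1)
```

Note that filling the NAs with the column mean leaves that column's mean unchanged, so mean(df1$a) afterwards equals what mean(..., na.rm = TRUE) would have returned on the original column.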

Upvotes: 2
