Reputation: 25
I have a data set with air quality data. The data frame has 153 rows and 5 columns. I want to find the mean of the first column. That column contains missing values, which I want to exclude when computing the mean. Finally, I want to do this using control structures (for loops and if-else statements).
I have tried writing the code shown below. I created 'y' in place of the actual air quality data set to have a reproducible example.
y <- c(1,2,3,NA,5,6,NA,NA,9,10,11,NA,13,NA,15)
x <- matrix(y,nrow=15)
for(i in 1:15){
if(is.na(data.frame[i,1]) == FALSE){
New.Vec <- c(x[i,1])
}
}
print(mean(New.Vec))
I expected the output to be the mean. Instead, I received this error:
Error: object 'New.Vec' not found
Upvotes: 1
Views: 140
Reputation: 492
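I can't see your data, but it is probably something like this. The vector needs to be initialized before the loop, and each value has to be appended to it rather than overwriting it. It is better to avoid loops in R when you can.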
myDataFrame <- read.csv("hw1_data.csv")
New.Vec <- c()
for(i in 1:153){
if(!is.na(myDataFrame[i,1])){
New.Vec <- c(New.Vec, myDataFrame[i,1])
}
}
print(mean(New.Vec))
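As a variation on the loop above, here is a minimal sketch that keeps a running sum and a count instead of growing a vector. It uses the reproducible 'y' and 'x' from the question rather than the real file; the names total, count and loop_mean are only illustrative.

```r
# Reproducible data from the question (a 15 x 1 matrix containing NAs)
y <- c(1, 2, 3, NA, 5, 6, NA, NA, 9, 10, 11, NA, 13, NA, 15)
x <- matrix(y, nrow = 15)

total <- 0  # running sum of the non-missing values
count <- 0  # how many non-missing values have been seen

for (i in seq_len(nrow(x))) {
  if (!is.na(x[i, 1])) {
    total <- total + x[i, 1]
    count <- count + 1
  }
}

loop_mean <- total / count
print(loop_mean)  # 7.5
```

Using seq_len(nrow(x)) instead of a hard-coded 1:15 means the loop still works if the number of rows changes.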
Upvotes: 1
Reputation: 4338
One line of code is enough; there is no need for a for loop.
mean(data.frame$name_of_the_first_column, na.rm = TRUE)
Setting na.rm = TRUE makes the mean function ignore NAs.
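To see the effect on the question's reproducible example, here is a small sketch (the name col_mean is only illustrative):

```r
# Reproducible data from the question
y <- c(1, 2, 3, NA, 5, 6, NA, NA, 9, 10, 11, NA, 13, NA, 15)
x <- matrix(y, nrow = 15)

# Without na.rm, a single NA makes the whole result NA
print(mean(x[, 1]))  # NA

# With na.rm = TRUE, the NAs are dropped before averaging
col_mean <- mean(x[, 1], na.rm = TRUE)
print(col_mean)      # 7.5
```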
Upvotes: 3
Reputation: 887691
Here, we can make use of na.aggregate from zoo:
library(zoo)
df1[] <- na.aggregate(df1)
This assumes that 'df1' is a data.frame with all numeric columns and that you want to fill the NA elements with the mean of the corresponding column. By default, na.aggregate uses FUN = mean.
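For readers without zoo installed, the same idea of filling each column's NAs with that column's mean can be sketched in base R. This is not zoo's implementation, only an approximation of the behaviour described above, and the small 'df1' here is made-up illustration data.

```r
# Hypothetical all-numeric data frame with some missing values
df1 <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 6))

# Replace the NAs in each column with the mean of that column's non-missing values
df1[] <- lapply(df1, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

print(df1)
```

Filling with the column mean leaves each column's mean unchanged, so mean(df1$a) afterwards equals what mean(..., na.rm = TRUE) would have returned before filling.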
Upvotes: 2