Reputation: 1111
I have a dataframe containing a single NA in the first row of column b:
a <- c(16.54868281, 47.64097026, 51.0773201)
b <- c(NA, 39.40217391, 13.04347826)
c <- c(17.80821918, 42.92237443, 36.75799087)
d <- c(22.90809328, 56.37860082, 61.04252401)
data <- data.frame(cbind(a, b, c, d))
data
a b c d
1 16.54868 NA 17.80822 22.90809
2 47.64097 39.40217 42.92237 56.37860
3 51.07732 13.04348 36.75799 61.04252
I am trying to compute the mean score of each row. Because of the NA, the mean of the first row comes back as NA.
safety <- data.frame(
(data$a + data$b + data$c + data$d) / 4
)
names(safety)[1] <- "safety"
safety
safety
1 NA
2 46.58603
3 40.48033
To work around this, I replaced the NA with 0. Unfortunately, the 0 is then treated as a real score and every row is still divided by 4, so the first row is divided by 4 even though it has only 3 valid values. That gives the wrong mean for the first row.
a <- c(16.54868281, 47.64097026, 51.0773201)
b <- c(NA, 39.40217391, 13.04347826)
c <- c(17.80821918, 42.92237443, 36.75799087)
d <- c(22.90809328, 56.37860082, 61.04252401)
data <- data.frame(cbind(a, b, c, d))
data[is.na(data)] <- 0
safety <- data.frame(
(data$a + data$b + data$c + data$d) / 4
)
names(safety)[1] <- "safety"
safety
safety
1 14.31625
2 46.58603
3 40.48033
I need the first row to read 19.08833 instead of 14.31625. Is there a function in R that divides each row's sum by the number of non-missing values in that row? I could probably hand-code a workaround, but that approach will not scale as the dataset grows.
Upvotes: 0
Views: 963
Reputation: 214957
Use rowMeans with na.rm = TRUE:
rowMeans(data, na.rm = TRUE)
# [1] 19.08833 46.58603 40.48033
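To show how this fits into the code from the question, here is a minimal sketch that rebuilds the safety data frame. The score_cols vector is a name made up for illustration; it assumes you may want to average only some columns of a larger dataset. The manual line is only there to show what na.rm = TRUE does: it sums the non-missing values and divides by how many there are.

```r
# Rebuild the example data from the question
data <- data.frame(
  a = c(16.54868281, 47.64097026, 51.0773201),
  b = c(NA, 39.40217391, 13.04347826),
  c = c(17.80821918, 42.92237443, 36.75799087),
  d = c(22.90809328, 56.37860082, 61.04252401)
)

# Hypothetical selection of the columns to average (illustrative name)
score_cols <- c("a", "b", "c", "d")

# Row means that skip NA values, stored in a one-column data frame
safety <- data.frame(safety = rowMeans(data[score_cols], na.rm = TRUE))

# Equivalent by hand: sum of non-missing values / count of non-missing values
manual <- rowSums(data[score_cols], na.rm = TRUE) /
  rowSums(!is.na(data[score_cols]))

safety
```

Note that a row in which every selected column is NA gives NaN rather than NA, because there is nothing left to average.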
Upvotes: 2