Reputation: 1177
I am working in R, using RStudio. I need to calculate the mean of each column of a data frame.
cluster1        # a 5 by 4 data frame
mean(cluster1)
I got :
Warning message:
In mean.default(cluster1) :
argument is not numeric or logical: returning NA
But I can use
mean(cluster1[[1]])
to get the mean of the first column.
How can I get the means of all columns?
Any help would be appreciated.
Upvotes: 73
Views: 307147
Reputation: 41603
Another option is the function fmean from the collapse package. Here is a reproducible example:
set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))
library(collapse)
fmean(m)
Output:
X1 X2 X3 X4
47.0 64.4 44.8 67.8
Upvotes: 1
Reputation: 53
colMeans(A, na.rm = FALSE, dims = 1)
https://stat.ethz.ch/R-manual/R-devel/library/base/html/colSums.html
colMeans is in the base package, so no library() call is required.
The first answer looks like it is using colMeans from the analytics library, which is not available for R version 4.0.2.
Upvotes: 1
Reputation: 2863
class(mtcars)
# [1] "data.frame"
my.mean <- unlist(lapply(mtcars, mean)); my.mean
#        mpg        cyl       disp         hp       drat         wt       qsec         vs
#  20.090625   6.187500 230.721875 146.687500   3.596563   3.217250  17.848750   0.437500
#         am       gear       carb
#   0.406250   3.687500   2.812500
Upvotes: 2
Reputation: 23
Try this. It also works when the data contain NAs:
library(dplyr)
df <- data.frame(a1 = 1:10, a2 = 11:20)
df %>% summarise_each(funs(mean(., na.rm = TRUE)))
# a1 a2
# 5.5 15.5
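In recent dplyr releases, summarise_each() and funs() are deprecated in favour of across(). The following is a minimal sketch of the same calculation, assuming dplyr 1.0.0 or later is installed. The NA in a1 is not part of the original example and is added only to show the effect of na.rm.

```r
library(dplyr)

# Same example data, with one NA added for illustration (an assumption)
df <- data.frame(a1 = c(1:9, NA), a2 = 11:20)

# Apply mean() to every column, ignoring NAs
col_means <- df %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

col_means
```

The result is a one-row data frame with one column per input column, so individual means can be read with col_means$a1 and so on.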
Upvotes: 2
Reputation: 5694
For diversity: another way is to convert a vector function into one that works on data
frames by using plyr::colwise()
set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))
plyr::colwise(mean)(m)
# X1 X2 X3 X4
# 1 47 64.4 44.8 67.8
Upvotes: 0
Reputation: 191
In case you have NA's:
sapply(data, mean, na.rm = TRUE) # Returns a vector (with names)
lapply(data, mean, na.rm = TRUE) # Returns a list
Remember that "mean" needs numeric data. If you have mixed class data, then use:
numdata <- data[sapply(data, is.numeric)]
sapply(numdata, mean, na.rm = TRUE) # Returns a vector
lapply(numdata, mean, na.rm = TRUE) # Returns a list
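A small self-contained sketch of the mixed-class case may help. The data frame below is hypothetical (its column names and values are made up for illustration). It has one character column and some NAs.

```r
# Hypothetical mixed-class data frame (names and values are illustrative only)
data <- data.frame(
  id     = c("a", "b", "c", "d"),
  height = c(1.5, NA, 2.5, 2.0),
  weight = c(10, 20, 30, NA),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns, then average each one, skipping NAs
numdata   <- data[sapply(data, is.numeric)]
col_means <- sapply(numdata, mean, na.rm = TRUE)
col_means
```

Here the id column is dropped before averaging, so mean() never sees non-numeric input.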
Upvotes: 13
Reputation:
Another way is to use the purrr package.
# Example data, the same as in the answer by @A Handcart And Mohair
set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))
library(purrr)
means <- map_dbl(m, mean)
means
# X1 X2 X3 X4
#47.0 64.4 44.8 67.8
Upvotes: 2
Reputation: 311
You can use 'apply' to run a function on the rows or columns of a matrix or numerical data frame:
cluster1 <- data.frame(a=1:5, b=11:15, c=21:25, d=31:35)
apply(cluster1,2,mean) # applies function 'mean' to 2nd dimension (columns)
apply(cluster1,1,mean) # applies function to 1st dimension (rows)
sapply(cluster1, mean) # also takes mean of columns, treating data frame like list of vectors
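As a quick check, the sketch below compares these calls on the same sample data and against base R's colMeans(). The variable names (col_apply and so on) are introduced here only for illustration.

```r
cluster1 <- data.frame(a = 1:5, b = 11:15, c = 21:25, d = 31:35)

col_apply  <- apply(cluster1, 2, mean)  # coerces the data frame to a matrix first
col_sapply <- sapply(cluster1, mean)    # treats the data frame as a list of columns
col_base   <- colMeans(cluster1)        # base R shortcut for column means
row_apply  <- apply(cluster1, 1, mean)  # one mean per row

# The three column-wise approaches should agree
all.equal(col_apply, col_sapply)
all.equal(col_apply, col_base)
```

Because apply() converts its input to a matrix, it is best suited to all-numeric data frames. sapply() works column by column and can be combined with a numeric-column filter when the classes are mixed.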
Upvotes: 31
Reputation: 193687
You can use colMeans:
### Sample data
set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))
### Your error
mean(m)
# [1] NA
# Warning message:
# In mean.default(m) : argument is not numeric or logical: returning NA
### The result using `colMeans`
colMeans(m)
# X1 X2 X3 X4
# 47.0 64.4 44.8 67.8
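colMeans() expects all-numeric input and has an na.rm argument. The following is a minimal sketch of handling missing values and a non-numeric column. The data here are made up for illustration and are not the sample data above.

```r
# Illustrative data (values are made up): one NA and one non-numeric column
m2 <- data.frame(X1 = c(1, 2, NA), X2 = c(4, 5, 6), label = c("a", "b", "c"))

# colMeans() needs numeric input, so select the numeric columns first
num <- m2[sapply(m2, is.numeric)]

with_na <- colMeans(num)                # X1 is NA because of the missing value
means   <- colMeans(num, na.rm = TRUE)  # missing values are skipped
means
```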
Upvotes: 91