Reputation: 47
Not sure why R won't calculate the means of my data correctly. I do have a lot of NA values but R keeps telling me that the mean is NA. Here's an example:
library(readxl)
data1 = read_excel("pepper.xlsx")
data1$cultivar = as.factor(data1$cultivar)
mean = aggregate(data1[,3:4], list(data1$cultivar), mean)
cultivar | replication | width | height |
---|---|---|---|
BOF | 1 | 12 | 14 |
BOF | 2 | 10 | NA |
BOF | 3 | NA | 15 |
BOF | 4 | NA | NA |
Instead of computing the mean width of BOF as 11 and the mean height as 14.5, it computes the means of the height and width as NA. This is an over-simplification of my data. I have several cultivars in my study and calculated the means of each variable for each cultivar using the aggregate function.
Upvotes: 2
Views: 515
Reputation: 78927
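Update: There is no need for an anonymous function (credit to Gregor Thomas, see comments). We could use:
summarise(across(where(is.numeric), mean, na.rm = TRUE))
Note that since dplyr 1.1.0, passing extra arguments such as na.rm through across() is deprecated, so the anonymous-function form in the first answer below is the more future-proof one.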
First answer:
As Gregor Thomas pointed out, colMeans won't work here.
We could use summarise and across from the dplyr package:
library(dplyr)
df %>%
group_by(cultivar) %>%
summarise(across(where(is.numeric),~ mean(., na.rm = TRUE)))
Output:
cultivar replication width height
<chr> <dbl> <dbl> <dbl>
1 BOF 2.5 11 14.5
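Since df is not defined above, here is a minimal self-contained sketch of the same approach. The data frame is typed in by hand from the table in the question (an assumption; the real data come from pepper.xlsx), and the result name means is only illustrative. It selects width and height explicitly, so that replication is not averaged as it is with where(is.numeric):

```r
library(dplyr)

# Sample rows copied by hand from the question; the real data are read from pepper.xlsx
df <- tibble(
  cultivar    = c("BOF", "BOF", "BOF", "BOF"),
  replication = 1:4,
  width       = c(12, 10, NA, NA),
  height      = c(14, NA, 15, NA)
)

# Per-cultivar means of the two measurement columns, ignoring missing values
means <- df %>%
  group_by(cultivar) %>%
  summarise(across(c(width, height), ~ mean(.x, na.rm = TRUE)))

means
```

For the BOF rows this gives a width of 11 and a height of 14.5, matching the values the question expects.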
Upvotes: 1
Reputation: 233
Try this, which passes na.rm = TRUE through aggregate to mean:
mean = aggregate(data1[,3:4], list(data1$cultivar), mean, na.rm = TRUE, na.action = na.pass)
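To see why the original call returns NA, here is a small base R sketch on the sample rows from the question. The data frame is typed in by hand (an assumption; the real data come from pepper.xlsx), and the result is stored as means rather than mean so that the name of the function is not reused:

```r
# Sample rows copied by hand from the question
data1 <- data.frame(
  cultivar    = factor(c("BOF", "BOF", "BOF", "BOF")),
  replication = 1:4,
  width       = c(12, 10, NA, NA),
  height      = c(14, NA, 15, NA)
)

# mean() returns NA as soon as a single value is missing ...
mean(data1$width)

# ... unless the missing values are dropped first
mean(data1$width, na.rm = TRUE)

# Extra arguments after FUN are passed on to FUN, so na.rm reaches mean()
means <- aggregate(data1[, 3:4], list(cultivar = data1$cultivar), mean, na.rm = TRUE)
means
```

With the data-frame interface used here, na.rm = TRUE alone should be enough; na.action mainly matters for the formula interface of aggregate. If a cultivar has no non-missing values at all for a variable, the result for that cell is NaN rather than NA.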
Upvotes: 1