Reputation: 39
Here is the df:
# A tibble: 6 x 5
t a b c d
<dbl> <dbl> <dbl> <dbl> <dbl>
1 3999. 0.00586 0.00986 0.00728 0.00856
2 3998. 0.0057 0.00958 0.00702 0.00827
3 3997. 0.00580 0.00962 0.00711 0.00839
4 3996. 0.00602 0.00993 0.00726 0.00875
I want to compute the mean of each row, excluding the first column (t). The code I wrote:
df$Mean <- rowMeans(df[select(df, -"t")])
The error I get:
Error: Must subset columns with a valid subscript vector.
x Subscript `select(group1, -"t")` has the wrong type `tbl_df<
p2 : double
p8 : double
p10: double
p9 : double
>`.
ℹ It must be logical, numeric, or character.
I tried converting df to a matrix, but then I got another error. How should I solve this?
Now I'm trying to calculate standard error using the code:
se <- function(x){sd(df[,x])/sqrt(length(df[,x]))}
sapply(group1[,2:5],se)
I try to indicate which columns should be used to calculate the error, but again an error pops up:
Error: Must subset columns with a valid subscript vector.
x Can't convert from `x` <double> to <integer> due to loss of precision.
I have used valid column subscripts, so I don't understand why this error occurs.
Upvotes: 1
Views: 578
Reputation: 886938
We can use setdiff to return the names of the columns that are not 't' and then get the rowMeans. This selects by name, so it works wherever the column 't' sits rather than relying on its position:
df$Mean <- rowMeans(df[setdiff(names(df), "t")], na.rm = TRUE)
df
# t a b c d Mean
#1 3999 0.00586 0.00986 0.00728 0.00856 0.0078900
#2 3998 0.00570 0.00958 0.00702 0.00827 0.0076425
#3 3997 0.00580 0.00962 0.00711 0.00839 0.0077300
#4 3996 0.00602 0.00993 0.00726 0.00875 0.0079900
select from dplyr returns a subset of the data.frame, not the column names or indices, so it cannot be used as a subscript inside df[ ]. Instead, we can apply rowMeans directly to its result:
library(dplyr)
rowMeans(select(df, -t), na.rm = TRUE)
Or in a pipe
df <- df %>%
  mutate(Mean = rowMeans(select(., -t), na.rm = TRUE))
If we need to get the standard error per row, we can use apply with MARGIN as 1:
apply(df[setdiff(names(df), 't')], 1,
      function(x) sd(x)/sqrt(length(x)))
Or with rowSds from matrixStats:
library(matrixStats)
rowSds(as.matrix(df[setdiff(names(df), 't')]))/sqrt(ncol(df)-1)
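One thing to watch for: if the Mean column has already been added to df, setdiff(names(df), 't') will also pick up Mean, and ncol(df) - 1 will count it too. A minimal sketch that fixes the value columns up front, using base R only (value_cols and SE are names chosen here for illustration, not part of the original answer):

```r
df <- data.frame(
  t = c(3999, 3998, 3997, 3996),
  a = c(0.00586, 0.0057, 0.0058, 0.00602),
  b = c(0.00986, 0.00958, 0.00962, 0.00993),
  c = c(0.00728, 0.00702, 0.00711, 0.00726),
  d = c(0.00856, 0.00827, 0.00839, 0.00875)
)

# Record the value columns before adding any summary columns
value_cols <- setdiff(names(df), "t")

# Row means over the value columns only
df$Mean <- rowMeans(df[value_cols], na.rm = TRUE)

# Row-wise standard error over the same columns (Mean is not included)
df$SE <- apply(df[value_cols], 1, function(x) sd(x) / sqrt(length(x)))
df
```

Because value_cols is stored once, later summaries keep using a, b, c and d no matter how many columns are appended afterwards.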
Data:
df <- structure(list(t = c(3999, 3998, 3997, 3996), a = c(0.00586,
0.0057, 0.0058, 0.00602), b = c(0.00986, 0.00958, 0.00962, 0.00993
), c = c(0.00728, 0.00702, 0.00711, 0.00726), d = c(0.00856,
0.00827, 0.00839, 0.00875)), class = "data.frame", row.names = c("1",
"2", "3", "4"))
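As for the second error in the question, it appears to come from how se() is written: sapply passes each column's values as x, and the function then uses those doubles to subset df, which is why the subscript cannot be converted to integer. A minimal sketch, assuming the goal is one standard error per column (col_se is a name chosen here for illustration):

```r
df <- data.frame(
  t = c(3999, 3998, 3997, 3996),
  a = c(0.00586, 0.0057, 0.0058, 0.00602),
  b = c(0.00986, 0.00958, 0.00962, 0.00993),
  c = c(0.00728, 0.00702, 0.00711, 0.00726),
  d = c(0.00856, 0.00827, 0.00839, 0.00875)
)

# x is already the column's values, so use it directly
# instead of indexing df with it
se <- function(x) sd(x) / sqrt(length(x))

# One standard error per column, skipping 't'; returns a named numeric vector
col_se <- sapply(df[setdiff(names(df), "t")], se)
col_se
```

The same se function can be reused row-wise with apply(..., 1, se), since it only depends on the vector it receives.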
Upvotes: 0
Reputation: 39585
A similar base R solution would be:
df$Mean <- rowMeans(df[, -1], na.rm = TRUE)
Output:
t a b c d Mean
1 3999 0.00586 0.00986 0.00728 0.00856 0.0078900
2 3998 0.00570 0.00958 0.00702 0.00827 0.0076425
3 3997 0.00580 0.00962 0.00711 0.00839 0.0077300
4 3996 0.00602 0.00993 0.00726 0.00875 0.0079900
Upvotes: 1