Reputation: 23
I have 3 data sets, each with the variables time_tick, gyr_X_value, gyr_Y_value, and gyr_Z_value.
An example of one of the data sets is as follows:
time_tick gyr_X_value gyr_Y_value gyr_Z_value
1 .01 .12 .24 -.28
2 .12 0 0 .05
3 .04 .10 0 .17
4 .03 0 -.25 .15
I know that I can calculate the variance of each individual data set with var(), but how can I calculate the variance of gyr_X_value across all three data sets?
Upvotes: 2
Views: 1512
Reputation: 17790
For those kinds of problems, I strongly recommend the tidyverse approach.
Your data:
df <- read.table(text = "time_tick gyr_X_value gyr_Y_value gyr_Z_value
1 .01 .12 .24 -.28
2 .12 0 0 .05
3 .04 .10 0 .17
4 .03 0 -.25 .15", header = TRUE)
The calculation:
library(tidyverse)
df %>% gather(variable, value, -time_tick) %>%
  group_by(variable) %>%
  summarize(variance = var(value))
## A tibble: 3 x 2
# variable variance
# <chr> <dbl>
#1 gyr_X_value 0.004100
#2 gyr_Y_value 0.040025
#3 gyr_Z_value 0.043425
Explanation: First, the gather function turns your wide data frame into a long one:
df %>% gather(variable, value, -time_tick)
# time_tick variable value
#1 0.01 gyr_X_value 0.12
#2 0.12 gyr_X_value 0.00
#3 0.04 gyr_X_value 0.10
#4 0.03 gyr_X_value 0.00
#5 0.01 gyr_Y_value 0.24
#6 0.12 gyr_Y_value 0.00
#7 0.04 gyr_Y_value 0.00
#8 0.03 gyr_Y_value -0.25
#9 0.01 gyr_Z_value -0.28
#10 0.12 gyr_Z_value 0.05
#11 0.04 gyr_Z_value 0.17
#12 0.03 gyr_Z_value 0.15
The group_by() function then sets up the grouping by variable, and the summarize() function calculates the variance separately within each group.
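As a cross-check that does not need the tidyverse, the same per-column variances can be computed in base R. This is only a minimal sketch on the example data from this answer; the name variances is made up for illustration.

```r
# Same example data as above; the unnamed first field becomes the row names.
df <- read.table(text = "time_tick gyr_X_value gyr_Y_value gyr_Z_value
1 .01 .12 .24 -.28
2 .12 0 0 .05
3 .04 .10 0 .17
4 .03 0 -.25 .15", header = TRUE)

# Drop time_tick, then apply var() to each remaining column.
# 'variances' is a hypothetical name chosen for this sketch.
variances <- sapply(df[setdiff(names(df), "time_tick")], var)
print(variances)
```

The result is a named numeric vector with one variance per gyr_* column, which should agree with the tibble shown earlier.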
Upvotes: 0
Reputation: 773
You can use rbind. Given data frames a, b, and c, they can be combined by row with
combined <- rbind(a,b,c)
See ?rbind for detailed usage. Then you can use var() as usual on a given column, for example, combined[, 2].
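A minimal sketch of the whole thing, assuming the three data frames share the same column names. The values in a, b, and c below are invented purely for illustration, and pooled_var is a made-up name.

```r
# Hypothetical stand-ins for the three data sets (values are invented).
a <- data.frame(time_tick = c(.01, .12, .04, .03), gyr_X_value = c(.12, 0, .10, 0))
b <- data.frame(time_tick = c(.02, .11, .05, .04), gyr_X_value = c(.05, .02, -.03, .08))
# Naming a data frame 'c' follows the answer; calls to c() still resolve to the function.
c <- data.frame(time_tick = c(.03, .10, .06, .05), gyr_X_value = c(-.01, .07, .04, .09))

# Stack the rows; rbind() requires matching column names.
combined <- rbind(a, b, c)

# Variance of gyr_X_value over all observations from all three data sets.
pooled_var <- var(combined$gyr_X_value)
print(pooled_var)
```

Selecting the column by name rather than by position avoids depending on column order. Note that this treats all rows as one sample, which is generally not the same as averaging the three per-data-set variances.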
Upvotes: 0
Reputation: 887223
We can place the datasets in a list, extract the 'gyr_X_value' column, and use rowVars if we need to find the variance of each row
library(matrixStats)
rowVars(sapply(list(df1, df2, df3), `[[`, 'gyr_X_value'))
If the interest is instead in the variance of that column within each dataset, use var after extracting the column
sapply(list(df1, df2, df3), function(x) var(x[['gyr_X_value']]))
Note: The datasets are assumed to be named 'df1', 'df2', and 'df3'
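For anyone without matrixStats, here is a rough base R sketch of both calculations. It assumes every dataset has the same number of rows; the values in df1, df2, and df3 are invented for illustration, and m, per_tick, and per_set are made-up names.

```r
# Hypothetical data sets (values are invented for illustration).
df1 <- data.frame(gyr_X_value = c(.12, 0, .10, 0))
df2 <- data.frame(gyr_X_value = c(.05, .02, -.03, .08))
df3 <- data.frame(gyr_X_value = c(-.01, .07, .04, .09))

# Matrix with one column per dataset and one row per time tick.
m <- sapply(list(df1, df2, df3), `[[`, "gyr_X_value")

# Variance across the three datasets at each row (base R stand-in for rowVars).
per_tick <- apply(m, 1, var)

# Variance of the column within each dataset.
per_set <- apply(m, 2, var)

print(per_tick)
print(per_set)
```

Here per_tick has one value per row and per_set has one value per dataset, so the two results answer different questions.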
Upvotes: 1