Reputation: 21
Type <- c("Bark", "Redwood", "Oak")
size <- c(10,15,13)
width <- c(3,4,5)
Ratio <- size/width
df <- data.frame(Type, size, width, Ratio)
mutate(df, ratio_log = log10(Ratio))
df %>% group_by(Type) %>% shapiro.test(ratio_log)
Error in shapiro.test(., ratio_log) : unused argument (ratio_log)
I am trying to apply the Shapiro-Wilk test to each type separately (e.g., Bark, Redwood, Oak), not to all the ratios combined. My real data set is larger and contains more ratios for each type.
Upvotes: 1
Views: 7807
Reputation: 782
You need at least purrr, dplyr and tidyr, all of which are loaded by tidyverse.
I also generated more samples for the example, because shapiro.test needs a vector of values per group, not a single ratio. Here are 100 samples each from a normal, a binomial and a uniform distribution.
library(tidyverse)
Type <- c("Bark", "Redwood", "Oak")
size <- c(10,15,13)
width <- c(3,4,5)
Ratio <- c(rnorm(100),
           rbinom(100, size = 2, prob = 0.2),
           runif(100))
Put these in a data.frame
# shapiro.test() needs at least 3 values per group
df <- data.frame(Type = rep(Type, each = 100),
                 Size = rep(size, each = 100),
                 width = rep(width, each = 100),
                 Ratio)
Then you can compute ratio_log; in this case I took the liberty of just reusing Ratio as it is. Group by Type and use nest to get one nested data.frame per group.
df %>%
  mutate(ratio_log = Ratio) %>%
  group_by(Type) %>%
  mutate(N_Samples = n()) %>%
  nest()
# A tibble: 3 x 2
Type data
<fct> <list>
1 Bark <tibble [100 x 5]>
2 Redwood <tibble [100 x 5]>
3 Oak <tibble [100 x 5]>
You can then use map together with mutate, which is essentially lapply applied to the nested data.frames (or tibbles, which are essentially the same thing here). For each group's data.frame we apply shapiro.test to the values in the ratio_log column.
# Use tidyr::nest and purrr::map to run a Shapiro-Wilk test per group
df.shapiro <- df %>%
  mutate(ratio_log = Ratio) %>%
  group_by(Type) %>%
  mutate(N_Samples = n()) %>%
  nest() %>%
  mutate(Shapiro = map(data, ~ shapiro.test(.x$ratio_log)))
# A tibble: 3 x 3
Type data Shapiro
<fct> <list> <list>
1 Bark <tibble [100 x 5]> <S3: htest>
2 Redwood <tibble [100 x 5]> <S3: htest>
3 Oak <tibble [100 x 5]> <S3: htest>
Now you have the shapiro.test results nested in the Shapiro column, one per group.
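Since map is described above as basically lapply, the same per-group test can be sketched in base R with split and lapply. This is only an illustration on simulated data; the names toy, by_type, tests and p_values are made up for this example and are not part of the answer's code.

```r
# Simulated data: 100 draws per type (illustrative only)
set.seed(1)
toy <- data.frame(
  Type      = rep(c("Bark", "Redwood", "Oak"), each = 100),
  ratio_log = c(rnorm(100), rbinom(100, size = 2, prob = 0.2), runif(100))
)

# split() gives one numeric vector per Type; lapply() runs the test on each
by_type <- split(toy$ratio_log, toy$Type)
tests   <- lapply(by_type, shapiro.test)

# Pull the p-values out of the list of htest objects
p_values <- vapply(tests, function(x) x$p.value, numeric(1))
p_values
```

The nested-tibble version keeps the test objects next to the data they came from, which is the main reason to prefer it over a plain list.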
To get the relevant statistics you can use glance from the broom package, then unnest the result.
# Use broom::glance and tidyr::unnest to get all relevant statistics
library(broom)
df.shapiro.glance <- df.shapiro %>%
  mutate(glance_shapiro = Shapiro %>% map(glance)) %>%
  unnest(glance_shapiro)
Type data Shapiro statistic p.value method
<fct> <list> <list> <dbl> <dbl> <fct>
1 Bark <tibble [100 x 5]> <S3: htest> 0.967 1.30e- 2 Shapiro-Wilk normality test
2 Redwood <tibble [100 x 5]> <S3: htest> 0.638 2.45e-14 Shapiro-Wilk normality test
3 Oak <tibble [100 x 5]> <S3: htest> 0.937 1.31e- 4 Shapiro-Wilk normality test
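If only the test statistic and p-value are needed, one possible variation is to pull them out of each htest object with map_dbl and drop the list columns. The sketch below assumes simulated data shaped like the example above; toy and toy_summary are hypothetical names used only here.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Simulated data: 100 draws per type (illustrative only)
set.seed(1)
toy <- tibble(
  Type      = rep(c("Bark", "Redwood", "Oak"), each = 100),
  ratio_log = c(rnorm(100), rbinom(100, size = 2, prob = 0.2), runif(100))
)

toy_summary <- toy %>%
  group_by(Type) %>%
  nest() %>%
  mutate(
    # One htest object per group
    Shapiro   = map(data, ~ shapiro.test(.x$ratio_log)),
    # Extract plain numbers from each htest object
    statistic = map_dbl(Shapiro, ~ unname(.x$statistic)),
    p.value   = map_dbl(Shapiro, ~ .x$p.value)
  ) %>%
  ungroup() %>%
  # Keep only the flat columns
  select(Type, statistic, p.value)

toy_summary
```

This avoids the broom dependency but returns fewer fields than glance, for example no method column.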
Upvotes: 7
Reputation: 1615
library(dplyr)
Type <- c("Bark", "Redwood", "Oak")
size <- c(10,15,13)
width <- c(3,4,5)
Ratio <- size/width
df <- data.frame(Type, size, width, Ratio)
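df %>%
  mutate(ratio_log = log10(Ratio)) %>%
  group_by(Type) %>%
  # shapiro.test() needs 3 to 5000 values per group, so this errors on the
  # one-row-per-Type example above and is meant for the larger data set
  summarise(statistic = shapiro.test(ratio_log)$statistic,
            p.value = shapiro.test(ratio_log)$p.value)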
You can also see other solutions here: purrr map a t.test onto a split df
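Because shapiro.test only accepts between 3 and 5000 non-missing values, a grouped summarise cannot be tried on a data frame with a single row per Type. The following is a minimal sketch on simulated data, assuming 30 made-up ratios per type; toy and toy_result are hypothetical names introduced for illustration.

```r
library(dplyr)

# Simulated positive ratios so that log10() is defined (illustrative only)
set.seed(1)
toy <- data.frame(
  Type  = rep(c("Bark", "Redwood", "Oak"), each = 30),
  Ratio = runif(90, min = 1, max = 5)
)

toy_result <- toy %>%
  mutate(ratio_log = log10(Ratio)) %>%
  group_by(Type) %>%
  summarise(
    n       = n(),                                        # samples per group
    W       = unname(shapiro.test(ratio_log)$statistic),  # test statistic
    p.value = shapiro.test(ratio_log)$p.value
  )

toy_result
```

Inside summarise, ratio_log refers to the current group's values, so each row of the result corresponds to one Type.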
Upvotes: 0