Reputation: 1355
I am trying to use the permute function in modelr with purrr map to calculate the mean values of two categories of data under permutation.
The function behaves as one would expect if I am trying to calculate linear models off of the permuted data sets, as per the example file for modelr::permute (though I am running the linear model inside of a custom function):
library(tidyverse)
library(modelr)
perms <- permute(mtcars, 1000, mpg)
jlm <- function(df){lm(mpg ~ wt, data = df)}
models3 <- map(perms$perm, jlm)
models3[[1]]
Call:
lm(formula = mpg ~ wt, data = df)

Coefficients:
(Intercept)           wt
     28.211       -2.524
Now, instead of a linear model, I just want mean values for two categories in that data set. I tried running as follows.
mean_of_vs <- function(df){
  df %>%
    group_by(vs) %>%
    summarize(mean(mpg)) %>%
    spread(vs, `mean(mpg)`) %>%
    rename(zero = `0`, one = `1`)
}
models4 <- map(perms$perm, ~mean_of_vs)
models4[[1]]
but this just returns the function definition, rather than the output of the function:
function(df){
  df %>%
    group_by(vs) %>%
    summarize(mean(mpg)) %>%
    spread(vs, `mean(mpg)`) %>%
    rename(zero = `0`, one = `1`)
}
The function works by itself on a simple data frame:
test <- perms %>% pull(perm) %>% .[[1]] %>% as.data.frame
mean_of_vs(test)
# A tibble: 1 x 2
   zero   one
  <dbl> <dbl>
1  16.6  24.5
So my question is, why doesn't my custom function return a bunch of one line data frames with the mean value of vs = 0 and vs = 1 and how would I get it to do this?
Thanks.
Upvotes: 0
Views: 123
Reputation: 776
I am glad to meet you.
modelr::permute produces objects whose class is 'permutation':
> class(perms[[1]][1][[1]])
[1] "permutation"
A permutation object is a list with 3 elements:
data: the original data frame
columns: the names of the columns you permute
idx: indexes indicating the order in which rows have been selected
I think a permutation object is only accepted directly by certain modeling functions (such as lm; I am not sure of the full list).
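A quick way to check that lm() really sees the permuted rows is to fit the same model on the permutation object and on its as.data.frame() conversion. This is only a minimal sketch: the seed is arbitrary, and the idea that lm() coerces its data argument through as.data.frame() is my reading of the mechanism, not something confirmed in this thread.

```r
library(modelr)

set.seed(42)  # arbitrary seed, only for reproducibility
perm <- permute(mtcars, 1, mpg)$perm[[1]]

# Fit once on the permutation object itself ...
fit_direct <- lm(mpg ~ wt, data = perm)

# ... and once on the explicitly converted data frame.
fit_converted <- lm(mpg ~ wt, data = as.data.frame(perm))

# If the assumption holds, both fits have the same coefficients.
all.equal(coef(fit_direct), coef(fit_converted))
```

If the two sets of coefficients agree, the conversion is what lm() relies on, and a dplyr pipeline needs the same conversion done by hand.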
So if you want to apply your own function, first convert the permutation to a data.frame, data.table, or tibble, as below:
mean_of_vs <- function(df){
  df %>%
    as.data.frame() %>%
    group_by(vs) %>%
    summarize(mean(mpg)) %>%
    spread(vs, `mean(mpg)`) %>%
    rename(zero = `0`, one = `1`)
}
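Then call map with the function itself, without the ~ notation. Written as ~mean_of_vs, the formula defines an anonymous function that simply returns mean_of_vs without calling it, which is why the function definition was printed.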
models4 <- map(perms$perm, mean_of_vs)
Then you will get the result
.....
[[97]]
# A tibble: 1 x 2
zero one
<dbl> <dbl>
1 21.4 18.4
[[98]]
# A tibble: 1 x 2
zero one
<dbl> <dbl>
1 20.4 19.7
.....
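To see the effect of the ~ in isolation, here is a small sketch on a plain list. The helper double_it and the inputs are hypothetical and exist only for illustration; they are not part of the question.

```r
library(purrr)

# Hypothetical helper used only for illustration
double_it <- function(x) x * 2

inputs <- list(1, 2, 3)

# The formula body is just the function's name, so each call
# returns the function object itself instead of calling it.
returns_function <- map(inputs, ~ double_it)

# Passing the function directly calls it on every element ...
called_directly <- map(inputs, double_it)

# ... as does calling it explicitly inside the formula via .x.
called_in_formula <- map(inputs, ~ double_it(.x))

is.function(returns_function[[1]])             # TRUE
identical(called_directly, called_in_formula)  # TRUE
```

So either map(perms$perm, mean_of_vs) or map(perms$perm, ~ mean_of_vs(.x)) should behave the same way here.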
Upvotes: 1
Reputation: 22588
permute() returns a list column of <S3: permutation> objects, not data frames.
> perms
# A tibble: 1,000 x 2
perm .id
<list> <chr>
1 <S3: permutation> 0001
2 <S3: permutation> 0002
3 <S3: permutation> 0003
4 <S3: permutation> 0004
5 <S3: permutation> 0005
6 <S3: permutation> 0006
7 <S3: permutation> 0007
8 <S3: permutation> 0008
9 <S3: permutation> 0009
10 <S3: permutation> 0010
# ... with 990 more rows
Examining it reveals the data frame is stored as the first element in the named list:
> glimpse(perms[[1,1]])
List of 3
$ data :'data.frame': 32 obs. of 11 variables:
..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
..$ disp: num [1:32] 160 160 108 258 360 ...
..$ hp : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
..$ wt : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
..$ vs : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
..$ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
$ columns: Named chr "mpg"
..- attr(*, "names")= chr "mpg"
$ idx : int [1:32] 1 30 21 12 27 14 17 2 15 32 ...
- attr(*, "class")= chr "permutation"
So to do what you want, just access the data element in the first step of your mean_of_vs() function:
mean_of_vs <- function(df) {
df$data %>%
group_by(vs) %>%
summarize(mean(mpg)) %>%
spread(vs, `mean(mpg)`) %>%
rename(zero = `0`, one = `1`)
}
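Now map() runs the function on every element. Note that data holds the original, unpermuted data frame (its mpg values in the glimpse output are in mtcars order), so df$data yields the same means for every permutation; as.data.frame(df) is what applies idx to the permuted column. The output with df$data: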
> models4 <- map(perms$perm, mean_of_vs)
> models4[[1]]
# A tibble: 1 x 2
zero one
<dbl> <dbl>
1 16.6 24.6
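Whether the data element or as.data.frame() gives the shuffled values can be checked directly. The sketch below is a minimal comparison under the assumption that the permutation object has the data, columns and idx elements shown in the glimpse output; the seed is arbitrary.

```r
library(modelr)

set.seed(1)  # arbitrary seed, only for reproducibility
perm <- permute(mtcars, 1, mpg)$perm[[1]]

stored <- perm$data              # data frame kept inside the object
shuffled <- as.data.frame(perm)  # conversion that applies the permutation

# Is the stored mpg column still in the original mtcars order?
all(stored$mpg == mtcars$mpg)

# Does the converted column hold the same values, possibly reordered?
identical(sort(shuffled$mpg), sort(mtcars$mpg))

# Columns that were not permuted should be unchanged.
all(shuffled$vs == mtcars$vs)
```

If the first check is TRUE, group means computed from df$data are the unpermuted ones, and the as.data.frame() route is the one that reflects each permutation.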
Upvotes: 2