user10072460

Monte Carlo simulation in R using tidyverse

Here is my data:

df1<-read.table(text=" x y

2   20
3   36
3   48
1   20
3   40
3   32
1   16
1   20
3   24
3   28
3   32
4   36
2   20
4   44
4   36
4   40
4   48
3   40
4   52
4   52
4   52
4   44
4   48
4   52
1   16
3   32
4   52
3   32
3   36

",header=TRUE)

I want to run a Monte Carlo simulation using df1.

Here is what I have tried so far:

library(dplyr)
df2 <- df1 %>% sample_n(size = 1000, replace = TRUE)
lm(y ~ x, data = df2)

Is this correct? Could it be done better? Do I need to calculate "a" and "b" first and then simulate from df1? If so, could you show me how, please?

Upvotes: 0

Views: 731

Answers (2)

Bruno

Reputation: 4150

Here is another, much less clear, answer that bootstraps the model with rsample:

library(tidymodels)
set.seed(42)
bootstrap_data <- df1 %>% 
  rsample::bootstraps(100)

fit_lm_on_bootstrap <- function(split) {
  # Fit on the resampled (analysis) rows of this bootstrap split
  lm(y ~ x, data = analysis(split))
}


boot_models <- bootstrap_data %>% 
  mutate(model = map(.x = splits,fit_lm_on_bootstrap),
         tidy_results = map(model,tidy)) %>% 
  unnest(tidy_results)

boot_models %>%
  filter(term == "(Intercept)") %>% 
  summarise_at(vars(estimate:p.value),mean)

# A tibble: 1 x 4
  estimate std.error statistic p.value
     <dbl>     <dbl>     <dbl>   <dbl>
1     4.07      3.77      1.23   0.298

boot_models %>%
  filter(term == "x") %>% 
  summarise_at(vars(estimate:p.value),mean)

# A tibble: 1 x 4
  estimate std.error statistic     p.value
     <dbl>     <dbl>     <dbl>       <dbl>
1     10.4      1.16      9.25 0.000000136
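Averaging the per-model standard errors and p-values, as above, is only a rough summary. A more common way to read a bootstrap is to look at the spread of the estimates themselves. Note also the difference from the code in the question: `sample_n(size = 1000)` draws one large sample and fits a single model, while a bootstrap refits the model on many resamples of the original size. Here is a minimal sketch of percentile intervals, assuming 1000 resamples and a 95% level (both are my choices, not something from the question); `boot_est` and `boot_ci` are just illustrative names:

```r
library(dplyr)
library(purrr)
library(tidyr)
library(rsample)
library(broom)

# Same data as df1 in the question, written out so the sketch runs on its own
df1 <- data.frame(
  x = c(2, 3, 3, 1, 3, 3, 1, 1, 3, 3, 3, 4, 2, 4, 4,
        4, 4, 3, 4, 4, 4, 4, 4, 4, 1, 3, 4, 3, 3),
  y = c(20, 36, 48, 20, 40, 32, 16, 20, 24, 28, 32, 36, 20, 44, 36,
        40, 48, 40, 52, 52, 52, 44, 48, 52, 16, 32, 52, 32, 36)
)

set.seed(42)

# Refit y ~ x on each bootstrap resample and collect the coefficients
boot_est <- bootstraps(df1, times = 1000) %>%
  mutate(tidy_results = map(splits, ~ tidy(lm(y ~ x, data = analysis(.x))))) %>%
  unnest(tidy_results)

# Summarise the bootstrap distribution of each coefficient
boot_ci <- boot_est %>%
  group_by(term) %>%
  summarise(
    mean_estimate = mean(estimate),
    boot_se       = sd(estimate),                               # bootstrap standard error
    lower         = quantile(estimate, 0.025, names = FALSE),   # 2.5th percentile
    upper         = quantile(estimate, 0.975, names = FALSE)    # 97.5th percentile
  )

boot_ci
```

The `boot_se` column is the standard deviation of the resampled estimates, which is the bootstrap analogue of the `std.error` that `lm()` reports.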

Upvotes: 2

Bruno

Reputation: 4150

One cool way is to use the infer package:

library(tidyverse)
library(infer)

# Bootstrap distribution of the correlation between x and y
df1 %>%
  specify(y ~ x) %>%
  generate(reps = 100, type = "bootstrap") %>%
  calculate(stat = "correlation") %>%
  summarise(mean_cor = mean(stat), sd = sd(stat))

df1 %>%
  specify(y ~ x) %>%
  generate(reps = 100, type = "bootstrap") %>%
  calculate(stat = "slope") %>% 
  summarise(beta = mean(stat), sd = sd(stat))
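As for calculating "a" and "b" first and then simulating: that would be a parametric Monte Carlo rather than a bootstrap. Below is a rough sketch of how it could look, assuming the usual linear model with independent normal errors (that is my assumption, and the data may not satisfy it). I use base R here to keep it dependency-free; `sigma_hat`, `n_sims` and `sim_coefs` are just names I picked:

```r
# Same data as df1 in the question, written out so the sketch runs on its own
df1 <- data.frame(
  x = c(2, 3, 3, 1, 3, 3, 1, 1, 3, 3, 3, 4, 2, 4, 4,
        4, 4, 3, 4, 4, 4, 4, 4, 4, 1, 3, 4, 3, 3),
  y = c(20, 36, 48, 20, 40, 32, 16, 20, 24, 28, 32, 36, 20, 44, 36,
        40, 48, 40, 52, 52, 52, 44, 48, 52, 16, 32, 52, 32, 36)
)

# Step 1: estimate a (intercept), b (slope) and the residual sd from the data
fit <- lm(y ~ x, data = df1)
a <- coef(fit)[["(Intercept)"]]
b <- coef(fit)[["x"]]
sigma_hat <- sigma(fit)

# Step 2: simulate new y values from the fitted model and refit each time
set.seed(42)
n_sims <- 1000
sim_coefs <- t(replicate(n_sims, {
  y_sim <- a + b * df1$x + rnorm(nrow(df1), mean = 0, sd = sigma_hat)
  coef(lm(y_sim ~ df1$x))
}))
colnames(sim_coefs) <- c("intercept", "slope")

# Step 3: summarise the simulated sampling distribution
colMeans(sim_coefs)        # should be close to a and b
apply(sim_coefs, 2, sd)    # Monte Carlo standard errors
```

The bootstrap versions resample the observed rows and make no distributional assumption, while this one keeps x fixed and only trusts the fitted model, so which is "better" depends on how much you believe the normal-error assumption.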

Upvotes: 0
