Reputation: 1584
I'm here again to ask the colleagues to help me to improve this code. Since this forum is so receptive, I would like to hear your thoughts about this situation!
The question is pretty simple but needs knowledge about looping / for() / by() etc and I'm almost autodidact when programming in R:
I'm working on a dataset where I have estimates for proprtions and I want to compute the lower and upper boundaries for the confidence interval. The work consists of transforming this:
to this:
There is a package that computes that for me
library(PropCIs)
The solution was intuitive (for me).
1) Get every point estimate,
2) compute its lower and upper boundaries and
3) move to the next row and
4) Return to the first step.
My workaround was based on this post here and is described below step by step. However, I imagine this solution is too slow, naive or works like I'm talking with a strange accent to any native data scientist. Then, I'm wondering if tidyverse could help me to improve that.
library(PropCIs)
library(tidyverse)
set.seed(123)
ds <- data.frame(estimate = runif(15, min=0, max=1),
sample = sample(x = 10:15, 15, replace = T))
ds <- ds %>% mutate(lower = '')
#looping
for(i in 1:nrow(ds)) {
ds$lower[i] <- blakerci(ds$sample[i], 3449, conf.level=0.95)
}
#row to columns
ds <- separate(data = ds, col = lower, into = c("lower", "upper"), sep = ",")
#replace strings
ds <- ds %>% mutate(lower = gsub("c(", "", lower, fixed = TRUE),
upper = gsub(")", "", upper, fixed = TRUE))
#Transform to numeric
ds <- ds %>% mutate_at(vars(lower, upper), funs(as.numeric(.)))
As always, thanks much for all support!
Please take into consideration this post has a reproducible script and can help other people! =)
Upvotes: 0
Views: 359
Reputation: 883
I don't use PropCIs
before, but it seems cool to think in probability.
Here is the code in tidyverse
style.
library(PropCIs)
library(tidyverse)
library(data.table)
#>
#> 载入程辑包:'data.table'
#> The following objects are masked from 'package:dplyr':
#>
#> between, first, last
#> The following object is masked from 'package:purrr':
#>
#> transpose
set.seed(123)
data.table(
# data.table is faster.
estimate = runif(15, min=0, max=1)
,sample = sample(x = 10:15, 15, replace = T)
) %>%
mutate(
ci = map(sample
,~blakerci(., 3449, conf.level=0.95) %>%
.$conf.int %>%
.[1:2]
# get lower and upper
)
# why n = 3449? Any assumption?
) %>%
mutate(
lower = map_dbl(ci,~.[1])
,upper = map_dbl(ci,~.[2])
) %>%
select(-ci)
#> estimate sample lower upper
#> 1 0.2875775 15 0.002476132 0.007123028
#> 2 0.7883051 11 0.001593144 0.005669411
#> 3 0.4089769 10 0.001481219 0.005245599
#> 4 0.8830174 11 0.001593144 0.005669411
#> 5 0.9404673 15 0.002476132 0.007123028
#> 6 0.0455565 15 0.002476132 0.007123028
#> 7 0.5281055 14 0.002350896 0.006691190
#> 8 0.8924190 13 0.002038424 0.006396837
#> 9 0.5514350 15 0.002476132 0.007123028
#> 10 0.4566147 13 0.002038424 0.006396837
#> 11 0.9568333 14 0.002350896 0.006691190
#> 12 0.4533342 13 0.002038424 0.006396837
#> 13 0.6775706 13 0.002038424 0.006396837
#> 14 0.5726334 11 0.001593144 0.005669411
#> 15 0.1029247 10 0.001481219 0.005245599
Created on 2018-11-07 by the reprex package (v0.2.1)
devtools::session_info()
#> Session info -------------------------------------------------------------
#> setting value
#> version R version 3.5.1 (2018-07-02)
#> system x86_64, darwin15.6.0
#> ui X11
#> language (EN)
#> collate zh_CN.UTF-8
#> tz Asia/Shanghai
#> date 2018-11-07
#> Packages -----------------------------------------------------------------
#> package * version date source
#> assertthat 0.2.0 2017-04-11 CRAN (R 3.5.0)
#> backports 1.1.2 2017-12-13 CRAN (R 3.5.0)
#> base * 3.5.1 2018-07-05 local
#> bindr 0.1.1 2018-03-13 CRAN (R 3.5.0)
#> bindrcpp * 0.2.2 2018-03-29 CRAN (R 3.5.0)
#> broom 0.5.0 2018-07-17 CRAN (R 3.5.0)
#> cellranger 1.1.0 2016-07-27 CRAN (R 3.5.0)
#> cli 1.0.0 2017-11-05 CRAN (R 3.5.0)
#> colorspace 1.3-2 2016-12-14 CRAN (R 3.5.0)
#> compiler 3.5.1 2018-07-05 local
#> crayon 1.3.4 2017-09-16 CRAN (R 3.5.0)
#> data.table * 1.11.8 2018-09-30 cran (@1.11.8)
#> datasets * 3.5.1 2018-07-05 local
#> devtools 1.13.6 2018-06-27 CRAN (R 3.5.0)
#> digest 0.6.16 2018-08-22 cran (@0.6.16)
#> dplyr * 0.7.6 2018-06-29 CRAN (R 3.5.1)
#> evaluate 0.11 2018-07-17 CRAN (R 3.5.0)
#> forcats * 0.3.0 2018-02-19 CRAN (R 3.5.0)
#> ggplot2 * 3.0.0 2018-07-03 CRAN (R 3.5.0)
#> glue 1.3.0 2018-07-17 CRAN (R 3.5.0)
#> graphics * 3.5.1 2018-07-05 local
#> grDevices * 3.5.1 2018-07-05 local
#> grid 3.5.1 2018-07-05 local
#> gtable 0.2.0 2016-02-26 CRAN (R 3.5.0)
#> haven 1.1.2 2018-06-27 CRAN (R 3.5.0)
#> hms 0.4.2 2018-03-10 CRAN (R 3.5.0)
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.5.0)
#> httr 1.3.1 2017-08-20 CRAN (R 3.5.0)
#> jsonlite 1.5 2017-06-01 CRAN (R 3.5.0)
#> knitr 1.20 2018-02-20 CRAN (R 3.5.0)
#> lattice 0.20-35 2017-03-25 CRAN (R 3.5.1)
#> lazyeval 0.2.1 2017-10-29 CRAN (R 3.5.0)
#> lubridate 1.7.4 2018-04-11 CRAN (R 3.5.0)
#> magrittr 1.5 2014-11-22 CRAN (R 3.5.0)
#> memoise 1.1.0 2017-04-21 CRAN (R 3.5.0)
#> methods * 3.5.1 2018-07-05 local
#> modelr 0.1.2 2018-05-11 CRAN (R 3.5.0)
#> munsell 0.5.0 2018-06-12 CRAN (R 3.5.0)
#> nlme 3.1-137 2018-04-07 CRAN (R 3.5.1)
#> pillar 1.3.0 2018-07-14 CRAN (R 3.5.0)
#> pkgconfig 2.0.1 2017-03-21 CRAN (R 3.5.0)
#> plyr 1.8.4 2016-06-08 CRAN (R 3.5.0)
#> PropCIs * 0.3-0 2018-02-23 CRAN (R 3.5.0)
#> purrr * 0.2.5 2018-05-29 CRAN (R 3.5.0)
#> R6 2.3.0 2018-10-04 cran (@2.3.0)
#> Rcpp 0.12.19 2018-10-01 cran (@0.12.19)
#> readr * 1.1.1 2017-05-16 CRAN (R 3.5.0)
#> readxl 1.1.0 2018-04-20 CRAN (R 3.5.0)
#> rlang 0.2.2 2018-08-16 cran (@0.2.2)
#> rmarkdown 1.10 2018-06-11 CRAN (R 3.5.0)
#> rprojroot 1.3-2 2018-01-03 CRAN (R 3.5.0)
#> rvest 0.3.2 2016-06-17 CRAN (R 3.5.0)
#> scales 1.0.0 2018-08-09 CRAN (R 3.5.0)
#> stats * 3.5.1 2018-07-05 local
#> stringi 1.2.4 2018-07-20 CRAN (R 3.5.0)
#> stringr * 1.3.1 2018-05-10 CRAN (R 3.5.0)
#> tibble * 1.4.2 2018-01-22 CRAN (R 3.5.0)
#> tidyr * 0.8.1 2018-05-18 CRAN (R 3.5.0)
#> tidyselect 0.2.5 2018-10-11 cran (@0.2.5)
#> tidyverse * 1.2.1 2017-11-14 CRAN (R 3.5.0)
#> tools 3.5.1 2018-07-05 local
#> utils * 3.5.1 2018-07-05 local
#> withr 2.1.2 2018-03-15 CRAN (R 3.5.0)
#> xml2 1.2.0 2018-01-24 CRAN (R 3.5.0)
#> yaml 2.2.0 2018-07-25 CRAN (R 3.5.0)
Upvotes: 1