Range to histogram

Question

I'm trying to build a histogram from my data. It looks like this: a data frame in which each row holds a range of years (or a single year). I need a histogram of all the values covered by the ranges in my df.

year <- c("1925:2002",
          "2008",
          "1925:2002",
          "1925:2002",
          "1925:2002",
          "2008:2013",
          "1934",
          "1972:1988")

All I was able to figure out is to convert every string to a sequence with seq(), but it doesn't work properly:

for (i in 1:length(year)) {
  rr[i] <- seq(
    as.numeric(unlist(strsplit(year[i], ":"))[1]),
    as.numeric(unlist(strsplit(year[i], ":"))[2])
  )
}


hrbrmstr · Accepted Answer

Tick the answer box for @MrFlick. I had done this at the same time and the only difference is the piping:

library(magrittr)

strsplit(year, ":") %>% 
  lapply(as.integer) %>% 
  lapply(function(x) seq(x[1], x[length(x)])) %>% 
  unlist() %>% 
  hist()
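
For anyone who wants to see what each stage of the pipeline produces, here is a minimal sketch of the same steps without pipes. The intermediate names `bounds`, `expanded` and `all_years` are made up for illustration and are not part of the original answer.

```r
year <- c("1925:2002", "2008", "1925:2002", "1925:2002",
          "1925:2002", "2008:2013", "1934", "1972:1988")

# Split each string on ":" and convert the pieces to integers.
# A range gives two numbers, a single year gives one.
bounds <- lapply(strsplit(year, ":"), as.integer)

# x[length(x)] is the last element, so for a single year such as "2008"
# the start and end are the same and seq() returns just that year.
expanded <- lapply(bounds, function(x) seq(x[1], x[length(x)]))

# Flatten the list of sequences into one integer vector and plot it.
all_years <- unlist(expanded)
hist(all_years)
```

Using `x[length(x)]` rather than `x[2]` is what lets single-year entries pass through; with `x[2]` they would yield `NA` as the end point and `seq()` would stop with an error.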

Full-on tidyverse:

library(tidyverse)

str_split(year, ":") %>%
  map(as.integer) %>% 
  map(~seq(.x[1], .x[length(.x)])) %>% 
  flatten_int() %>% 
  hist()

To defend my comments, lest any tidyverse 4eva folks join the fray, here is a benchmark of the base R version against the tidyverse one:

library(tidyverse)
library(microbenchmark)

microbenchmark(
  base = as.integer(
    unlist(
      lapply(
        lapply(
          strsplit(year, ":"),
          as.integer
        ),
        function(x) seq(x[1], x[length(x)])
      ),
      use.names = FALSE
    )
  ),
  tidy = str_split(year, ":") %>%
    map(as.integer) %>% 
    map(~seq(.x[1], .x[length(.x)])) %>% 
    flatten_int()
)
## Unit: microseconds
##  expr     min      lq     mean   median       uq      max neval
##  base  89.099  96.699 132.1684 102.5895 110.7165 2895.428   100
##  tidy 631.817 647.812 672.5904 667.8250 686.2740  909.531   100
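
As for the loop in the question, my reading is that it fails for three reasons: `rr` is never created, `rr[i] <-` can hold only a single value per slot while `seq()` returns many, and entries such as "2008" have no second element, so the end point becomes `NA`. A minimal sketch of a repaired loop, assuming a list is an acceptable container (the helper name `b` is only for illustration):

```r
year <- c("1925:2002", "2008", "1925:2002", "1925:2002",
          "1925:2002", "2008:2013", "1934", "1972:1988")

# Pre-allocate a list: each slot can hold a sequence of any length.
rr <- vector("list", length(year))

for (i in seq_along(year)) {
  # Split once and reuse the result instead of calling strsplit() twice.
  b <- as.numeric(strsplit(year[i], ":")[[1]])
  # b[length(b)] is the end of the range, or the year itself if there is no ":".
  rr[[i]] <- seq(b[1], b[length(b)])
}

hist(unlist(rr))
```

This does the same work as the `lapply()` versions above; the loop is just more verbose.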
