stackinator

Reputation: 5819

rewrite a base R function with dplyr - utilizing filter instead of []

makeparts <- function(x, n) {
  x <- unique(c(0, x))
  x <- x[x >= 0 & x < n]
  x <- x[order(x)]
  x <- rep(c(seq_along(x)), diff(c(x, n)))
  x
}

makeparts(c(20, 30, 58), 100)

How would I rewrite this function using dplyr? I am pretty comfortable with the tidyverse but never learned base R, and I don't even know what the function above is doing. If I saw it in tidyverse syntax I could (probably) understand it, which is my ultimate goal.

All the tidyverse verbs make sense, but this [, x] [[df]] stuff doesn't.

Upvotes: 1

Views: 191

Answers (2)

tblznbits

Reputation: 6776

x looks like it is a vector. The first step uses unique, which is roughly what distinct does in the tidyverse. The next line uses the [ operator, which indexes a vector or matrix. The value inside [ ] should (for all intents and purposes) evaluate to either a vector of TRUE/FALSE values or a vector of positions; with a logical vector, as here, it works like filter in the tidyverse. The next line indexes x by order(x), which sorts it, the same as arrange.

The last step does two things. First, it concatenates x and n, which gives c(0, 20, 30, 58, 100), and runs diff on the result: diff subtracts the first element from the second, the second from the third, and so on. That gives c(20, 10, 28, 42), because 20 - 0 = 20, 30 - 20 = 10, and so forth. In the tidyverse you could get the same differences with the lag function. Second, it uses rep to repeat each value of seq_along(x), which in this example is 1, 2, 3, 4, by the matching difference: twenty 1s, ten 2s, twenty-eight 3s and forty-two 4s. The rep function does not have a direct tidyverse equivalent.

As was mentioned in the comments above, this does not convert cleanly to tidyverse functions, because those are built for data frames and you have a vector. I agree that you should learn base R. You can only get so far with the tidyverse.
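To make the steps described above concrete, here is a rough base R walk-through that stores each intermediate result under its own name. The names step1, step2, step3, widths and labels are mine, not from the original function, and the input is just the example from the question.

```r
# Walk through makeparts(c(20, 30, 58), 100) one step at a time.
# All intermediate names here are illustrative, not part of the original code.
x <- c(20, 30, 58)
n <- 100

step1  <- unique(c(0, x))                  # prepend 0, drop duplicates
step2  <- step1[step1 >= 0 & step1 < n]    # logical index keeps values in [0, n)
step3  <- step2[order(step2)]              # order() gives positions; same as sort(step2)
widths <- diff(c(step3, n))                # gap from each cut point to the next
labels <- seq_along(step3)                 # one integer label per part
result <- rep(labels, widths)              # label i repeated widths[i] times

print(widths)
print(table(result))
```

Printing widths and table(result) should show the same four counts, which is a quick way to see that each part's label is repeated once per position it covers.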

UPDATE:

Adding a tidyverse version of this code by request.

makeparts <- function(x, n) {
    x <- unique(c(0, x))
    x <- x[x >= 0 & x < n]
    x <- x[order(x)]
    x <- rep(c(seq_along(x)), diff(c(x, n)))
    x
}

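library(dplyr)

makeparts_tidyverse <- function(x, n) {
    # tibble() replaces the deprecated data_frame()
    df <- tibble(x = c(0, x)) %>%
        distinct() %>%
        filter(x >= 0 & x < n) %>%
        arrange(x) %>%
        bind_rows(tibble(x = n)) %>%    # append n as the final cut point
        mutate(lag_x = lag(x)) %>%
        mutate(y = x - lag_x) %>%       # width of each part
        filter(!is.na(y))               # drop the first row, which has no lag
    rep(seq_along(df$x), df$y)
}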

> makeparts(c(20, 30, 58), 100)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [21] 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
 [41] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4
 [61] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 [81] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

> makeparts_tidyverse(c(20, 30, 58), 100)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [21] 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
 [41] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4
 [61] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 [81] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
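Rather than comparing the printed output by eye, the two versions can be checked with identical(). This is only a sketch: it assumes dplyr is installed, uses tibble() in place of data_frame(), and the second test input (unsorted, with a duplicate, a negative value and a value above n) is made up for illustration.

```r
library(dplyr)

# Base R version, as in the question.
makeparts <- function(x, n) {
  x <- unique(c(0, x))
  x <- x[x >= 0 & x < n]
  x <- x[order(x)]
  rep(seq_along(x), diff(c(x, n)))
}

# dplyr version of the same steps.
makeparts_tidyverse <- function(x, n) {
  df <- tibble(x = c(0, x)) %>%
    distinct() %>%
    filter(x >= 0 & x < n) %>%
    arrange(x) %>%
    bind_rows(tibble(x = n)) %>%
    mutate(y = x - lag(x)) %>%
    filter(!is.na(y))
  rep(seq_along(df$x), df$y)
}

# The example from the question.
same_basic <- identical(makeparts(c(20, 30, 58), 100),
                        makeparts_tidyverse(c(20, 30, 58), 100))

# A messier, made-up input: unsorted, duplicated, out-of-range values.
messy <- c(30, 20, 20, 150, -5)
same_messy <- identical(makeparts(messy, 100),
                        makeparts_tidyverse(messy, 100))

print(c(same_basic, same_messy))
```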

Upvotes: 1

moodymudskipper

Reputation: 47340

Here's a reformatted version using more tidyverse-like code:

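library(purrr)  # provides keep() and re-exports %>%

x <- c(20, 30, 58)
n <- 100

# Note: unlike makeparts(), 0 is added after unique(), so this assumes
# x does not already contain 0.
x %>%
  unique %>%
  keep(~ . >= 0 & . < n) %>%
  sort %>%
  c(0, ., n) %>%
  diff %>%
  list(lengths = ., values = seq_along(.)) %>%
  inverse.rle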

# [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
# [31] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4
# [61] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
# [91] 4 4 4 4 4 4 4 4 4 4
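The last two steps of that pipe may look unfamiliar: inverse.rle() takes a list with lengths and values and expands it back into a full vector, which for this purpose does the same job as rep() in the original function. A small base R sketch, using the run lengths from the example (the names lens, runs, via_rle and via_rep are mine):

```r
# Run lengths from the example: the widths of the four parts.
lens <- c(20, 10, 28, 42)

# inverse.rle() expects a list with `lengths` and `values` components.
runs <- list(lengths = lens, values = seq_along(lens))

via_rle <- inverse.rle(runs)            # expand the run-length encoding
via_rep <- rep(seq_along(lens), lens)   # what makeparts() does in its last line

print(identical(via_rle, via_rep))
```

Going the other way, rle(via_rep) would recover the lengths and values, which is why the pair is handy when you think of the result as "runs" of labels.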

Upvotes: 2
