Martin Gal

Reputation: 16998

Is there a base R version of tidyr's unnest() function?

I've been using tidyverse quite a lot and now I'm interested in the possibilities of base R.

Let's take a look at this simple data.frame

df <- data.frame(id = 1:4, nested = c("a, b, f", "c, d", "e", "e, f"))

Using dplyr, stringr and tidyr we could do

library(dplyr)
library(stringr)
library(tidyr)

df %>% 
  mutate(nested = str_split(nested, ", ")) %>% 
  unnest(nested)

to get (let's ignore the tibble part)

# A tibble: 8 x 2
     id nested
  <int> <chr> 
1     1 a     
2     1 b     
3     1 f     
4     2 c     
5     2 d     
6     3 e     
7     4 e     
8     4 f    

Now we want to rebuild this one using base R tools. So

transform(df, nested = strsplit(nested, ", "))

gives us the mutate part, but how can we unnest() this data.frame? I thought of using unlist() but couldn't find a satisfying way.

Upvotes: 6

Views: 641

Answers (1)

akrun

Reputation: 887981

We could use stack on a named list in a single line

with(df, setNames(stack(setNames(strsplit(nested, ", "), id))[2:1], names(df)))

-output

   id nested
1  1      a
2  1      b
3  1      f
4  2      c
5  2      d
6  3      e
7  4      e
8  4      f
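
One caveat with the stack approach: stack returns its ind column as a factor, so id comes back as a factor rather than an integer. A minimal sketch for converting it back, assuming the ids are integers to begin with and R >= 4.0 (so nested is a character column); res is just a name introduced here for illustration:

```r
df <- data.frame(id = 1:4, nested = c("a, b, f", "c, d", "e", "e, f"))

# same one-liner as above, stored so we can inspect it
res <- with(df, setNames(stack(setNames(strsplit(nested, ", "), id))[2:1], names(df)))
class(res$id)  # stack() stores the list names as a factor

# convert the factor labels back to integers (go via character, not the factor codes)
res$id <- as.integer(as.character(res$id))
```

Going through as.character first matters: calling as.integer directly on a factor returns the internal level codes, which only happen to match the labels when the ids are 1, 2, 3, ...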

If we start from transform, we can use rep to replicate id according to the lengths of the list column, and unlist to flatten the list column itself

out <- transform(df, nested = strsplit(nested, ", "))
data.frame(id = rep(out$id, lengths(out$nested)), nested = unlist(out$nested))
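
The rep/unlist idea can be generalised to data frames with more than one other column by replicating row indices instead of a single column. This is only a sketch: unnest_base is a hypothetical helper name, and it assumes the column holds delimited strings with a fixed separator.

```r
df <- data.frame(id = 1:4, nested = c("a, b, f", "c, d", "e", "e, f"))

# hypothetical helper, not part of base R
unnest_base <- function(data, col, sep = ", ") {
  # split each string into its pieces (fixed = TRUE: sep is not a regex)
  parts <- strsplit(as.character(data[[col]]), sep, fixed = TRUE)
  # repeat each row once per piece, keeping all other columns
  out <- data[rep(seq_len(nrow(data)), lengths(parts)), , drop = FALSE]
  # overwrite the replicated strings with the flattened pieces
  out[[col]] <- unlist(parts)
  rownames(out) <- NULL
  out
}

unnest_base(df, "nested")
```

Indexing by repeated row numbers keeps every other column aligned without naming them one by one, which is the part that the data.frame(id = rep(...)) version has to spell out by hand.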

Upvotes: 1
