Cedric

Reputation: 2474

How to extract a vector in a tibble column to multiple columns in the same tibble?

The following code produces a vector of length 6 in the last column. From that column, I would like to extract 6 new columns in my tibble.

require(tidyverse)
require(purrrlyr)
# this function will return a vector of the same length (6) for each group
fun=function(X,Y){
   mycut<-cut(X,breaks=seq(50,350,by=50),right=FALSE)
   v<-tapply(Y,mycut,sum)
   return(v)
}
# use the previous function to sum gears per class of hp, within each cyl group
mtcars %>%
    group_by(cyl)  %>%   
    by_slice(~fun(.x$hp,.x$gear)) %>%
    rename(cut=.out)

This gives me a list column, cut, holding one vector per group:

# tibble [3 x 2]
     cyl       cut
  <fctr>    <list>
1      4 <dbl [6]>
2      6 <dbl [6]>
3      8 <dbl [6]>

What command do I need to get from this list column to a table like the following?

cyl  [50,100) [100,150) [150,200) [200,250) [250,300) [300,350) 
   4     36         9        NA        NA        NA        NA 
   ...

unnest does not work. Do I have to work with by_row or is there a more straightforward answer?

Upvotes: 1

Views: 635

Answers (2)

Peter H.

Reputation: 2164

I would suggest another approach. Instead of using the deprecated by_slice() function (which now resides in the purrrlyr package), you can split the data by cyl, map the function over the pieces, and bind the results by row:

mtcars %>% 
  split(.$cyl) %>% 
  map(~fun(.x$hp,.x$gear)) %>% 
  do.call(rbind, .)

This gives the following output:

  [50,100) [100,150) [150,200) [200,250) [250,300) [300,350)
4       36         9        NA        NA        NA        NA
6       NA        22         5        NA        NA        NA
8       NA        NA        21        15         5         5
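
The result above is a matrix with the cyl groups as row names. If a tibble with cyl as a regular column is needed, as in the question, one possible sketch is to convert the matrix with as_tibble(). This assumes a tibble version whose as_tibble() supports the rownames argument. The names wide and result are illustrative only, and lapply() stands in for map() so that the example needs only base R and tibble.

```r
library(tibble)

# Same function as in the question: sum Y within each class of X
fun <- function(X, Y) {
  mycut <- cut(X, breaks = seq(50, 350, by = 50), right = FALSE)
  tapply(Y, mycut, sum)
}

# Split by cyl, apply fun to each piece, and bind the vectors by row
# (lapply() is used here in place of map(); this is an assumption for brevity)
wide <- do.call(
  rbind,
  lapply(split(mtcars, mtcars$cyl), function(d) fun(d$hp, d$gear))
)

# Move the row names (the cyl groups) into a regular column named "cyl"
result <- as_tibble(wide, rownames = "cyl")
result
```

Note that cyl comes back as a character column here, because it is built from row names. Convert it with as.numeric() if a numeric column is required.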

Upvotes: 1

akrun

Reputation: 886938

We need to store the names of each vector in the 'cut' column as a new column, unnest the list elements, and then use spread to reshape to 'wide' format. Converting the names to a factor keeps the new columns in their original order.

mtcars %>%
   group_by(cyl)  %>%   
   by_slice(~fun(.x$hp,.x$gear)) %>%
   rename(cut=.out) %>%
   mutate(Names = map(cut, ~factor(names(.x), levels = names(.x)))) %>%
   unnest %>%
   spread(Names, cut)
# A tibble: 3 x 7
#    cyl `[50,100)` `[100,150)` `[150,200)` `[200,250)` `[250,300)` `[300,350)`
#*  <dbl>      <dbl>       <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#1     4         36           9          NA          NA          NA          NA
#2     6         NA          22           5          NA          NA          NA
#3     8         NA          NA          21          15           5           5

Upvotes: 1
