amatsuo_net

Reputation: 2448

Extracting part of a vector and concatenating it in a tidy way

I have a tibble where one variable is a list-column of character vectors. I want to extract part of each vector and concatenate the pieces into a new field. Two other fields give the start index and end index of the part to extract. This is a toy example.

library(tidyverse)
df <- 
  tibble(id = seq(4), 
       y = list(letters[1:4], letters[2:7], letters[3:10], letters[3:7]), 
       start_pos = c(2, 1, 3, 2), 
       end_pos = c(2, 3, 5, 3)) 
df
#> # A tibble: 4 x 4
#>      id y         start_pos end_pos
#>   <int> <list>        <dbl>   <dbl>
#> 1     1 <chr [4]>         2       2
#> 2     2 <chr [6]>         1       3
#> 3     3 <chr [8]>         3       5
#> 4     4 <chr [5]>         2       3

I came up with the following solution, but it (especially the lapply part) seems unnecessarily complicated. Is there a smarter way to get the same result?

df %>%
  mutate(strng = 
         lapply(seq(length(y)), function(x) y[[x]][start_pos[x]:end_pos[x]] %>% paste(collapse = " "))) %>% 
  unnest(strng)
#> # A tibble: 4 x 5
#>      id y         start_pos end_pos strng
#>   <int> <list>        <dbl>   <dbl> <chr>
#> 1     1 <chr [4]>         2       2 b    
#> 2     2 <chr [6]>         1       3 b c d
#> 3     3 <chr [8]>         3       5 e f g
#> 4     4 <chr [5]>         2       3 d e

Created on 2020-06-06 by the reprex package (v0.3.0)

Upvotes: 2

Views: 90

Answers (2)

Darren Tsai

Reputation: 35554

You can use rowwise(), so that y refers to the current row's vector rather than the whole list-column:

library(dplyr)

df %>%
  rowwise() %>% 
  mutate(strng = paste(y[start_pos:end_pos], collapse = " ")) %>%
  ungroup()

# # A tibble: 4 x 5
#      id y         start_pos end_pos strng
#   <int> <list>        <dbl>   <dbl> <chr>
# 1     1 <chr [4]>         2       2 b    
# 2     2 <chr [6]>         1       3 b c d
# 3     3 <chr [8]>         3       5 e f g
# 4     4 <chr [5]>         2       3 d e  

Upvotes: 2

Ronak Shah

Reputation: 388962

We can use pmap_chr() from purrr:

library(dplyr)
df %>%
  mutate(strng = purrr::pmap_chr(list(y, start_pos, end_pos), 
                           ~paste(..1[..2:..3], collapse = " ")))

# A tibble: 4 x 5
#     id y         start_pos end_pos strng
#  <int> <list>        <dbl>   <dbl> <chr>
#1     1 <chr [4]>         2       2 b    
#2     2 <chr [6]>         1       3 b c d
#3     3 <chr [8]>         3       5 e f g
#4     4 <chr [5]>         2       3 d e  

In base R, we can use mapply():

df$strng <- mapply(function(x, y, z) paste(x[y:z], collapse = " "), 
                   df$y, df$start_pos, df$end_pos)
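
For anyone who wants to try the base R version without loading the tidyverse, here is a minimal self-contained sketch. It rebuilds the toy data from the question as a plain data.frame with a list column. The helper name slice_paste is hypothetical and only there for readability; it is not part of any package.

```r
# Toy data from the question, rebuilt as a base data.frame.
df <- data.frame(id = 1:4,
                 start_pos = c(2, 1, 3, 2),
                 end_pos = c(2, 3, 5, 3))
# Add the list column afterwards so data.frame() does not try to expand it.
df$y <- list(letters[1:4], letters[2:7], letters[3:10], letters[3:7])

# Hypothetical helper: take elements from:to of one vector and join them.
slice_paste <- function(vec, from, to) paste(vec[from:to], collapse = " ")

# mapply() walks the three columns in parallel; each call returns one string,
# so the result simplifies to a character vector.
df$strng <- mapply(slice_paste, df$y, df$start_pos, df$end_pos)
df$strng
#> [1] "b"     "b c d" "e f g" "d e"
```

Note that from:to counts backwards when start_pos is larger than end_pos, so real data may need a check for that case.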

Upvotes: 2
