Paul

Reputation: 2959

How to attach a prefix to `.value` when using `pivot_longer()` and `names_pattern`

I would like to use pivot_longer() from {tidyr} with names_pattern to convert my data to long format while keeping the prefix string from one of the pattern matches in the column names.

This may seem counter-intuitive, but I want to convert to long format before applying data dictionary cleaning steps, which require the original column names.

Set-up

library(dplyr)
library(tidyr)

d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d") 

What I've done

> pivot_longer(d,
               starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", ".value"))

# A tibble: 2 x 5
     id other_var rep   value1 value2
  <dbl> <chr>     <chr> <chr>  <chr> 
1     1 foo       1     a      b     
2     1 foo       2     c      d    

Desired output

# A tibble: 2 x 5
     id other_var rep   suffix_t1_value1 suffix_t1_value2
  <dbl> <chr>     <chr> <chr>            <chr>           
1     1 foo       1     a                b               
2     1 foo       2     c                d               

What I've tried

Attempt 1

> pivot_longer(d,
               starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", "suffix_t1_{.value}"))

Attempt 2

> pivot_longer(d,
               starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", paste0("suffix_t1_", ".value")))

Upvotes: 5

Views: 674

Answers (1)

TimTeaFan

Reputation: 18561

I assume you want to do this in a single step within pivot_longer(). I haven't figured out yet whether that's possible, but if a two-step process is acceptable, the approach below should work:

library(dplyr)
library(tidyr)

d %>% pivot_longer(starts_with("suffix"),
                   names_pattern = "suffix_t(1|2)_(.*)",
                   names_to = c("rep", ".value")
             ) %>% 
  rename_with(~ gsub("(.*)", "suffix_t1_\\1", .x),
              starts_with("value"))

#> # A tibble: 2 x 5
#>      id other_var rep   suffix_t1_value1 suffix_t1_value2
#>   <dbl> <chr>     <chr> <chr>            <chr>           
#> 1     1 foo       1     a                b               
#> 2     1 foo       2     c                d

Created on 2021-06-09 by the reprex package (v0.3.0)
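
Since only a fixed prefix is being added, the regular expression is not strictly needed. A minimal sketch of the same two-step approach with paste0() in place of gsub() follows; the object name res_paste is only illustrative, and d is recreated so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d")

res_paste <- d %>%
  pivot_longer(starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", ".value")) %>%
  # Prepend the fixed prefix to every column whose name starts with "value"
  rename_with(~ paste0("suffix_t1_", .x), starts_with("value"))

names(res_paste)
```

This should give the same column names as the gsub() version above.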


Update

After digging into pivot_longer() a bit, I don't think it is possible to access .value within paste(), and the glue syntax {.value} does not seem to be supported either.
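
To illustrate why the paste0() attempt does not behave like ".value": as far as I can tell, the string is built before pivot_longer() ever sees it, and only an element that is exactly ".value" is treated as the special sentinel. The sketch below, with the illustrative names names_to and res_attempt, shows that the pasted string simply becomes an ordinary column holding the matched name parts.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d")

# paste0() is evaluated first, so this is just a character vector
names_to <- c("rep", paste0("suffix_t1_", ".value"))
names_to

# The second element is no longer the ".value" sentinel, so the data are
# pivoted into a name column plus the default "value" column
res_attempt <- pivot_longer(d,
                            starts_with("suffix"),
                            names_pattern = "suffix_t(1|2)_(.*)",
                            names_to = names_to)
names(res_attempt)
```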

However, {tidyr} offers the building blocks for pivoting, build_longer_spec() and pivot_longer_spec(), which allow us to create our own my_pivot_longer() function. There we can include a names_fn argument that applies a function to the new column names, and here we could use gsub() to add a prefix or suffix.

my_pivot_longer <- function(data,
                            cols,
                            names_to = "name",
                            names_pattern = NULL,
                            names_fn = NULL) {

  # Embrace `cols` so the tidyselect expression is forwarded as written;
  # passing bare `cols` triggers an "external vector" message
  spec <- build_longer_spec(data,
                            {{ cols }},
                            names_pattern = names_pattern,
                            names_to = names_to)

  # Optionally transform the names of the new value columns
  if (!is.null(names_fn)) {
    fn <- rlang::as_function(names_fn)
    spec$.value <- fn(spec$.value)
  }

  pivot_longer_spec(data, spec)
}

d %>% 
  my_pivot_longer(starts_with("suffix"),
                  names_pattern = "suffix_t(1|2)_(.*)",
                  names_to = c("rep", ".value"),
                  names_fn = ~ gsub("(.*)", "suffix_t1_\\1", .x))
#> # A tibble: 2 x 5
#>      id other_var rep   suffix_t1_value1 suffix_t1_value2
#>   <dbl> <chr>     <chr> <chr>            <chr>           
#> 1     1 foo       1     a                b               
#> 2     1 foo       2     c                d

Created on 2021-06-09 by the reprex package (v0.3.0)
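
To see why editing the spec works, it can help to look at the intermediate object. The sketch below builds the spec by hand, prints it, and changes its .value column before pivoting; the names spec and res_spec are only illustrative, and paste0() stands in for the gsub() call used above.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d")

# One row per input column: .name is the original column name,
# .value the target column, rep the captured group
spec <- build_longer_spec(d,
                          starts_with("suffix"),
                          names_pattern = "suffix_t(1|2)_(.*)",
                          names_to = c("rep", ".value"))
spec

# Renaming the targets here changes the output column names
spec$.value <- paste0("suffix_t1_", spec$.value)

res_spec <- pivot_longer_spec(d, spec)
names(res_spec)
```

Working on the spec directly avoids the wrapper function when the renaming is only needed once.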

Upvotes: 3
