Jazzmatazz
Jazzmatazz

Reputation: 645

Pivot Longer with Names Suffix?

I've really enjoyed using pivot_longer so far. Is there a way to take the suffix of my columns as part of the pivot_longer command? The function has a names_prefix argument but doesn't appear to allow you to use the suffix.

data <- tibble::tribble(
   ~last_name, ~first_name, ~pitcher, ~ff_avg_spin, ~si_avg_spin, ~fc_avg_spin, ~sl_avg_spin, ~ch_avg_spin, ~cu_avg_spin, ~fs_avg_spin,
      "Bauer",    "Trevor",   545333,         2286,         2276,         2539,         2687,         1441,         2464,           NA,
      "Rodon",    "Carlos",   607074,         2148,         2211,         2153,         2465,         1725,         2457,         2630,
  "Verlander",    "Justin",   434378,         2583,           NA,         2595,         2626,         1870,         2796,           NA
  )


data_long <- data %>% 
  pivot_longer(
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin",
    values_drop_na = TRUE
  )

How can I take the pitch_type columns to only list the text that comes before _avg_spin? I.E. ff, si, fc, etc. Ideally, I want that text to be capitalized, but I can fix that using a mutate piped after pivot_longer

Upvotes: 8

Views: 6394

Answers (2)

HBat
HBat

Reputation: 5692

You can use names_pattern = "(.*)_avg_spin" to get rid of "_avg_spin" suffix.

data <- tibble::tribble(
   ~last_name, ~first_name, ~pitcher, ~ff_avg_spin, ~si_avg_spin, ~fc_avg_spin, ~sl_avg_spin, ~ch_avg_spin, ~cu_avg_spin, ~fs_avg_spin,
      "Bauer",    "Trevor",   545333,         2286,         2276,         2539,         2687,         1441,         2464,           NA,
      "Rodon",    "Carlos",   607074,         2148,         2211,         2153,         2465,         1725,         2457,         2630,
  "Verlander",    "Justin",   434378,         2583,           NA,         2595,         2626,         1870,         2796,           NA
  )


data %>% 
  pivot_longer(
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin",
    values_drop_na = TRUE,
    names_pattern = "(.*)_avg_spin"
  )
#> # A tibble: 18 x 5
#>    last_name first_name pitcher pitch_type avg_spin
#>    <chr>     <chr>        <dbl> <chr>         <dbl>
#>  1 Bauer     Trevor      545333 ff             2286
#>  2 Bauer     Trevor      545333 si             2276
#>  3 Bauer     Trevor      545333 fc             2539
#>  4 Bauer     Trevor      545333 sl             2687
#>  5 Bauer     Trevor      545333 ch             1441
#>  6 Bauer     Trevor      545333 cu             2464
#>  7 Rodon     Carlos      607074 ff             2148
#>  8 Rodon     Carlos      607074 si             2211
#>  9 Rodon     Carlos      607074 fc             2153
#> 10 Rodon     Carlos      607074 sl             2465
#> 11 Rodon     Carlos      607074 ch             1725
#> 12 Rodon     Carlos      607074 cu             2457
#> 13 Rodon     Carlos      607074 fs             2630
#> 14 Verlander Justin      434378 ff             2583
#> 15 Verlander Justin      434378 fc             2595
#> 16 Verlander Justin      434378 sl             2626
#> 17 Verlander Justin      434378 ch             1870
#> 18 Verlander Justin      434378 cu             2796

Upvotes: 8

iago
iago

Reputation: 3256

If I do not understand wrongly your question, the answer is using pivot_longer_spec:

library(dplyr)
library(tidyr)
library(magrittr)
data %T>% 
  {nspec <<- build_longer_spec(.,
    cols = contains("spin"), 
    names_to = "pitch_type",
    values_to = "avg_spin"
  ) %>%
    mutate(pitch_type = sub("_spin", "", pitch_type))} %>%
  pivot_longer_spec(spec = nspec, values_drop_na = TRUE)
  

Upvotes: 0

Related Questions