Reputation: 12719
A recent SO answer, shamelessly copied, used tidyr::pivot_longer to reshape six variables into three.
I can understand the logic for all the pivot_longer arguments except for the '.value' entry in names_to.
I can work out what it does: it creates the new variable names based on the first bracketed regex expression in the names_pattern argument.
My question is: how does '.value' work?
I can see it is used in the pivot_longer function examples section for "Multiple observations per row"; but no explanation is given in the example.
It feels as if it could be a regex option, where . matches any character except \n; or is it a 'pronoun' type of output, which seems to be common in the 'tidyverse', meaning something like 'the output or value of the regex expression'?
Any guidance or pointers where to find information on how to understand the intricacies of pivot_longer would be appreciated.
Or is it just a case of experimenting with the function and understanding what it does by doing?
Link to original question: pivot longer with multiple columns and values
library(tibble)
library(tidyr)

tib <- tibble(type = c(1L, 1L, 1L, 2L, 2L, 2L),
              id = c(1L, 2L, 3L, 1L, 2L, 3L),
              age2000 = c(20L, 35L, 24L, 32L, 66L, 14L),
              age2001 = c(21L, 36L, 25L, 33L, 67L, 15L),
              age2002 = c(22L, 37L, 26L, 34L, 68L, 16L),
              bool2000 = c(1L, 2L, 1L, 2L, 2L, 1L),
              bool2001 = c(1L, 2L, 1L, 2L, 2L, 1L),
              bool2002 = c(1L, 2L, 1L, 2L, 2L, 1L))

pivot_longer(tib,
             cols = -c(id, type),
             names_to = c('.value', 'year'),
             names_pattern = '([a-z]+)(\\d+)')
Upvotes: 2
Views: 1239
Reputation: 13319
From the source code, .value sets values_to to NULL, so that the output columns are not named by values_to but take their names from the original column names themselves.
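A small sketch of that difference, using a cut-down version of the question's data (two ids, two years). The column name `variable` and the object names `plain` and `wide` are illustrative choices, not anything from the original answer. Without `.value`, the cell values all land in the single column named by `values_to`; with `.value`, the first regex group supplies the output column names instead.

```r
library(tibble)
library(tidyr)

# Cut-down version of the question's data (illustrative values).
tib <- tibble(id = 1:2,
              age2000 = c(20L, 35L), age2001 = c(21L, 36L),
              bool2000 = c(1L, 2L), bool2001 = c(1L, 2L))

# Without .value: both regex groups become ordinary columns, and every
# cell value goes into the column named by values_to ("value" by default).
plain <- pivot_longer(tib, cols = -id,
                      names_to = c("variable", "year"),
                      names_pattern = "([a-z]+)(\\d+)")
names(plain)  # "id" "variable" "year" "value"

# With .value: the first regex group ("age", "bool") names the output
# columns, so values_to is not used at all.
wide <- pivot_longer(tib, cols = -id,
                     names_to = c(".value", "year"),
                     names_pattern = "([a-z]+)(\\d+)")
names(wide)   # "id" "year" "age" "bool"
```

The first call gives one row per id, variable and year; the second gives one row per id and year, with `age` and `bool` side by side.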
If you look at this line:
if (".value" %in% names_to) {
  values_to <- NULL
}
Then:
  out <- tibble(.name = cols)
  out[[".value"]] <- values_to
  out <- vec_cbind(out, names)
  out
}
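Because values_to is NULL at this point, the assignment to out[[".value"]] adds nothing; the .value column instead comes from names, that is, from the pieces that names_pattern extracts from the selected columns (everything except id and type). Since the names are in the format age2000, names_pattern breaks age2000, for instance, into age and 2000: the latter goes into year, while .value ensures the former (age here) becomes the name of the output column.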
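One way to inspect this mapping directly is tidyr's exported build_longer_spec(), which returns the intermediate "spec" table that the excerpt above assembles. The sketch below assumes tidyr 1.0.0 or later and uses a cut-down version of the question's data; the object name `spec` is an illustrative choice.

```r
library(tibble)
library(tidyr)

# Cut-down version of the question's data (illustrative values).
tib <- tibble(id = 1:2,
              age2000 = c(20L, 35L), age2001 = c(21L, 36L),
              bool2000 = c(1L, 2L), bool2001 = c(1L, 2L))

# Build the spec that pivot_longer() would use internally.
spec <- build_longer_spec(tib, cols = -id,
                          names_to = c(".value", "year"),
                          names_pattern = "([a-z]+)(\\d+)")

# One row per input column:
#   .name  = the original column name (age2000, age2001, ...)
#   .value = the first regex group, i.e. the output column it feeds
#   year   = the second regex group
spec

# The same reshaping, driven explicitly by the spec.
pivot_longer_spec(tib, spec)
```

Reading the spec row by row shows why `.value` is a special sentinel name rather than a regex option: it marks which piece of each column name becomes an output column name.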
Upvotes: 2