Nadine Hussein
Nadine Hussein

Reputation: 25

Pivot certain rows within a data frame into columns in R?

I have a data set that looks like the following:

data
# A tibble: 12 × 3
    Number Letter      Identifier   
    <chr>  <chr>       <chr>
  1 1      A           AB   
  2 2      B           BC   
  3 <NA>   <NA>        <NA> 
  4 Site   Bow River   <NA> 
  5 Date   2020-06-01  <NA> 
  6 <NA>   <NA>        <NA> 
  7 Number Letter      Identifier   
  8 3      C           CD   
  9 4      D           DE   
 10 <NA>   <NA>        <NA> 
 11 Site   Elbow River <NA> 
 12 Date   2019-05-01  <NA>

I want to be able to subset and pivot the rows with Site and Date into columns and fill downwards in order to get it to look like this:

# A tibble: 4 × 5
  Number Letter Identifer Site        Date      
  <chr>  <chr>  <chr>     <chr>       <chr>     
1 1      A      AB        Bow River   2020-06-01
2 2      B      BC        Bow River   2020-06-01
3 3      C      CD        Elbow River 2019-05-01
4 4      D      DE        Elbow River 2019-05-01

I had no luck using pivot_longer or pivot_wider since those rows with Site and Date are in the middle of the data set and separating it in two.

I'm mostly familiar with dplyr code but tried to use base R to do it as well with no luck.

Sorry about the formatting, it's my first time posting to Stack.

Here's the reprex:

data <- tibble::tribble(~Number, ~Letter, ~Identifer, 
                "1", "A", "AB",
                "2", "B", "BC",
                NA_character_, NA_character_, NA_character_,
                "Site", "Bow River", NA_character_,
                "Date", "2020-06-01", NA_character_,
                NA_character_, NA_character_, NA_character_,
                "Number", "Letter", "Identifier",
                "3", "C", "CD",
                "4", "D", "DE",
                NA_character_, NA_character_, NA_character_,
                "Site", "Elbow River", NA_character_,
                "Date", "2020-05-01", NA_character_)

Upvotes: 1

Views: 141

Answers (1)

Jon Spring
Jon Spring

Reputation: 66415

Here's a way, probably could be made more concise.

I'm using each occasion where it says Number in the Number column as a delimiter for a new group, and using that to define the fill range for values put into the Site and Date columns.

library(tidyverse)
data %>%
  mutate(Site = if_else(Number == "Site", Letter, NA_character_),
         Date = if_else(Number == "Date", Letter, NA_character_),
         group = cumsum(coalesce(Number == "Number", 0))) %>%
  group_by(group) %>%
  fill(Site, Date, .direction = "downup") %>%
  ungroup() %>%
  filter(!is.na(Identifer), Number != "Number") %>% 
  select(-group) 

Result

# A tibble: 4 × 5
  Number Letter Identifer Site        Date      
  <chr>  <chr>  <chr>     <chr>       <chr>     
1 1      A      AB        Bow River   2020-06-01
2 2      B      BC        Bow River   2020-06-01
3 3      C      CD        Elbow River 2020-05-01
4 4      D      DE        Elbow River 2020-05-01

Upvotes: 1

Related Questions