Shinobi_Atobe
Shinobi_Atobe

Reputation: 1983

Reading data with multiple column headers

I have a csv file with 2 column headers, like so:

tibble(brand_1 = c("region_1", 12, 4),
       brand_2 = c("region_1", 8, 10),
       brand_3 = c("region_2", 7, 12))
      brand_1  brand_2  brand_3 
      <chr>    <chr>    <chr>   
    1 region_1 region_1 region_2
    2 12       8        7       
    3 4        10       12  

What's the best way to transform that into long format, like so:

# A tibble: 6 x 3
  brand   region   value
  <chr>   <chr>    <dbl>
1 brand_1 region_1    12
2 brand_1 region_1    14
3 brand_2 region_1     8
4 brand_2 region_1    10
5 brand_3 region_2     7
6 brand_3 region_2    12

Upvotes: 0

Views: 53

Answers (1)

Maurits Evers
Maurits Evers

Reputation: 50738

Not necessarily pretty but it gets the job done

library(tidyverse)
df %>%
    magrittr::set_colnames(paste(names(.), .[1, ], sep = ";")) %>%
    filter(row_number() > 1) %>%
    gather(key, value) %>%
    separate(key, c("brand", "region"), sep = ";")
## A tibble: 6 x 3
#  brand   region   value
#  <chr>   <chr>    <chr>
#1 brand_1 region_1 12
#2 brand_1 region_1 4
#3 brand_2 region_1 8
#4 brand_2 region_1 10
#5 brand_3 region_2 7
#6 brand_3 region_2 12

Explanation: The idea is to concatenate the two column headers, then reshape the data from wide to long, and finally to separate the concatenated header entries with separate.


Sample data

df <- tibble(brand_1 = c("region_1", 12, 4),
       brand_2 = c("region_1", 8, 10),
       brand_3 = c("region_2", 7, 12))

Upvotes: 2

Related Questions