ah bon

Reputation: 10061

Read multiple headers excel file and melt multiple columns in R

Say I have an Excel table like this:

You can download the example data from this link

I need to melt the 2020-09, 2020-10, and 2020-11 columns into a single date column and extract adj_price for each id and name pair.

How can I convert it to the following data frame in R? Many thanks in advance.


Upvotes: 2

Views: 652

Answers (1)

Paul

Reputation: 2977

I came up with this solution. It could probably be optimized, but the output is what you want, and it should work with any number of date columns as long as each date has three sub-columns.

library(tidyverse)

df1 <- readxl::read_xlsx(path = "path to test_data.xlsx")

# get all dates from the column names
cols <- colnames(df1)[3:ncol(df1)]
dates <- cols[grep("^[0-9][0-9][0-9][0-9]-[0-9][0-9]$", cols)]

# repeat each date 3 times (one per sub-column) and use the result as column names
colnames(df1)[3:ncol(df1)] <- rep(dates, each = 3)


# make a table with id, name and dates

finaldf <- df1[-1,] %>% pivot_longer(cols = 3:last_col(), names_to = "dates", values_to = "values")

# same rows again, this time named after the second header row (the indicator names)
indicators <- df1[-1,]
colnames(indicators) <- c("id", "name", unlist(df1[1, 3:ncol(df1)]))
indicators <- indicators %>% pivot_longer(cols = 3:last_col(), names_to = "indicator", values_to = "values")

# final join and formatting
finaldf <- cbind(finaldf, indicators[, "indicator"]) %>% 
  filter(indicator == "adj_price") %>% 
  select(-indicator) %>% 
  rename("adj_price" = values) %>% 
  mutate(adj_price = as.numeric(adj_price))

The output:

> finaldf
  id             name   dates adj_price
1  1     Stracke-Huel 2020-09      3.80
2  1     Stracke-Huel 2020-10      3.72
3  1     Stracke-Huel 2020-11      3.70
4  2     Gleason-Mann 2020-09      7.25
5  2     Gleason-Mann 2020-10      7.50
6  2     Gleason-Mann 2020-11      7.50
7  3 Bauch-Cartwright 2020-09        NA
8  3 Bauch-Cartwright 2020-10     13.03
9  3 Bauch-Cartwright 2020-11     12.38
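For comparison, here is a minimal sketch of a more compact variant of the same idea: glue the two header rows into one name per column, then let pivot_longer() split them again with the ".value" sentinel. It does not read the Excel file. The `raw` data frame is a toy stand-in for what read_xlsx() might return, and the sub-column names `other_a` and `other_b`, the `"__"` separator, and the position of adj_price within each group are all assumptions, since only adj_price is named in the question. The placeholder sub-columns are left as NA.

```r
library(tidyr)

# Toy stand-in for the read_xlsx() result (an assumption, not the real file):
# the first header row became the column names, with "...N" for the cells
# hidden by merging, and the second header row became the first data row.
raw <- data.frame(
  V1 = c(NA, "1", "2", "3"),
  V2 = c(NA, "Stracke-Huel", "Gleason-Mann", "Bauch-Cartwright"),
  V3 = c("other_a", NA, NA, NA),              # hypothetical sub-column
  V4 = c("other_b", NA, NA, NA),              # hypothetical sub-column
  V5 = c("adj_price", "3.80", "7.25", NA),
  V6 = c("other_a", NA, NA, NA),
  V7 = c("other_b", NA, NA, NA),
  V8 = c("adj_price", "3.72", "7.50", "13.03"),
  stringsAsFactors = FALSE
)
names(raw) <- c("id", "name", "2020-09", "...4", "...5",
                "2020-10", "...7", "...8")

value_cols <- 3:ncol(raw)

# First header row: keep only real dates, then carry each date forward
# over the columns that belong to it.
top <- names(raw)[value_cols]
top[!grepl("^[0-9]{4}-[0-9]{2}$", top)] <- NA
top <- top[cummax(seq_along(top) * !is.na(top))]

# Second header row: the indicator names stored in the first data row.
sub <- unlist(raw[1, value_cols], use.names = FALSE)

# Drop the header row and give every column a "date__indicator" name.
body <- raw[-1, ]
names(body)[value_cols] <- paste(top, sub, sep = "__")

# ".value" turns the indicator part of each name into its own column,
# so every id/name/date combination ends up on one row.
long <- pivot_longer(
  body,
  cols = -c(id, name),
  names_to = c("dates", ".value"),
  names_sep = "__"
)

result <- long[, c("id", "name", "dates", "adj_price")]
result$adj_price <- as.numeric(result$adj_price)
```

Because the dates are carried forward rather than repeated a fixed number of times, this version should not depend on every date having exactly three sub-columns, but it still assumes the date sits in the first column of each group.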

Upvotes: 2
