Reputation: 10061
You can download the example data from this link
I need to melt 2020-09, 2020-10, 2020-11 as date and extract adj_price for each pair of id and name.
How could I convert it to the dataframe as follows in R? Many thanks in advance.
Upvotes: 2
Views: 652
Reputation: 2977
I came up with this solution. It could probably be optimized, but the output is what you want and it should work with any number of date columns.
library(tidyverse)
df1 <- readxl::read_xlsx(path = "path to test_data.xlsx")
# get all dates from the column names
cols <- colnames(df1)[3:ncol(df1)]
dates <- cols[grep("^[0-9][0-9][0-9][0-9]-[0-9][0-9]$", cols)]
# repeat each date three times (one per indicator column) and use the result as column names
colnames(df1)[3:ncol(df1)] <- rep(dates, each = 3)
# make a table with id, name and dates (the first data row holds the indicator names, so drop it)
finaldf <- df1[-1, ] %>%
  pivot_longer(cols = 3:last_col(), names_to = "dates", values_to = "values")
# same reshape, but with the indicator names from the first data row as column names
indicators <- df1[-1, ]
colnames(indicators) <- c("id", "name", unlist(df1[1, 3:ncol(df1)]))
indicators <- indicators %>%
  pivot_longer(cols = 3:last_col(), names_to = "indicator", values_to = "values")
# final join and formatting: both tables are in the same row order, so bind them side by side
finaldf <- cbind(finaldf, indicators[, "indicator"]) %>%
  filter(indicator == "adj_price") %>%
  select(-indicator) %>%
  rename("adj_price" = values) %>%
  mutate(adj_price = as.numeric(adj_price))
The output:
> finaldf
  id             name   dates adj_price
1  1     Stracke-Huel 2020-09      3.80
2  1     Stracke-Huel 2020-10      3.72
3  1     Stracke-Huel 2020-11      3.70
4  2     Gleason-Mann 2020-09      7.25
5  2     Gleason-Mann 2020-10      7.50
6  2     Gleason-Mann 2020-11      7.50
7  3 Bauch-Cartwright 2020-09        NA
8  3 Bauch-Cartwright 2020-10     13.03
9  3 Bauch-Cartwright 2020-11     12.38
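For comparison, here is a minimal sketch of a possible variant that combines the two header rows into one name per column and then reshapes in a single pivot_longer call. It runs on a small in-memory stand-in rather than the real file, so several things are assumptions: the sheet layout (dates in the first header row with blank cells to their right, indicator names in the second), the placeholder indicator names ind_a and ind_c, and the position of adj_price within each group of three. Only the adj_price values are taken from the output above; every other cell is left as NA.

```r
library(tidyr)

# Toy stand-in for the two header rows of the sheet (layout is an assumption).
# Blank cells to the right of each date come back as NA.
header_dates <- c("id", "name", "2020-09", NA, NA, "2020-10", NA, NA, "2020-11", NA, NA)
# "ind_a" and "ind_c" are hypothetical placeholder names for the other indicators.
header_inds <- c(NA, NA, rep(c("ind_a", "adj_price", "ind_c"), times = 3))

# Data rows: only the adj_price cells are filled (values from the output above).
body <- rbind(
  c("1", "Stracke-Huel",     NA, "3.80", NA, NA, "3.72",  NA, NA, "3.70",  NA),
  c("2", "Gleason-Mann",     NA, "7.25", NA, NA, "7.50",  NA, NA, "7.50",  NA),
  c("3", "Bauch-Cartwright", NA, NA,     NA, NA, "13.03", NA, NA, "12.38", NA)
)

# Carry each date forward over the blank header cells to its right.
dates <- header_dates[-(1:2)]
filled <- dates[cummax(seq_along(dates) * !is.na(dates))]

# Build one "date|indicator" name per value column.
wide <- as.data.frame(body, stringsAsFactors = FALSE)
names(wide) <- c("id", "name", paste(filled, header_inds[-(1:2)], sep = "|"))

# Split the names again while pivoting: the date part becomes a column,
# and ".value" turns each indicator into its own column.
long <- pivot_longer(
  wide,
  cols = -c(id, name),
  names_to = c("date", ".value"),
  names_sep = "\\|"
)

# Keep only the requested indicator and convert it to numeric.
result <- long[, c("id", "name", "date", "adj_price")]
result$adj_price <- as.numeric(result$adj_price)
```

The "|" separator is used because adj_price itself contains an underscore, which would otherwise split into too many pieces. With the real file, the two header vectors would have to be read from the sheet first, which this sketch does not cover.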
Upvotes: 2