Parse data frame column name, and pivot data frame R

Question

When executing the following reproducible example code,

value = c(1:5)
name = c('A','B','C','D','F')
`2015.cost` = c(100,200,250,300,450)
`2016.cost` = c(200,300,350,400,550)
`2017.cost` = c(300,400,450,500,650)
`2015.profit` = c(1000,4200,2450,1500,7650)
`2016.profit` = c(1300,4300,3450,5100,6850)
`2017.profit` = c(1300,4400,6450,5001,6500)
df <- data.frame(value,name,`2015.cost`,`2016.cost`,`2017.cost`,`2015.profit`,`2016.profit`,`2017.profit`)

I create the following data frame:

structure(list(value = 1:5, name = structure(1:5, .Label = c("A", 
"B", "C", "D", "F"), class = "factor"), X2015.cost = c(100, 200, 
250, 300, 450), X2016.cost = c(200, 300, 350, 400, 550), X2017.cost = c(300, 
400, 450, 500, 650), X2015.profit = c(1000, 4200, 2450, 1500, 
7650), X2016.profit = c(1300, 4300, 3450, 5100, 6850), X2017.profit = c(1300, 
4400, 6450, 5001, 6500)), .Names = c("value", "name", "X2015.cost", 
"X2016.cost", "X2017.cost", "X2015.profit", "X2016.profit", "X2017.profit"
), row.names = c(NA, -5L), class = "data.frame")

I essentially want to be able to parse out the years in my cost (at a delimiter such as that '.') and profit column names and then pivot the data frame from this wide format to long, so that I have years all in a column and cost, profit values as another column. My goal is to have a data frame like so:

value name  year    cost    profit
1    A   2015    100    1000
2    B   2015    200    4200
3    C   2015    250    2450
4    D   2015    300    1500
5    E   2015    450    7650
1    A   2016    200    1300
2    B   2016    300    4300
3    C   2016    350    3450
4    D   2016    400    5100
5    E   2016    550    6850
1    A   2017    300    1300
2    B   2017    400    4400
3    C   2017    450    6450
4    D   2017    500    5001
5    E   2017    650    6500

Any help would be appreciated

Matthew Plourde · Accepted Answer

Here's one way with dplyr and tidyr

library(dplyr)
library(tidyr)
df %>%
    gather('col', 'val', X2015.cost:X2017.profit) %>%
    separate(col, c('year', 'col_type'), sep='\.') %>%
    mutate(year=extract_numeric(year)) %>%
    spread(col_type, val)

Parse data frame column name, and pivot data frame R

Answers (2)

Related Questions