adding a row to a data frame in long format

Question

Given a data frame df like the one below:

text <- "
parameter,car,qtr,val
a,a3,FY18Q1,23
b,a3,FY18Q1,10000
a,a3,FY18Q2,14
b,a3,FY18Q2,12000
a,cla,FY18Q1,15
b,cla,FY18Q1,12000
c,cla,FY18Q1,5.5
a,cla,FY18Q2,26
b,cla,FY18Q2,10000
c,cla,FY18Q2,6.2
"
df <- read.table(textConnection(text), sep = ",", header = TRUE)

For each car and qtr combination, I want to add a row with parameter b_diff whose val is the difference in parameter b between two consecutive quarters. The quarters in ascending order are FY18Q1, FY18Q2. For the first quarter, FY18Q1, the val for b_diff should be NA because there is no previous quarter.

The expected output is as below.

parameter  car  qtr     val
a          a3   FY18Q1  23
b          a3   FY18Q1  10000
b_diff     a3   FY18Q1  NA
a          a3   FY18Q2  14
b          a3   FY18Q2  12000
b_diff     a3   FY18Q2  2000
a          cla  FY18Q1  15
b          cla  FY18Q1  12000
c          cla  FY18Q1  5.5
b_diff     cla  FY18Q1  NA
a          cla  FY18Q2  26
b          cla  FY18Q2  10000
c          cla  FY18Q2  6.2
b_diff     cla  FY18Q2  -2000

How do I go about doing this with dplyr?

www · Accepted Answer

A solution using dplyr and purrr. We create a group ID for each car and qtr combination with group_indices, compute the b_diff rows from the parameter b rows, split both data frames by the group ID, and then bind the matching pieces back together. df5 is the final output.

library(dplyr)
library(purrr)

df2 <- df %>% mutate(GroupID = group_indices(., car, qtr))

df3 <- df2 %>%
  filter(parameter %in% "b") %>%
  group_by(car) %>%
  mutate(val = val - lag(val), parameter = "b_diff") %>%
  ungroup() %>%
  split(f = .$GroupID)

df4 <- df2 %>% split(f = .$GroupID)

df5 <- map2_dfr(df4, df3, bind_rows) %>% select(-GroupID)

df5
#    parameter car    qtr     val
# 1          a  a3 FY18Q1    23.0
# 2          b  a3 FY18Q1 10000.0
# 3     b_diff  a3 FY18Q1      NA
# 4          a  a3 FY18Q2    14.0
# 5          b  a3 FY18Q2 12000.0
# 6     b_diff  a3 FY18Q2  2000.0
# 7          a cla FY18Q1    15.0
# 8          b cla FY18Q1 12000.0
# 9          c cla FY18Q1     5.5
# 10    b_diff cla FY18Q1      NA
# 11         a cla FY18Q2    26.0
# 12         b cla FY18Q2 10000.0
# 13         c cla FY18Q2     6.2
# 14    b_diff cla FY18Q2 -2000.0
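
For comparison, here is a minimal sketch of the same idea using dplyr alone, without splitting by group ID. It is not part of the original answer: it assumes every car has exactly one b row per quarter and that the quarter labels sort correctly as text (true for FY18Q1, FY18Q2). The names b_diff and result are made up for illustration. It also avoids passing grouping columns to group_indices, which I believe was deprecated in dplyr 1.0.0.

```r
library(dplyr)

df <- read.table(text = "parameter,car,qtr,val
a,a3,FY18Q1,23
b,a3,FY18Q1,10000
a,a3,FY18Q2,14
b,a3,FY18Q2,12000
a,cla,FY18Q1,15
b,cla,FY18Q1,12000
c,cla,FY18Q1,5.5
a,cla,FY18Q2,26
b,cla,FY18Q2,10000
c,cla,FY18Q2,6.2
", sep = ",", header = TRUE, stringsAsFactors = FALSE)

# Build one b_diff row per car/qtr from the b rows.
# lag() returns NA for the first quarter of each car.
b_diff <- df %>%
  filter(parameter == "b") %>%
  arrange(car, qtr) %>%   # assumes quarter labels sort correctly as text
  group_by(car) %>%
  mutate(val = val - lag(val), parameter = "b_diff") %>%
  ungroup()

# Append the new rows, then sort so that b_diff comes last within
# each car/qtr and the other parameters are in alphabetical order.
result <- bind_rows(df, b_diff) %>%
  arrange(car, qtr, parameter == "b_diff", parameter)

result
```

Sorting on parameter == "b_diff" places FALSE before TRUE, which is what pushes the new rows to the end of each group. If the original row order within a group is not alphabetical and must be preserved, the last sort key would need to be replaced with an explicit row index.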

DATA

Note that it is better to read the data with stringsAsFactors = FALSE, so that parameter, car and qtr are character columns rather than factors.

text <- "
parameter,car,qtr,val
a,a3,FY18Q1,23
b,a3,FY18Q1,10000
a,a3,FY18Q2,14
b,a3,FY18Q2,12000
a,cla,FY18Q1,15
b,cla,FY18Q1,12000
c,cla,FY18Q1,5.5
a,cla,FY18Q2,26
b,cla,FY18Q2,10000
c,cla,FY18Q2,6.2
"
df <- read.table(textConnection(text), sep = ",", header = TRUE, stringsAsFactors = FALSE)
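
The effect of stringsAsFactors can be checked directly. The sketch below uses a made-up two-row sample (df_chr and df_fct are illustrative names) and base R only. Since R 4.0.0 the default is already FALSE, so the argument mainly matters on older versions.

```r
sample_text <- "parameter,val
a,1
b,2"

# Text columns stay character
df_chr <- read.table(text = sample_text, sep = ",", header = TRUE,
                     stringsAsFactors = FALSE)

# Text columns are converted to factors
df_fct <- read.table(text = sample_text, sep = ",", header = TRUE,
                     stringsAsFactors = TRUE)

class(df_chr$parameter)
class(df_fct$parameter)
```

With a factor column, a new label such as "b_diff" is not one of the existing levels, which is one reason a character column is easier to work with here.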
