Reputation: 93
I would like to know if there is a way to "compact" data frames in R.
I have a change tracker for a document. Currently, my data frame looks like this:
Here there is one change per row, and each change sits in a column according to its section and whether the information was added or removed.
I would like to know if there is any function in R that allows users to "compact" the information in the data frames like so:
Here the information is summarised in each column for each section, making the table a little more human-readable.
Is this possible?
Thank you!
Here is some data to reproduce this case:
DOCUMENT <- c("DOC1", "DOC1", "DOC1", "DOC1", "DOC1", "DOC1")
DATE <- c("day", "day", "day", "day", "day", "day")
SectionA.added <- c("Change 1", "Change2", "change3", NA, NA, NA)
SectionA.deleted <- c(NA, NA, NA, "Change 4", NA, NA)
SectionB.added <- c(NA, NA, NA, NA, "Change5", NA)
SectionB.deleted <- c(NA, NA, NA, NA, NA, NA)
OTHERS <- c(NA, NA, NA, NA, NA, "Change 6")
changes_df <- data.frame(DOCUMENT, DATE, SectionA.added, SectionA.deleted, SectionB.added, SectionB.deleted, OTHERS)
Upvotes: 2
Views: 87
Reputation: 21400
You can use lead() from dplyr to 'move up' values in the columns:
library(dplyr)
df %>%
  mutate(a = lead(a, 1),
         b = lead(b, 3),
         c = lead(c, 2))
  ID  a  b  c
1  1  1  1  2
2  2  2 NA NA
3  3  3 NA NA
4  4 NA NA NA
Data:
df <- data.frame(
  ID = 1:4,
  a = c(NA, 1, 2, 3),
  b = c(NA, NA, NA, 1),
  c = c(NA, NA, 2, NA)
)
EDIT:
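This is a more general solution. It shifts each column up by its number of NAs, so it only works if all the NAs in a column come before the non-missing values (in particular, the last value in each column must never be NA):
df %>%
  mutate(across(a:c, ~lead(., sum(is.na(.)))))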
Data (adapted):
df <- data.frame(
  ID = 1:4,
  a = c(NA, 1, 2, 3),
  b = c(NA, NA, NA, 1),
  c = c(NA, NA, 2, 1)
)
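In the question's changes_df the NAs are not all at the top of each column, so the lead() approach may drop values there. As a rough sketch (not tested beyond the sample data), one option is to push the non-NA values of each column to the top within each DOCUMENT/DATE group and then drop rows that are left empty. It is written in base R so it does not depend on dplyr. The helper names compact_col and compact_group and the variable value_cols are made up for this illustration, and treating every column other than DOCUMENT and DATE as a change column is an assumption:

```r
# Sample data from the question (stringsAsFactors = FALSE keeps text as character)
changes_df <- data.frame(
  DOCUMENT         = rep("DOC1", 6),
  DATE             = rep("day", 6),
  SectionA.added   = c("Change 1", "Change2", "change3", NA, NA, NA),
  SectionA.deleted = c(NA, NA, NA, "Change 4", NA, NA),
  SectionB.added   = c(NA, NA, NA, NA, "Change5", NA),
  SectionB.deleted = c(NA, NA, NA, NA, NA, NA),
  OTHERS           = c(NA, NA, NA, NA, NA, "Change 6"),
  stringsAsFactors = FALSE
)

# Assumption: everything except DOCUMENT and DATE holds changes
value_cols <- setdiff(names(changes_df), c("DOCUMENT", "DATE"))

# Hypothetical helper: non-NA values first, padded with NA to the same length
compact_col <- function(x) c(x[!is.na(x)], rep(NA, sum(is.na(x))))

# Hypothetical helper: compact one DOCUMENT/DATE group, then drop all-NA rows
compact_group <- function(d) {
  d[value_cols] <- lapply(d[value_cols], compact_col)
  d[rowSums(!is.na(d[value_cols])) > 0, , drop = FALSE]
}

# Split by document and date, compact each piece, and stack the results
groups <- split(changes_df,
                list(changes_df$DOCUMENT, changes_df$DATE),
                drop = TRUE)
compacted <- do.call(rbind, lapply(groups, compact_group))
rownames(compacted) <- NULL

print(compacted)
```

Note that this loses the row-wise link between changes: after compacting, values in the same row are no longer related to each other, which seems to be what the question asks for but is worth keeping in mind.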
Upvotes: 2