Reputation: 636
What I'm trying to do is collapse my data frame such that each unique date has the corresponding variables associated with it. Here is a sample of my data as it is right now (fyi: the full data set I'm using is around 300 obs. with different dates):
date <- c("10/30/17", "10/30/17", "10/30/17", "10/30/17")
eventcode <- c("14", "14", "14", "14")
eoi145 <- c(1, 0, 0, 0)
eoi140 <- c(0, 1, 0, 0)
eoi141 <- c(0, 0, 0, 1)
eoi143 <- c(0, 0, 1, 0)
df <- data.frame(date, eventcode, eoi145, eoi140, eoi141, eoi143)
View(df)
I want to get into this format:
date <- c("10/30/17")
eventcode <- c("14")
eoi145 <- c(1)
eoi140 <- c(1)
eoi141 <- c(1)
eoi143 <- c(1)
df <- data.frame(date, eventcode, eoi145, eoi140, eoi141, eoi143)
I've tried using cast, melt, and reshape. Can anyone give me a hint as to any packages or techniques to get this accomplished.
Thanks!
Upvotes: 3
Views: 98
Reputation: 5673
An other way with data table
dcast(melt(DT,measure.vars = patterns("^eoi"))[value != 0],date + eventcode ~variable)
date eventcode eoi145 eoi140 eoi141 eoi143
1: 10/30/17 14 1 1 1 1
you melt and take value != 0
melt(DT,measure.vars = patterns("^eoi"))[value == 1]
date eventcode variable value
1: 10/30/17 14 eoi145 1
2: 10/30/17 14 eoi140 1
3: 10/30/17 14 eoi141 1
4: 10/30/17 14 eoi143 1
and then go back to broad again with dcast. The advantage is that it works for almost any situation. Like if you have
eoi143 <- c(0, 0, 1, 1)
date eventcode eoi145 eoi140 eoi141 eoi143
1: 10/30/17 14 1 1 1 2
Upvotes: 1
Reputation: 5358
One approach from the dplyr package:
library(dplyr)
reduced_df <- df %>%
group_by(date, eventcode) %>%
summarise_all(funs(as.integer(sum(.)))) %>%
ungroup()
With output:
# A tibble: 1 x 6
# date eventcode eoi145 eoi140 eoi141 eoi143
# <fct> <fct> <int> <int> <int> <int>
# 10/30/17 14 1 1 1 1
Upvotes: 1