Reputation: 425
Let's say that I have the following data and am interested in grabbing data by date where the type is "ts". Of course, there are dates where ts is not available, and I need to revert to the 'real' values for those dates.
dat = data.frame(dte = c("2011-01-01","2011-02-01","2011-03-01","2011-04-01","2011-05-01",
"2011-01-01","2011-02-01","2011-03-01"),
type = c("real","real","real","real","real","ts","ts","ts"),
value=rnorm(8))
dat
cpy = dat %>% dplyr::filter(type == "ts")
cpy
How can something like that be done in dplyr.
Expected output is:
dte type value
"2011-01-01" ts ....
"2011-02-01" ts
"2011-03-01" ts
"2011-04-01" real
"2011-05-01" real
Upvotes: 2
Views: 9295
Reputation: 887153
Using dplyr
, we can also use which.max
library(dplyr)
dat %>%
group_by(dte) %>%
slice(which.max(factor(type)))
# dte type value
# <fctr> <fctr> <dbl>
#1 2011-01-01 ts -0.5052456
#2 2011-02-01 ts -0.4038810
#3 2011-03-01 ts -1.5349627
#4 2011-04-01 real 1.6812035
#5 2011-05-01 real -0.9902754
Or using a similar option with data.table
library(data.table)
setDT(dat)[, .SD[which.max(factor(type))] , dte]
# dte type value
#1: 2011-01-01 ts -0.5052456
#2: 2011-02-01 ts -0.4038810
#3: 2011-03-01 ts -1.5349627
#4: 2011-04-01 real 1.6812035
#5: 2011-05-01 real -0.9902754
Upvotes: 0
Reputation: 21621
One idea could be to group_by()
date and keep values where type == "ts"
or when, for a given date, there are no type == "ts"
, keep the other value:
dat %>%
group_by(dte) %>%
filter(type == "ts" | !any(type == "ts"))
Which gives:
#Source: local data frame [5 x 3]
#Groups: dte [5]
#
# dte type value
# <fctr> <fctr> <dbl>
#1 2011-04-01 real 0.2522234
#2 2011-05-01 real -0.8919211
#3 2011-01-01 ts 0.4356833
#4 2011-02-01 ts -1.2375384
#5 2011-03-01 ts -0.2242679
Upvotes: 4
Reputation: 388982
You can try with base packages,
rbind(dat[dat$type == "ts",], dat[!unique(dat$dte) %in%
dat[dat$type == "ts","dte"], ])
# dte type value
#6 2011-01-01 ts -0.98109206
#7 2011-02-01 ts 1.67626166
#8 2011-03-01 ts -0.06997343
#4 2011-04-01 real 1.27243996
#5 2011-05-01 real -1.63594680
Taking the rows with type
equal to ts
and rbind
ing the remaining dates from the real
type.
Upvotes: 3