Chen Hobbit
Chen Hobbit

Reputation: 97

Extract the rows based on the specific values in the column by time

I have a data frame looks like :

Date v1 v2
2000/8/10 S 2600
2000/8/10 S 500
2000/8/10 S 3500
2001/11/12 B 2000
2001/11/22 B 1000
2001/11/22 S 1000
2001/12/06 B 1000
2001/12/06 B 1500
2001/12/06 B 1000
2001/12/07 S 2000
2001/12/07 B 4000

However, I'd like to extract the rows which has "B" and "S" in the same date It looks as below :

Date v1 v2
2001/11/22 B 1000
2001/11/22 S 1000
2001/12/07 S 2000
2001/12/07 B 4000

Does anyone have ideas? That would be grateful ! Cheers

Upvotes: 0

Views: 522

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 389325

You can select the Date which has both 'S' and 'B' in them.

library(dplyr)

df %>%
  group_by(Date) %>%
  filter(all(c('S', 'B') %in% v1)) %>%
  ungroup

#  Date       v1       v2
#  <chr>      <chr> <int>
#1 2001/11/22 B      1000
#2 2001/11/22 S      1000
#3 2001/12/07 S      2000
#4 2001/12/07 B      4000

This logic can also be implemented in base R and data.table :

#Base R
subset(df, as.logical(ave(v1, Date, FUN = function(x) all(c('S', 'B') %in% x))))

#data.table
library(data.table)
setDT(df)[, .SD[all(c('S', 'B') %in% v1)], Date]

data

df <- structure(list(Date = c("2000/8/10", "2000/8/10", "2000/8/10", 
"2001/11/12", "2001/11/22", "2001/11/22", "2001/12/06", "2001/12/06", 
"2001/12/06", "2001/12/07", "2001/12/07"), v1 = c("S", "S", "S", 
"B", "B", "S", "B", "B", "B", "S", "B"), v2 = c(2600L, 500L, 
3500L, 2000L, 1000L, 1000L, 1000L, 1500L, 1000L, 2000L, 4000L
)), row.names = c(NA, -11L), class = "data.frame")

Upvotes: 2

Related Questions