Reputation: 101
I have a data frame that holds the times at which random events occur. I want to subset the first case where either 'place' or 'Show' appears under Event, combined with 'kick' or 'Type' appearing under Event 2. So 'place run' wouldn't satisfy the condition, even though 'place' does appear under Event. By the first case, I mean the first row where the condition holds before Time resets to 0. So for the first segment, the output I want is 27, as this is the first Time value at which the condition is satisfied. For the second segment, I want 16. For the last segment, the output would be 41. (I've put asterisks around the rows that meet the condition so they're easy to locate. They aren't actually present in the data.)
Time Event Event 2
0 Begin NA
23 place run
27 *Show Type*
34 *place kick*
41 good bye
42 *place kick*
0 Begin NA
11 Hat Yellow
13 Show Green
16 *place kick*
20 place hit
29 sign redeem
35 *Show Type*
0 Begin NA
5 Cream Glue
17 Show Green
18 Orange Screen
30 place hit
33 sign redeem
41 *Show Type*
0 Begin NA
...
EDIT: So far, I'm able to subset the rows that have Show Type or place kick with the following code:
Rows <- Data[(Data[,'Event'] == 'Show' & Data[,'Event 2']== 'Type') |
(Data[,'Event'] == 'place' & Data[,'Event 2']== 'kick' ),]
Where I'm struggling is restarting the search for these values each time Time resets to 0. Any help will be greatly appreciated!
Upvotes: 0
Views: 124
Reputation: 263332
The & infix function can be wrapped with the which function to generate a vector of the row numbers where those conditions are met. Then follow that with [1] to get just the first one.
df[ which(df[ , 'Event'] %in% c('place','Show') & df[ ,'Event.2'] %in% c('kick','Type') )[1], ]
Notice that I didn't leave a space between Event and 2, since that would have been parsed by R as two different symbols. The make.names function is used by all the read.* functions (with the default check.names=TRUE) to replace invalid characters in column names with dots, which is why the column is called Event.2 here.
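As a minimal sketch of that renaming (the one-row sample text is made up for illustration, and sep = "," is used only so that the space stays inside the header field):

```r
# make.names() replaces characters that are invalid in R names with "."
fixed <- make.names("Event 2")
print(fixed)

# read.table() applies the same rule to headers by default (check.names = TRUE)
demo <- read.table(text = "Time,Event,Event 2\n0,Begin,NA",
                   header = TRUE, sep = ",")
print(names(demo))

# With check.names = FALSE the original name, space included, is kept
raw <- read.table(text = "Time,Event,Event 2\n0,Begin,NA",
                  header = TRUE, sep = ",", check.names = FALSE)
print(names(raw))
```

If the data were read with check.names = FALSE, the column would have to be addressed as 'Event 2' (quoted or backticked), as in the question's code.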
To make this process reset at each new segment, you would build a segment vector, probably with something like segvec = cumsum(df$Time == 0), and then probably use the split-apply-combine approach to get values just within the resulting subsets.
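To see what such a segment vector looks like, here is a small sketch on a toy Time vector (the values are shortened from the question's data, not the full data set):

```r
# Toy Time column with three segments, each starting at 0
Time <- c(0, 23, 27, 0, 11, 16, 0, 5)

# TRUE at each reset; cumsum() turns the TRUEs into a running segment id
segvec <- cumsum(Time == 0)
print(segvec)  # 1 1 1 2 2 2 3 3

# split() then groups the values by that id
pieces <- split(Time, segvec)
print(names(pieces))
```

Every row between one 0 and the next shares the same id, which is what lets split() cut the data frame at the resets.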
Some lightly tested code:
lapply( split(dat, cumsum(dat[ ,'Time']==0)),
function(df){df[ which(df[ ,'Event'] %in% c('place','Show') &
df[ ,'Event.2'] %in% c('kick','Type') )[1], ]})
#------
$`1`
Time Event Event.2
3 27 Show Type
$`2`
Time Event Event.2
10 16 place kick
$`3`
Time Event Event.2
20 41 Show Type
dput(dat)
structure(list(Time = c(0L, 23L, 27L, 34L, 41L, 42L, 0L, 11L,
13L, 16L, 20L, 29L, 35L, 0L, 5L, 17L, 18L, 30L, 33L, 41L), Event = structure(c(1L,
6L, 7L, 6L, 3L, 6L, 1L, 4L, 7L, 6L, 6L, 8L, 7L, 1L, 2L, 7L, 5L,
6L, 8L, 7L), .Label = c("Begin", "Cream", "good", "Hat", "Orange",
"place", "Show", "sign"), class = "factor"), Event.2 = structure(c(NA,
7L, 9L, 5L, 1L, 5L, NA, 10L, 3L, 5L, 4L, 6L, 9L, NA, 2L, 3L,
8L, 4L, 6L, 9L), .Label = c("bye", "Glue", "Green", "hit", "kick",
"redeem", "run", "Screen", "Type", "Yellow"), class = "factor")), .Names = c("Time",
"Event", "Event.2"), class = "data.frame", row.names = c(NA,
-20L))
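If only the Time values are wanted (27, 16, ... as in the question) rather than whole rows, one possible variation is sketched below. It assumes the Event.2 column name used above; the data are the first two segments from the question plus a made-up third segment with no matching row, added to show what happens when a segment has no hit.

```r
# First two segments from the question, plus a made-up third segment
# that contains no matching row
dat <- data.frame(
  Time    = c(0, 23, 27, 34, 0, 11, 13, 16, 0, 5),
  Event   = c("Begin", "place", "Show", "place",
              "Begin", "Hat", "Show", "place",
              "Begin", "Cream"),
  Event.2 = c(NA, "run", "Type", "kick",
              NA, "Yellow", "Green", "kick",
              NA, "Glue"),
  stringsAsFactors = FALSE
)

# TRUE for rows that satisfy both conditions
hit <- dat$Event %in% c("place", "Show") & dat$Event.2 %in% c("kick", "Type")

# Segment id: increases by one each time Time resets to 0
seg <- cumsum(dat$Time == 0)

# First matching Time per segment; keeping all segment levels means a
# segment without a match yields an empty vector, and [1] on that gives NA
first_time <- sapply(split(dat$Time[hit], factor(seg[hit], levels = unique(seg))),
                     function(x) x[1])
print(first_time)
```

The NA for the third segment mirrors the lapply version above, where which(...)[1] is NA for a segment without a match and the result is a row of NAs.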
Upvotes: 3
Reputation: 78792
Far less succinct (and probably less optimal) than 42-'s, but:
library(stringi)
read.table(text="Time Event Event2
0 Begin NA
23 place run
27 *Show Type*
34 *place kick*
41 good bye
42 *place kick*
0 Begin NA
11 Hat Yellow
13 Show Green
16 *place kick*
20 place hit
29 sign redeem
35 *Show Type*
0 Begin NA
5 Cream Glue
17 Show Green
18 Orange Screen
30 place hit
33 sign redeem
41 *Show Type*
0 Begin NA", header=TRUE, stringsAsFactors=FALSE) -> df
library(dplyr)
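# Flag each row where Time resets to 0, then accumulate the flags into a segment id
df$grp <- 0
df[which(df$Time == 0),]$grp <- 1
df$grp <- cumsum(df$grp)
# Within each segment, keep rows matching both patterns and take the first one
group_by(df, grp) %>%
filter(grepl("place|show", Event, ignore.case=TRUE) & grepl("kick|type", Event2, ignore.case=TRUE)) %>%
slice(1) %>%
select(-grp)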
## Source: local data frame [3 x 4]
## Groups: grp [3]
##
## grp Time Event Event2
## <dbl> <int> <chr> <chr>
## 1 1 27 *Show Type*
## 2 2 16 *place kick*
## 3 3 41 *Show Type*
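Note that grepl() matches substrings, which is why the filter above tolerates the asterisks in the pasted data. It would equally accept longer values that merely contain the words. If exact matches are wanted instead, anchoring the pattern is one option; a small sketch with made-up values:

```r
# Made-up values to show the difference
events <- c("place", "*place", "replace", "Show", "showcase")

# Unanchored, case-insensitive: any value containing one of the words matches
loose <- grepl("place|show", events, ignore.case = TRUE)
print(loose)

# Anchored: the whole value must be exactly one of the words
strict <- grepl("^(place|show)$", events, ignore.case = TRUE)
print(strict)
```

With the anchored pattern the asterisks would have to be stripped from the data first, since "*place" no longer matches.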
Upvotes: 0