Reputation: 77
I have a dataframe with two columns: Tick and Bid. This is tick data for a financial instrument. If the Tick column has a value price has moved up/down. The Bid column holds the current bid price. Here's an example:
Test <- data.frame(Tick=c(NA, NA, NA ,NA, NA , "1", NA, NA, NA,
NA, NA, "-1",NA, NA),
Bid=c("393.75", "393.75" ,"393.75" ,"394", "394",
NA,"394" ,"394","394","394","393.75",
NA, "393.75", "393.75"))
I want to create a variable I'll call Best_Bid_Ex
that holds the value of the first price after a tick event occurs (is.na(Test$Tick == FALSE)
) and resets with the new bid price with the next event.
Test$Best_Bid_Ex <- c("393.75", "393.75" ,"393.75" ,"393.75",
"393.75", NA,"394" ,"394","394","394","394", NA, "393.75", "393.75")
I tried different indexing techniques but I can't get anything to work.
Upvotes: 2
Views: 54
Reputation: 887901
In the devel version of 'data.table', there is a convenient function rleid
for generating a run-length_type_id
column for grouping purpose.
We convert the 'data.frame' to 'data.table' (setDT(Test)
). Apply rleid
on the logical index (!is.na(Tick)
) to get numeric group for each set of 'TRUE/FALSE/TRUE' groups and then create the 'Best_Bid_Ex' as the first value of 'Bid' column (Bid[1L]
) grouped by the numeric grouping from rleid
.
library(data.table)#v1.9.5+
setDT(Test)[, Best_Bid_Ex:= Bid[1L], rleid(!is.na(Tick))]
Test
# Tick Bid Best_Bid_Ex
# 1: NA 393.75 393.75
# 2: NA 393.75 393.75
# 3: NA 393.75 393.75
# 4: NA 394 393.75
# 5: NA 394 393.75
# 6: 1 NA NA
# 7: NA 394 394
# 8: NA 394 394
# 9: NA 394 394
#10: NA 394 394
#11: NA 393.75 394
#12: -1 NA NA
#13: NA 393.75 393.75
#14: NA 393.75 393.75
Instructions to install the devel version are here
Upvotes: 3