Reputation: 909
I'm working with tick data and would like some basic information about the distribution of the changes in tick prices. My database contains tick data covering 10 trading days. I've taken the first difference of the tick prices:
Tick spread
2010-02-02 08:00:04 -1
2010-02-02 08:00:04 1
2010-02-02 08:00:04 0
2010-02-02 08:00:04 0
2010-02-02 08:00:04 0
2010-02-02 08:00:04 -1
2010-02-02 08:00:05 1
2010-02-02 08:00:05 1
I've created an array that gives the row index of the first and last tick of each day:
Open Close
[1,] 1 59115
[2,] 59116 119303
[3,] 119304 207300
[4,] 207301 351379
[5,] 351380 426553
[6,] 426554 516742
[7,] 516743 594182
[8,] 594183 683840
[9,] 683841 754962
[10,] 754963 780725
For each day, I would like the empirical distribution of my tick spreads. I know I can use the R function table(), but the first problem is that it returns a table object whose length varies from day to day. The second problem is that the set of observed values differs: on some days I have spreads of 3 points, whereas on other days all spreads are smaller than 3 points.
First day table() output:
-3 -2 -1 0 1 2 3
1 19 6262 46494 6321 16 2
Second day table() output:
-2 -1 0 1 2 3 5
27 5636 48902 5588 33 1 1
What I would like is a single data frame holding the table() output of every day in my tick sample. Any ideas? Thanks.
Upvotes: 2
Views: 731
Reputation: 176648
Just use a 2-dimensional table, with as.Date(index(x)) as the rows:
library(xts)  # provides .xts() and index()
# create some example data
set.seed(21)
p <- sort(runif(6))*(1:6)^2
p <- c(p,rev(p)[-1])
p <- p/sum(p)
P <- sample(-5:5, 1e5, TRUE, p)
x <- .xts(P, (1:1e5)*5)
# create table
table(as.Date(index(x)), x)
# x
# -5 -4 -3 -2 -1 0 1 2 3 4 5
# 1970-01-01 22 141 527 1623 2968 6647 2953 1700 538 139 21
# 1970-01-02 31 142 548 1596 2937 6757 2874 1677 529 167 22
# 1970-01-03 26 172 547 1599 2858 6814 2896 1681 504 163 20
# 1970-01-04 23 178 537 1645 2855 6805 2891 1626 537 165 18
# 1970-01-05 23 139 490 1597 3028 6740 2848 1724 505 158 28
# 1970-01-06 21 134 400 1304 2266 5496 2232 1213 397 112 26
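If the spreads are held in a plain vector together with an Open/Close index array like the one in the question, rather than in an xts object, the same two-way table idea can be sketched in base R as follows. The names `spread`, `idx`, `day` and `daily` and the simulated data are assumptions made for illustration only; they are not taken from the original data set.

```r
# Hypothetical stand-in data; 'spread' and 'idx' are illustrative names
set.seed(1)
spread <- sample(-3:3, 1000, replace = TRUE,
                 prob = c(1, 5, 20, 48, 20, 5, 1))
# First and last row of each "day", laid out like the Open/Close array
idx <- cbind(Open = c(1, 301, 651), Close = c(300, 650, 1000))

# Label every tick with its day number, using the Open/Close row indices
day <- rep(seq_len(nrow(idx)), times = idx[, "Close"] - idx[, "Open"] + 1)

# Two-way table: one row per day, one column per spread value seen on any day;
# a value that does not occur on a given day simply gets a count of 0
tab <- table(day, spread)

# Keep the wide layout when converting the counts to a data frame
daily <- as.data.frame.matrix(tab)
```

Because the columns are built from all values observed over the whole sample, every day ends up with the same columns, which should address the varying-length problem described in the question.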
Upvotes: 2
Reputation: 301
If you want the frequency distribution for the entire 10 day period just concatenate the data and do the same. Is that what you want to do?
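A minimal sketch of that suggestion, assuming the daily spreads are stored as separate vectors in a list (the name `pieces` and the simulated values are hypothetical):

```r
# Hypothetical per-day spread vectors; names and values are illustrative
set.seed(2)
pieces <- list(day1 = sample(-2:2, 200, replace = TRUE),
               day2 = sample(-3:3, 300, replace = TRUE))

# Concatenate all days into a single vector
all_spreads <- unlist(pieces, use.names = FALSE)

# Frequency distribution over the whole period
overall <- table(all_spreads)

# Relative frequencies, i.e. the empirical distribution
rel <- prop.table(overall)
```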
Upvotes: 0