Reputation: 33
I have some data that I want to analyze. The schema is as follows:
timestamp price volume Qualifiers TradeCatogary
------------------- ----- ------ ---------- -------------
2016-07-11 01:00:00 3 2323 U OML
2016-07-11 01:02:03 2.5 434 K KCL
2016-07-11 01:03:34 4 3244 U KCL
2016-07-11 01:23:00 2.3 45 K OML
...
What I want is to calculate the total shares traded, the value traded, the trade count, and the average price (VWAP = dollarTraded / totalVolume) for each 5-minute interval. The desired schema is as follows:
timestamp tradeCount totalVolume dollarTraded VWAP
--------- ---------- ----------- ------------ ----
This isn't hard, and I managed it with SQL like this:
SELECT
TIMESTAMP_SECONDS(CAST((ROUND(UNIX_SECONDS(Date_Time) / 300) * 300) AS int64)) AS interval_alias1,
count(*) AS cnt,
SUM(Volume) AS ShareVolumeTraded,
SUM(Price * Volume) AS DollarVolumeTraded,
(SUM(Price * Volume) / SUM(Volume)) AS VWAP
FROM
`dbd-sdlc-prod.HKG_NORMALISED.HKG_NORMALISED`
WHERE
RIC = '1606.HK'
AND Type="Trade"
AND (Date_Time BETWEEN TIMESTAMP('2016-07-11 00:00:00.000000') AND
TIMESTAMP('2016-07-11 23:59:59.999999'))
AND Volume >0
AND Price >0
GROUP BY
ROUND(UNIX_SECONDS(Date_Time) / 300), interval_alias1
ORDER BY interval_alias1
However, I want to push the analysis a bit further. Besides the overall summary of trades, I'd like a more specific breakdown for each combination of Qualifiers and TradeCatogary, with columns such as:
count_U volume_U dollar_U VWAP_U count_U_OML volume_U_OML dollar_U_OML VWAP_U_OML ....
There are two qualifiers and two categories in the example, so there will be:
3(basic) + 2 * 3(only qualifier) + 3 * 2 * 2(qualifier and catogary combination) = 21 columns
If there is no such trade in a time slot, the value should default to 0.
I'd like to get all of this from a single SQL execution, but I'm not sure how to approach it. Any hints would be appreciated. Thanks in advance.
Upvotes: 0
Views: 540
Reputation: 812
As mentioned in the comments, GROUP BY ROLLUP is appropriate for your case. GROUP BY ROLLUP(a, b) returns the aggregates for each prefix of the listed columns: (a, b), (a), and the grand total. That matches the total, per-qualifier, and per-qualifier-and-category breakdown you describe. Since you want the output as columns rather than rows, you can use the PIVOT operator to turn rows into columns. For your other requirement, combining the values of two columns into a single label, you can use the CONCAT function.
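The approach above could be sketched roughly as follows. This is untested against your table and rests on a few assumptions: that Qualifiers and TradeCatogary are columns of the same table and never contain NULL themselves, that the only values are the ones in your sample (U, K, OML, KCL), and that you want buckets starting on the 5-minute boundary (hence DIV rather than the ROUND in your query). The names interval_start, grp, cnt, total_volume and total_dollar are made up for illustration.

```sql
WITH trades AS (
  SELECT
    -- DIV floors to the start of each 5-minute bucket
    -- (the query in the question uses ROUND, which centers the buckets instead).
    TIMESTAMP_SECONDS(DIV(UNIX_SECONDS(Date_Time), 300) * 300) AS interval_start,
    Qualifiers,      -- assumed column name, taken from the sample schema
    TradeCatogary,   -- assumed column name, taken from the sample schema
    Price,
    Volume
  FROM `dbd-sdlc-prod.HKG_NORMALISED.HKG_NORMALISED`
  WHERE RIC = '1606.HK'
    AND Type = 'Trade'
    AND Date_Time BETWEEN TIMESTAMP('2016-07-11 00:00:00.000000')
                      AND TIMESTAMP('2016-07-11 23:59:59.999999')
    AND Volume > 0
    AND Price > 0
),
rolled AS (
  SELECT
    interval_start,
    -- In ROLLUP output, a NULL grouping column marks a subtotal row.
    CASE
      WHEN Qualifiers IS NULL THEN 'ALL'
      WHEN TradeCatogary IS NULL THEN Qualifiers
      ELSE CONCAT(Qualifiers, '_', TradeCatogary)
    END AS grp,
    COUNT(*) AS cnt,
    SUM(Volume) AS total_volume,
    SUM(Price * Volume) AS total_dollar
  FROM trades
  GROUP BY ROLLUP(interval_start, Qualifiers, TradeCatogary)
)
SELECT *
FROM (
  -- Drop the grand-total row across all intervals that ROLLUP also produces.
  SELECT * FROM rolled WHERE interval_start IS NOT NULL
)
PIVOT (
  SUM(cnt) AS cnt,
  SUM(total_volume) AS volume,
  SUM(total_dollar) AS dollar
  FOR grp IN ('ALL', 'U', 'K', 'U_OML', 'U_KCL', 'K_OML', 'K_KCL')
)
ORDER BY interval_start
```

This should give one row per interval with columns such as cnt_ALL, volume_U and dollar_U_OML. Combinations with no trades come back as NULL rather than 0, so you would wrap them in IFNULL(..., 0) in an outer SELECT, and the VWAP columns can be derived there as well, for example SAFE_DIVIDE(dollar_U, volume_U). PIVOT needs the list of values spelled out, so any new qualifier or category has to be added to the IN list by hand.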
Upvotes: 1