CuriousMind
CuriousMind

Reputation: 15798

kdb/q -- how to number the rows by certain groupings

I have tables with date;sym columns. But each date might have multiple syms. I want to number the occurrences of symbol in each date

For example:

date        sym 
-------------------
2019.06.04  ABC
2019.06.04  DEF
2019.06.04  ABC
2019.06.05  DEF
2019.06.05  ABC

will give me

date        sym   c 
-------------------
2019.06.04  ABC   1
2019.06.04  DEF   1
2019.06.04  ABC   2   / here ABC appears for the second time on this date. 
2019.06.05  DEF   1
2019.06.05  ABC   1

Upvotes: 0

Views: 1437

Answers (2)

Callum Biggs
Callum Biggs

Reputation: 1540

To count the occurrences of syms by date across all of the tables in a HDB we can run a count by date for each of the partitioned tabled .Q.pt and then scan that over pj plus join, as each table is keyed on date (matching keys). As pj is similar to an ij we need to ensure that there are no rows dropped as each date might be missing different syms

q)cntTabs:{2!0!update c:count each sym,sym:first each sym from select sym by date from x} each .Q.pt
q){t:pj[x;y];t,k!y k:key[y] except key[t]}/[cntTabs]

Upvotes: 0

Phil Gunning
Phil Gunning

Reputation: 222

This may be a little cleaner, here the c column is just a running sum of all rows that have been grouped by each combination of date and sym.

q)t:([]date:2019.06.04+0 0 0 1 1;sym:`ABC`DEF`ABC`DEF`ABC)

q)update c:sums i=i by date,sym from t

date       sym c
----------------
2019.06.04 ABC 1
2019.06.04 DEF 1
2019.06.04 ABC 2
2019.06.05 DEF 1
2019.06.05 ABC 1

Upvotes: 5

Related Questions