JejeBelfort

Reputation: 1663

How to consistently sum lists of values contained in a table?

I have the following two tables:

t1:([]sym:`AAPL`GOOG; histo_dates1:(2000.01.01+til 10;2000.01.01+til 10);histo_values1:(til 10;5+til 10));

t2:([]sym:`AAPL`GOOG; histo_dates2:(2000.01.05+til 5;2000.01.06+til 4);histo_values2:(til 5; 2+til 4));

What I want is to sum the histo_values of each symbol across the histo_dates, such that the resulting table would look like this:

t:([]sym:`AAPL`GOOG; histo_dates:(2000.01.01+til 10;2000.01.01+til 10);histo_values:(0 1 2 3 4 6 8 10 12 9;5 6 7 8 9 12 14 16 18 14))

So the resulting dates histo_dates should be the union of histo_dates1 and histo_dates2, and histo_values should be the sum of histo_values1 and histo_values2 across dates.

EDIT:

To be clear, the resulting histo_dates must be the union of histo_dates1 and histo_dates2.

Upvotes: 2

Views: 141

Answers (3)

DanDan4561

Reputation: 393

Another possible way, using functional amend:

// Join the two tables using sym as the key, then find the idx of the common dates
// Use a functional amend with idx to determine which values to add
// Column-join the histo_dates* columns, take the distinct dates, and drop idx

(enlist `idx) _ select sym,histo_dates:distinct each (histo_dates1,'histo_dates2),
            histo_values:{@[x;z;+;y]}'[histo_values1;histo_values2;idx],idx from
                update idx:(where each histo_dates1 in' histo_dates2) from ((1!t1) uj 1!t2)

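One possible problem with this approach is that the idx lookup relies on the date columns being sorted, which is usually the case. It also needs every date in histo_dates2 to appear in histo_dates1, because the amend only updates positions that already exist in histo_values1.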
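
To make the amend step concrete, here is a minimal sketch on a single row, using the AAPL values from the question. The names v1, v2, d1 and d2 are only illustrative stand-ins for one row of each nested column, and the sketch assumes every date in d2 also occurs in d1, in the same order.

```q
q)v1:til 10                  / stands in for one row of histo_values1
q)v2:til 5                   / stands in for one row of histo_values2
q)d1:2000.01.01+til 10       / stands in for one row of histo_dates1
q)d2:2000.01.05+til 5        / stands in for one row of histo_dates2
q)idx:where d1 in d2         / positions in d1 of the dates that also occur in d2
q)idx
4 5 6 7 8
q)@[v1;idx;+;v2]             / add v2 item-wise to v1 at those positions
0 1 2 3 4 6 8 10 12 9
```

The result matches the AAPL row of the expected table t.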

Upvotes: 0

Sam McMillen

Reputation: 296

There are a few ways. One would be to ungroup to remove nesting, join the tables, aggregate on sym/date and then regroup on sym:

q)0!select histo_dates:histo_dates1, histo_values:histo_values1 by sym from select sum histo_values1 by sym, histo_dates1 from ungroup[t1],cols[t1]xcol ungroup[t2]
sym  histo_dates                                                                                                   histo_values
-------------------------------------------------------------------------------------------------------------------------------------------
AAPL 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 0 1 2 3 4 6  8  10 12 9
GOOG 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 5 6 7 8 9 12 14 16 18 14

A possibly faster way would be to make each row a dictionary and then key the tables on sym and add them:

q)select sym:s, histo_dates:key each v, histo_values:value each v from (1!select s, d!'v from `s`d`v xcol t1)+(1!select s, d!'v from `s`d`v xcol t2)
sym  histo_dates                                                                                                   histo_values
-------------------------------------------------------------------------------------------------------------------------------------------
AAPL 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 0 1 2 3 4 6  8  10 12 9
GOOG 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 5 6 7 8 9 12 14 16 18 14
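
The reason this works is that adding two dictionaries in q takes the union of their keys and sums the values where a key appears in both. A small sketch with made-up dictionaries a and b (hypothetical names and values, not taken from the tables above):

```q
q)a:2000.01.01 2000.01.02 2000.01.03!1 2 3       / dates -> values, like one row of d!'v
q)b:2000.01.02 2000.01.03 2000.01.04!10 20 30    / overlaps a on two dates, adds one new date
q)a+b                                            / union of keys; values summed on shared keys
2000.01.01| 1
2000.01.02| 12
2000.01.03| 23
2000.01.04| 30
```

Keyed tables are dictionaries too, so the same union behaviour applies to the sym key when the two keyed tables are added.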

Another option would be to use a plus join pj:

q)0!`sym xgroup 0!pj[ungroup `sym`histo_dates`histo_values xcol t1;2!ungroup `sym`histo_dates`histo_values xcol t2]
sym  histo_dates                                                                                                   histo_values
-------------------------------------------------------------------------------------------------------------------------------------------
AAPL 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 0 1 2 3 4 6  8  10 12 9
GOOG 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 5 6 7 8 9 12 14 16 18 14

See here for more on plus joins: https://code.kx.com/v2/ref/pj/
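
As a rough illustration of what pj does, here is a sketch with two small made-up tables, lhs and rhs (hypothetical names, not from the question). Note that pj is a left join: keys that exist only in the right-hand table do not appear in the result, so the approach above gives the full union of dates only because t1 already contains every date in t2.

```q
q)lhs:([]k:`a`b`c;v:1 2 3)                / hypothetical left table
q)rhs:([k:`b`c`d]v:10 20 30)              / hypothetical right table, keyed on k
q)(lhs pj rhs)~([]k:`a`b`c;v:1 12 23)     / matched rows are added; `a is unchanged; `d is dropped
1b
```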

EDIT: To explicitly make sure the result has the union of the dates, you could use a union join:

q)0!`sym xgroup select sym,histo_dates,histo_values:hv1+hv2 from 0^uj[2!ungroup `sym`histo_dates`hv1 xcol t1;2!ungroup `sym`histo_dates`hv2 xcol t2]
sym  histo_dates                                                                                                   histo_values
-------------------------------------------------------------------------------------------------------------------------------------------
AAPL 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 0 1 2 3 4 6  8  10 12 9
GOOG 2000.01.01 2000.01.02 2000.01.03 2000.01.04 2000.01.05 2000.01.06 2000.01.07 2000.01.08 2000.01.09 2000.01.10 5 6 7 8 9 12 14 16 18 14

Upvotes: 4

Sean O'Hagan

Reputation: 1697

Another way:

// rename the columns to be common names, ungroup the tables, and place the key on `sym and `histo_dates
q){2!ungroup `sym`histo_dates`histo_values xcol x} each (t1;t2)

// add them together (or use pj in place of +), group on `sym
q)`sym xgroup (+) . {2!ungroup `sym`histo_dates`histo_values xcol x} each (t1;t2)

// and to test this matches t, remove the key from the resulting table
q)t~0!`sym xgroup (+) . {2!ungroup `sym`histo_dates`histo_values xcol x} each (t1;t2)
1b

Upvotes: 0
