three-sigma
three-sigma

Reputation: 7

Conditional update of column

I have a trades table comprised of date, time, symbol, volume, closing auction volume.

Unfortunately the closing auction volume is including in the last bin of the table, and to complicate matters it can either occur at 15:59:00 or at 12:59:00 during half days.

Is there a way to conditionally update the volume to remove the closing volume?

e.g remove it from 15:59:00's volume if it exists, otherwise remove it from 12:59:00's volume.

For example

t:([]date:2019.02.01 2019.02.01 2019.02.02 2019.02.02;time:12:59:00 15:59:00 12:59:00 15:59:00;sym:`AAPL`AAPL`AAPL`AAPL;volume:100 25000 26000 0; closingvol: 24000 24000 21000 21000) 

**date**|**time**|**sym**|**volume**|**closingvol**
:-----:|:-----:|:-----:|:-----:|:-----:
2019-02-01|12:59:00|AAPL|100|24000
2019-02-01|15:59:00|AAPL|25000|24000
2019-02-02|12:59:00|AAPL|26000|21000
2019-02-02|15:59:00|AAPL|0|21000

I would like to be

**date**|**time**|**sym**|**volume**|**closingvol**
:-----:|:-----:|:-----:|:-----:|:-----:
2019-02-01|12:59:00|AAPL|100|24000
2019-02-01|15:59:00|AAPL|1000|24000
2019-02-02|12:59:00|AAPL|5000|21000
2019-02-02|15:59:00|AAPL|0|21000

I was hoping I could get away with the below, but the "or" appearently doesn't behave as I was hoping it would as it never modifies the second 12:59:00 entry.

update volume:volume-closingvol from t where (time=15:59:00 | time=12:59:00), volume>=closingvol

**date**|**time**|**sym**|**volume**|**closingvol**
:-----:|:-----:|:-----:|:-----:|:-----:
2019-02-01|12:59:00|AAPL|100|24000 
2019-02-01|15:59:00|AAPL|1000|24000
2019-02-02|12:59:00|AAPL|26000|21000
2019-02-02|15:59:00|AAPL|0|21000

** UPDATE 1**

As suggested I also tried:

update volume:volume-closingvol from t where (time=15:59:00)|time=12:59:00, volume>=closingvol

e.g. As seen below ALL volumes are updated. I would have expected that on 2019.02.01 only the 15:59:00 for AAPL, and 12:59:00 for BAC would be updated, and on 2019.02.02 only the 15:59:00 for BAC and 12:59:00 for AAPL, but this modified all instances of 12:59:00 and 15:59:00.

t:([]date:2019.02.01 2019.02.01 2019.02.02 2019.02.02 2019.02.02 2019.02.02 2019.02.01 2019.02.01;time:12:59:00 15:59:00 12:59:00 15:59:00 12:59:00 15:59:00 12:59:00 15:59:00;sym:`AAPL`AAPL`AAPL`AAPL`BAC`BAC`BAC`BAC;volume:100 25000 26000 0 20000 12000 13000 0; closingvol: 24000 24000 21000 21000 11000 11000 12000 12000)

t:`date`time xasc t

update volume:volume-closingvol from t where (time=15:59:00)|(time=12:59:00), volume>=closingvol


**date**|**time**|**sym**|**volume**|**closingvol**
:-----:|:-----:|:-----:|:-----:|:-----:
2019-02-01|12:59:00|AAPL|100|24000
2019-02-01|12:59:00|BAC|1000|12000
2019-02-01|15:59:00|AAPL|1000|24000
2019-02-01|15:59:00|BAC|0|12000
2019-02-02|12:59:00|AAPL|5000|21000
2019-02-02|12:59:00|BAC|9000|11000
2019-02-02|15:59:00|AAPL|0|21000
2019-02-02|15:59:00|BAC|1000|11000

Upvotes: 0

Views: 342

Answers (3)

Rahul
Rahul

Reputation: 3969

Your example shows that volume could be bigger than closing volume for both times - (12:59:00 15:59:00) for same date,sym pair. Thats why condition 'vol>=closingvol' is giving incorrect answer (which is used by you in your example and also used in other answers to your post).

Below solution would work based on following assumptions:

  1. All date,sym combinations have both times - (12:59:00 15:59:00).
  2. Data is ordered by time.

Even if above assumptions don't hold in actual scenario, it is very easy to change below query to work according to that.

  q) t:([]date:2019.02.01 2019.02.01 2019.02.02 2019.02.02 2019.02.02 2019.02.02 2019.02.01 2019.02.01;time:12:59:00 15:59:00 12:59:00 15:59:00 12:59:00 15:59:00 12:59:00 15:59:00;sym:`AAPL`AAPL`AAPL`AAPL`BAC`BAC`BAC`BAC;volume:100 25000 26000 0 20000 12000 13000 0; closingvol: 24000 24000 21000 21000 11000 11000 12000 12000)

  q) update volume:{?[0=x 1;(x[0]-y[0]),x 1;x[0],x[1]-y[1]]}[volume;closingvol] by date,sym from t where  time in (12:59:00 15:59:00)

or other version of same query:

  q) update volume: volume-closingvol*(0 1;1 0)0=volume 1 by date,sym from t where  time in (12:59:00 15:59:00)
date       time     sym  volume closingvol
------------------------------------------
2019.02.01 12:59:00 AAPL 100    24000     
2019.02.01 15:59:00 AAPL 1000   24000     
2019.02.02 12:59:00 AAPL 5000   21000     
2019.02.02 15:59:00 AAPL 0      21000     
2019.02.02 12:59:00 BAC  20000  11000     
2019.02.02 15:59:00 BAC  1000   11000     
2019.02.01 12:59:00 BAC  1000   12000     
2019.02.01 15:59:00 BAC  0      12000 

You could also avoid lambda function in the query by replacing x and y with volume and closingvol. I used this way to make it a bit smaller.

Upvotes: 0

Jemma Borland
Jemma Borland

Reputation: 390

The vector conditional ? might be useful here:

update volume:?[time in 12:59:00 15:59:00;volume-closingvol;volume] from t where vol>=closingvol

The first argument expects a Boolean list - created by the time in 12:59:00 15:59:00 check, and applies the first condition (removing the closingvol) where this list returns True, else applies the second condition (leaves volume as it is).

Upvotes: 0

jomahony
jomahony

Reputation: 1692

At the moment the following is occuring as q evaluates each constraint within the where clause from right to left:

q)time:12:59:00 15:59:00 12:59:00 15:59:00
q)15:59:00 | time=12:59:00
15:59:00 15:59:00 15:59:00 15:59:00

| is behaving as max in this case: http://code.kx.com/q/ref/arith-integer/#or-maximum

Just change the placement of the parentheses:

q)update volume:volume-closingvol from t where (time=15:59:00)|time=12:59:00, volume>=closingvol
date       time     sym  volume closingvol
------------------------------------------
2019.02.01 12:59:00 AAPL 100    24000     
2019.02.01 15:59:00 AAPL 1000   24000     
2019.02.02 12:59:00 AAPL 5000   21000     
2019.02.02 15:59:00 AAPL 0      21000  

Edit for secondary example -

You can utilise fby (http://code.kx.com/q/ref/qsql/#fby) which will allow you to add an additional constraint, updating the max record (of time 12:59 or 15:59) for each sym/date:

q)update volume:volume-closingvol from t where (time=15:59:00)|time=12:59:00,volume>=closingvol,time=(max;time)fby ([]date;sym)
date       time     sym  volume closingvol
------------------------------------------
2019.02.01 12:59:00 AAPL 100    24000     
2019.02.01 12:59:00 BAC  1000   12000     
2019.02.01 15:59:00 AAPL 1000   24000     
2019.02.01 15:59:00 BAC  0      12000     
2019.02.02 12:59:00 AAPL 5000   21000     
2019.02.02 12:59:00 BAC  20000  11000     
2019.02.02 15:59:00 AAPL 0      21000     
2019.02.02 15:59:00 BAC  1000   11000    

Upvotes: 1

Related Questions