Reputation: 4716
Let's have a table storing point-in-time (bitemporal) data:
pit:([]dt:`date$();sym:`symbol$();val:`float$();stamp:`timestamp$())
Sample data might look like:
`pit insert (2015.01.05 2015.01.06 2015.01.05;`IBM`IBM`MSFT;105.11 106.6 35.3; 2015.02.01D05:01:25.0 2015.02.01D05:01:25.0 2015.02.01D05:01:25.0)
pit
dt sym val stamp
----------------------------------------------------
2015.01.05 IBM 105.11 2015.02.01D05:01:25.000000000
2015.01.06 IBM 106.6 2015.02.01D05:01:25.000000000
2015.01.05 MSFT 35.3 2015.02.01D05:01:25.000000000
For example, on the stamp
point in time, we've recorded for IBM
symbol a value of 105.11 valid for 2015.01.05
There are new data coming in continuously and some of them are to be inserted as new records into the pit
table, but only if they convey new information. No existing records in pit
may be deleted nor updated/overwritten. I.e. we want to keep track of outdated (if any) values for auditing or authenticity purposes. Think of earnings estimate updates over time.
For example, later on we might receive:
new:([]dt:`date$();sym:`symbol$();val:`float$())
`new insert (2015.01.05 2015.01.06;`IBM`IBM;105.22 106.6)
new
dt sym val
---------------------
2015.01.05 IBM 105.22
2015.01.06 IBM 106.6
After incorporating new
information into pit
, the latter shall look like:
pit
dt sym val stamp
----------------------------------------------------
2015.01.05 IBM 105.11 2015.02.01D05:01:25.000000000
2015.01.05 IBM 105.22 2015.03.10D15:43:50.000000000
2015.01.06 IBM 106.6 2015.02.01D05:01:25.000000000
2015.01.05 MSFT 35.3 2015.02.01D05:01:25.000000000
Note the "new" 105.22
value for IBM
symbol entered the pit
with the current timestamp (2015.03.10D15:43:50
at the time of writing). Also, the 106.6
value from new
did not update the timestamp in the pit
in any way, because we had had that value already reflected in the pit
carrying older timestamp.
How to write the respective insert
statements using q
?
Note: For what it's worth, pit
is to be partitioned by additional column source
not shown here for the sake of simplicity. Also, partitions will have `g#sym
(possibly `p#sym
instead) and `s#dt
attributes.
Upvotes: 0
Views: 268
Reputation: 3969
You can use 'upsert' in following way:
reference: http://code.kx.com/q/ref/qsql/#upsert
Step 1. I am adding status column in 'new' table (otherwise it will be null for new rows in final table)
q) new:update stamp:`timestamp$.z.z from new
Step 2. Make those columns of 'new' as primary keys whose modification is criteria for addition of row. In your case it's all columns except stamp column.
q)new:except[cols new;`stamp] xkey new
Step 3: Upsert tables as (check the order: pit is on right hand side)
q) pit: 0!new upsert pit
I have removed primary keys attribute from result.
Step 4: Order the table according to stamp (or any other column).
q) pit: `stamp xasc pit
Here is one function to do all those steps:
q) myinsert:{[pit;new] `stamp xasc 0!(except[cols new;`stamp] xkey new:update stamp:`timestamp$.z.z from new) upsert pit}
Upvotes: 2