James
James

Reputation: 1260

kdb+/q: Fast vector update given a list of keys and values to be updated

Given a list of ids/keys and a set of corresponding values for a constant column:

q)ikeys: 1 2 3 5;
q)ivals: 100 100.5 101.5 99.5;

What is the fastest way to update the `toupd column in the following table such that the rows that match the given ikeys are updated to the new values in ivals:i.e.

q) show tab;
    ikeys | `toupd `noupd
    ------|--------------
       1  |  0.5    1
       2  |  100.5  2
       3  |  500.5  4
       4  |  400.5  8
       5  |  400.5  16
       6  |  600.5  32
       7  |  700.5  64

is updated to:

q) show restab;
    ikeys | `toupd `noupd
    ------|--------------
       1  |  100    1
       2  |  100.5  2
       3  |  101.5  4
       4  |  400.5  8
       5  |  99.5   16
       6  |  600.5  32
       7  |  700.5  64

furthermore, is there a canonical method with which one could update multiple columns in this manner.

thanks

Upvotes: 0

Views: 712

Answers (3)

terrylynch
terrylynch

Reputation: 13657

A dot amend is another approach which more easily generalises to more than one column. It can also take advantage of amend-in-place which would be the most memory efficient approach as it doesn't create a duplicate copy of the table in memory (assumes global).

ikeys:1 2 3 5
ivals:100 100.5 101.5 99.5
tab:([ikeys:1+til 7]toupd:.5 100.5 500.5 400.5 400.5 600.5 700.5;noupd:1 2 4 8 16 32 64)

q).[tab;(([]ikeys);`toupd);:;ivals]
ikeys| toupd noupd
-----| -----------
1    | 100   1
2    | 100.5 2
3    | 101.5 4
4    | 400.5 8
5    | 99.5  16
6    | 600.5 32
7    | 700.5 64

/amend in place
.[`tab;(([]ikeys);`toupd);:;ivals]

/generalise to two columns
q).[tab;(([]ikeys);`toupd`noupd);:;flip(ivals;1000 2000 3000 4000)]
ikeys| toupd noupd
-----| -----------
1    | 100   1000
2    | 100.5 2000
3    | 101.5 3000
4    | 400.5 8
5    | 99.5  4000
6    | 600.5 32
7    | 700.5 64

/you could amend in place here too
.[`tab;(([]ikeys);`toupd`noupd);:;flip(ivals;1000 2000 3000 4000)]

Upvotes: 2

cillianreilly
cillianreilly

Reputation: 2116

A dictionary is also a common method of updating values given a mapping. Indexing the dictionary with the ikeys column gives the new values and then we fill in nulls with the old toupd column values.

q)show d:ikeys!ivals
1| 100
2| 100.5
3| 101.5
5| 99.5
q)update toupd:toupd^d ikeys from tab
ikeys| toupd noupd
-----| -----------
1    | 100   1
2    | 100.5 2
3    | 101.5 4
4    | 400.5 8
5    | 99.5  16
6    | 600.5 32
7    | 700.5 64

It also worth noting that the update condition with the where clause is not guaranteed to work in all cases, e.g. if you have more mapping values than appear in your ikeys column.

q)m:ikeys:1 2 3 5 7 11
q)ivals:100 100.5 101.5 99.5 100 100
q)update toupd: ivals from tab where ikeys in m
'length

Upvotes: 1

user12704155
user12704155

Reputation:

Here are two different ways of doing it.

tab lj ([ikeys] toupd: ivals)

or

m: ikeys
update toupd: ivals from tab where ikeys in m

I'm sure there are plenty more ways. If you want to find out which is fastest for your purpose (and your data), try using q)\t:1000 yourCodeHere for large tables and see which suits you best.

As for which is the canonical way for multiple columns, I imagine it would be the update, but it's a matter of personal preference, just do whatever is fastest.

Upvotes: 2

Related Questions