Terry

Reputation: 519

Replay tplog in kdb

I tried to replay a tplog in kdb. The log is about 1 GB, but the replay ran for a whole day and still did not finish.

One of the tables has two key columns, and both are strings. I am not sure whether this is why the replay takes forever. I also use upsert instead of insert. What should I change to make the replay faster?

My idea is to change the upd function so that those two columns are converted to symbols as the data is loaded into memory.

Upvotes: 0

Views: 364

Answers (1)

terrylynch

Reputation: 13657

As discussed in the comments, if the purpose of using a keyed table with upsert was to keep a unique row per key combination, then it may be more efficient (for log replay) to start with an unkeyed table and set upd to insert. Once the replay is complete, reduce the table to the last row for each key combination and key it, for example:

`keyCol1`keyCol2 xkey select from replayedTable where i=(last;i) fby ([]keyCol1;keyCol2)

Note that insert cannot be used with a keyed table for this, as it does not allow an insert when the key already exists in the table. This is why the table has to be unkeyed during the replay.
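Put together, the whole approach might look roughly like the sketch below. The table schema, the val column and the log path are placeholders rather than anything from the question, and it assumes the log holds the usual upd messages whose data is in a form insert accepts.

```q
/ Sketch only: table name, column names, val column and log path are hypothetical.

/ Start with an empty, unkeyed table so every replayed row is a plain append.
replayedTable:([] keyCol1:(); keyCol2:(); val:`float$())

/ Replayed messages call upd[tableName;data]; point upd at insert.
upd:insert

/ Replay the log file (replace the placeholder path with the real tplog).
-11!`:/path/to/tplog

/ Keep only the last row per key pair, then key the result.
replayedTable:`keyCol1`keyCol2 xkey select from replayedTable where i=(last;i) fby ([]keyCol1;keyCol2)
```

The deduplication runs once over the whole table after the replay, instead of doing a key lookup for every message in the log.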

The original upsert on the keyed table was possibly slow because of the number of keys and the cost of looking up each incoming key in the key columns (strings in particular) on every upsert. This is just speculation, though.

Upvotes: 1
