Reputation: 311
So essentially I'd like to group by two columns, like this:
CREATE TABLE foo AS
SELECT a, b, SUM(a)
FROM whatever
GROUP BY a, b
whatever
is a stream in Kafka format. When I issue the command, ksql returns:
Key format does not support schema.
format: KAFKA
schema: Persistence{columns=[`a` STRING KEY, `b` STRING KEY], features=[]}
reason: The 'KAFKA' format only supports a single field. Got: [`a` STRING KEY, `b` STRING KEY]
Caused by: The 'KAFKA' format only supports a single field. Got: [`DEVICE`
a KEY, `b` STRING KEY]
The problem is that the Kafka format does not support multi-column keys. Is there a way to workaround this, e. g. creating an artificial key in this table? I did not manage to do this.
I saw someone posted a similiar question and the answer seemed to work. I suggest that's because of the format. https://stackoverflow.com/a/50193239/9632703
The documentation mentions that multi-column aggregations might not work, though also saying that ksql does a background workaround to make it work. Unfortunately ksql only returns the given error message. https://www.confluent.de/blog/ksqldb-0-10-updates-key-columns/#multi-column-aggregations
The funny part is that omitting the first line CREATE TABLE foo AS
works. So if some data comes in, the aggregation works. But that's not persistent, of course. If nothing else works, I would also be fine with having a table without a primary key defined, if possible in ksql, since I could still identify the data with {a, b} in my application.
Can someone help me? Thank you.
Upvotes: 2
Views: 4368
Reputation: 32110
You can do this if you upgrade to ksqlDB 0.15. This version introduced multi-key support. You'll need to use a KEY_FORMAT
that supports it.
ksql> CREATE TABLE FOO AS SELECT A, B, SUM(C) FROM TEST_STREAM GROUP BY A,B;
Message
-----------------------------------
Created query with ID CTAS_FOO_53
-----------------------------------
ksql> DESCRIBE FOO;
Name : FOO
Field | Type
---------------------------------------------
A | BIGINT (primary key)
B | VARCHAR(STRING) (primary key)
KSQL_COL_0 | DOUBLE
---------------------------------------------
For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;
ksql> SELECT * FROM FOO EMIT CHANGES LIMIT 5;
+---------------------------+---------------------------+---------------------------+
|A |B |KSQL_COL_0 |
+---------------------------+---------------------------+---------------------------+
|220071000 |AIS |0.4 |
|257838000 |AIS |6.2 |
|538007854 |AIS |22.700000000000003 |
|257487000 |AIS |2.4 |
|257601800 |AIS |5.8999999999999995 |
Limit Reached
Query terminated
Upvotes: 5