JDBC sink topic with multiple structs to postgres

Question

I am trying to sink a few topics to a Postgres database. However, the topic schema defines an array at the top level, and within it multiple structs. Automapping does not work, and I cannot find any reference on how to handle this. I need all the structs because they are dependent types: the second struct references the first struct as a field.

Currently it breaks when it hits the second struct, stating that statusChangeEvent (struct) has no mapping to a SQL column type. This is because it uses auto.create to make a table (probably called ProcessStatus); then, when it hits the second entry, there is of course no matching column.

[
    {
        "type": "record",
        "name": "processStatus",
        "namespace": "company.some.process",
        "fields": [
            {
                "name": "code",
                "doc": "The code of the processStatus",
                "type": "string"
            },
            {
                "name": "name",
                "doc": "The name of the processStatus",
                "type": "string"
            },
            {
                "name": "description",
                "type": "string"
            },
            {
                "name": "isCompleted",
                "type": "boolean"
            },
            {
                "name": "isSuccessfullyCompleted",
                "type": "boolean"
            }
        ]
    },
    {
        "type": "record",
        "name": "StatusChangeEvent",
        "namespace": "company.some.process",
        "fields": [
            {
                "name": "contNumber",
                "type": "string"
            },
            {
                "name": "processId",
                "type": "string"
            },
            {
                "name": "processVersion",
                "type": "int"
            },
            {
                "name": "extProcessId",
                "type": [
                    "null",
                    "string"
                ],
                "default": null
            },
            {
                "name": "fromStatus",
                "type": "process.status"
            },
            {
                "name": "toStatus",
                "doc": "The new status of the process",
                "type": "company.some.process.processStatus"
            },
            {
                "name": "changeDateTime",
                "type": "long",
                "logicalType": "timestamp-millis"
            },
            {
                "name": "isPublic",
                "type": "boolean"
            }
        ]
    }
]

I am not using ksql at the moment. Which connector settings are suited for this task? If there is a ksql alternative, it would be nice to know, but the current requirement is to use the JDBC connector.

I tried using flatten, but it does not support struct fields that have a schema, which seems kind of weird. Aren't schemas the whole selling point of connect with kafka? Or is it more of a constraint you have to work around?

OneCricketeer · Accepted Answer

Aren't schemas the whole selling point of connect with kafka?

Yes, but Postgres (or the JDBC sink in general) doesn't really support nested objects within columns. For that, you're better off with a document database, for example via the Mongo Sink Connector.

Which connector settings are suited for this task?

None, really, other than transforms. You could write your own transform if flatten doesn't work.
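
To illustrate what such a custom transform would have to do: its core is a recursive walk that turns nested fields into prefixed top-level columns. Real Connect transforms are Java classes, so the Python below is only a sketch of the logic. The function name `flatten_record`, the `_` delimiter, and the sample values are assumptions made for illustration, not part of any connector.

```python
def flatten_record(record, prefix="", delimiter="_"):
    """Flatten nested dicts into one level, joining key paths with `delimiter`."""
    flat = {}
    for key, value in record.items():
        # Build the column name from the path walked so far.
        name = f"{prefix}{delimiter}{key}" if prefix else key
        if isinstance(value, dict):
            # Nested struct: recurse and merge its fields into the top level.
            flat.update(flatten_record(value, name, delimiter))
        else:
            flat[name] = value
    return flat


# Hypothetical StatusChangeEvent value (made-up sample data).
event = {
    "contNumber": "C-1",
    "processId": "P-1",
    "toStatus": {"code": "DONE", "name": "Done", "isCompleted": True},
}

flat_event = flatten_record(event)
```

Each nested status field ends up as its own scalar column (for example `toStatus_code`), which is the shape a relational table can store.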

You could try pre-defining your table to use JSONB for the two status columns; however, that's more of a workaround.
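
A rough sketch of what that workaround could look like. The table name `status_change_event` is an assumption, as is using the schema's field names as column names; the connector's actual table naming depends on its configuration. The nested statuses are serialized to JSON text intended for the JSONB columns. Whether the JDBC sink accepts struct fields for such columns without an extra transform is not confirmed here.

```python
import json

# Hypothetical DDL: the table name is assumed, column names follow the schema fields.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS status_change_event (
    "contNumber"     TEXT NOT NULL,
    "processId"      TEXT NOT NULL,
    "processVersion" INTEGER NOT NULL,
    "extProcessId"   TEXT,
    "fromStatus"     JSONB NOT NULL,
    "toStatus"       JSONB NOT NULL,
    "changeDateTime" BIGINT NOT NULL,
    "isPublic"       BOOLEAN NOT NULL
);
"""


def to_row(event):
    """Copy the event, replacing the nested status structs with JSON text."""
    row = dict(event)
    for column in ("fromStatus", "toStatus"):
        row[column] = json.dumps(event[column], sort_keys=True)
    return row


# Made-up sample event for illustration only.
status = {"code": "DONE", "name": "Done", "description": "Finished",
          "isCompleted": True, "isSuccessfullyCompleted": True}
row = to_row({"contNumber": "C-1", "processId": "P-1", "processVersion": 1,
              "extProcessId": None, "fromStatus": status, "toStatus": status,
              "changeDateTime": 0, "isPublic": False})
```

The trade-off is that the status fields are no longer individual columns, so queries have to reach into the JSON document.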
