Brendan Sigale
Brendan Sigale

Reputation: 11

Pinot - Converting a String to traversable JSON

I've got a string coming in from a Kafka stream that looks a lot like the Usage Example of the JSONFORMAT documentation.

What I'm struggling with now is how to convert it into a traversable JSON, such that I can, in the example from the documentation, put the "age" value into a column in my table.

I've tried the following:

jsonPath(JSONFORMAT(meta), '$.age')"

jsonPath(TOJSONMAPSTR(meta), '$.age')"

which are combinations of the JSONFORMAT and jsonPath functions.

I've validated that the jsonPath function works as expected when the data from my Kafka stream is actually formatted as a JSON, i.e. "jsonPath(source, '$.init_sync_state')", so I'm not suspecting that to be the root cause of my troubles.

I'd expect this combination of functions to first convert my string into a JSON, and then extract the value from the specific path. However, it seems that JSONFORMAT isn't actually converting the data into a JSON. I've also tried TOJSONMAPSTR but it yielded the same result.

Is there a way to convert a string into JSON structure in Pinot? Am I missing something obvious?

Any help is greatly appreciated. Thank you!

Upvotes: 0

Views: 421

Answers (2)

fxx
fxx

Reputation: 131

One more thing to add here is to set the maxLength field in data schema for the string column. (Ref: https://docs.pinot.apache.org/configuration-reference/schema#advanced-fields)

This is to ensure pinot indexes the full json string, otherwise the string might be truncated and the json function will fail for parsing error.

Upvotes: 0

Brendan Sigale
Brendan Sigale

Reputation: 11

Solved:

The solution was to use a jsonIndex which flattens out the JSON into a STRING column, and then use jsonPath to access it.

I.e. in the example from above, we'd do:

"jsonIndexConfigs": {
   "meta": {
       "maxLevels": 1,
       "excludeArray": false,
       "disableCrossArrayUnnest": true,
       "includePaths": null,
       "excludePaths": null,
       "excludeFields": null
       }
}

and then insert age into my column using a transformation column that looks like this:

{
   "columnName": "age",
   "transformFunction": "jsonPath(document, '$.age')"
},

Upvotes: 1

Related Questions