Reputation: 11
I've got a string coming in from a Kafka stream that looks a lot like the Usage Example of the JSONFORMAT documentation.
What I'm struggling with now is how to convert it into a traversable JSON, such that I can, in the example from the documentation, put the "age" value into a column in my table.
I've tried the following:
jsonPath(JSONFORMAT(meta), '$.age')"
jsonPath(TOJSONMAPSTR(meta), '$.age')"
which are combinations of the JSONFORMAT and jsonPath functions.
I've validated that the jsonPath
function works as expected when the data from my Kafka stream is actually formatted as a JSON, i.e. "jsonPath(source, '$.init_sync_state')"
, so I'm not suspecting that to be the root cause of my troubles.
I'd expect this combination of functions to first convert my string into a JSON, and then extract the value from the specific path. However, it seems that JSONFORMAT isn't actually converting the data into a JSON. I've also tried TOJSONMAPSTR but it yielded the same result.
Is there a way to convert a string into JSON structure in Pinot? Am I missing something obvious?
Any help is greatly appreciated. Thank you!
Upvotes: 0
Views: 421
Reputation: 131
One more thing to add here is to set the maxLength
field in data schema for the string column. (Ref: https://docs.pinot.apache.org/configuration-reference/schema#advanced-fields)
This is to ensure pinot indexes the full json string, otherwise the string might be truncated and the json function will fail for parsing error.
Upvotes: 0
Reputation: 11
Solved:
The solution was to use a jsonIndex which flattens out the JSON into a STRING
column, and then use jsonPath
to access it.
I.e. in the example from above, we'd do:
"jsonIndexConfigs": {
"meta": {
"maxLevels": 1,
"excludeArray": false,
"disableCrossArrayUnnest": true,
"includePaths": null,
"excludePaths": null,
"excludeFields": null
}
}
and then insert age into my column using a transformation column that looks like this:
{
"columnName": "age",
"transformFunction": "jsonPath(document, '$.age')"
},
Upvotes: 1