jq remove text before and after json

Question

I'm ingesting another sourcetype that provides insane json output. Each event starts out like:

Sep  1 15:52:26 | IdentityValidationApi |  |  |  | {"header":{"tenantId":"X03LHWE3","requestType":"  ...

and has a pipe in between the request and the response, but both are on the same line:

..."serverTime":"2017-09-01T19:52:24.641Z"}}} | {"responseHeader":{"tenantID":

and the json output ends with

...,"fieldValue":"Engineer"}]}}} | D2C CrossCore Request-Response | IdentityValidationApi.corp-dev.com | /api/Inquiry | 172.30.68.88 |  | True

I've tried jq, using jq .header[], but it hates that | in the middle of the event. The end goal is to ingest the entire event into Splunk without the leading or trailing text outside the json. Can someone suggest any steps here? Thank you.

Edit: I can use sed to pull out the beginning of the line, but am unsure how to combine that with removing the text from the end. Can I do that?

Jeff Mercado · Accepted Answer

jq is designed to work with json data. Your input is not pure json. If you can make certain assumptions about your input, then you can probably process the json parts. Any deviation in any of the inputs will break things.

- The pipe (|) is only used as a delimiter throughout the file, kind of like a "pipe separated values" file (a la csv but with no escape sequences).
    - jq can consume raw lines as strings; if pipes are really only used as delimiters, we don't have to worry about parsing the file beyond splitting on them.
- Data in the file does not span multiple rows; each item occupies a single row.
    - Without parsing the data or assuming any patterns in the file, it would be impossible to know which lines belong to a single item and when a new one starts.
- Your json data will always be found in a fixed column of the psv row.
    - Again, without further processing it would be impossible to know where the request or response parts are in the row if they aren't in fixed places.

If these assumptions hold true, you could probably use something like this:

$ jq -R 'split("|") | {request:.[5]|fromjson,response:.[6]|fromjson}' input.psv

This should give you one object per input line, with the parsed request and response under the request and response keys. Then you can operate on these.
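
If only the surrounding text needs to go (as in the question's edit about sed), a minimal sketch might look like the following. The helper names strip_outer and split_events are made up for illustration, and the sample line is a shortened stand-in for a real event. The sketch assumes the text before the request contains no { and the text after the response contains no }. split_events repeats the jq filter above with fromjson? so that lines which do not parse are skipped instead of stopping the run, and -c prints one compact object per line.

```shell
# Hypothetical helpers; the names are illustrative, not from the answer above.

# strip_outer: delete everything before the first "{" and after the last "}"
# on each line. Assumes the leading text has no "{" and the trailing text
# has no "}". A line with no braces at all comes out empty.
strip_outer() {
  sed -e 's/^[^{]*//' -e 's/[^}]*$//'
}

# split_events: same column assumptions as the jq filter above (request in
# field 5, response in field 6), but "fromjson?" drops lines whose fields
# are not valid json instead of stopping with an error.
split_events() {
  jq -cR 'split("|") | {request: (.[5] | fromjson?), response: (.[6] | fromjson?)}'
}

# Shortened, made-up event in the same shape as the sample in the question.
sample='Sep  1 15:52:26 | IdentityValidationApi |  |  |  | {"header":{"tenantId":"X03LHWE3"}} | {"responseHeader":{"status":"ok"}} | /api/Inquiry | True'

printf '%s\n' "$sample" | strip_outer
```

strip_outer leaves the request and response on one line, still separated by " | ", so the result is two json documents rather than one. When they need to be separate, something like split_events < input.psv is the variant to reach for; note that it drops unparseable lines silently, so it may be worth counting input and output lines.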
