Reputation: 15
Ingesting another sourcetype that provides insane json output. It starts out like:
Sep 1 15:52:26 | IdentityValidationApi | | | | {"header":{"tenantId":"X03LHWE3","requestType":" ...
and has a pipe in between the request and the response, but both are on the same line:
..."serverTime":"2017-09-01T19:52:24.641Z"}}} | {"responseHeader":{"tenantID":
and the json output ends with
...,"fieldValue":"Engineer"}]}}} | D2C CrossCore Request-Response | IdentityValidationApi.corp-dev.com | /api/Inquiry | 172.30.68.88 | | True
I've tried jq, using jq .header[], but it hates that | in the middle of the event. End goal is to ingest the entire event into Splunk without the beginning or end text outside the json. Can someone suggest any steps here? Thank you.
Edit: I can use sed to pull out the beginning of the line, but am unsure how to combine that with removing the text from the end. Can I do that?
Upvotes: 0
Views: 3001
Reputation: 14695
While Jeff's answer pretty much sums it up, here's a specific example assembled from the sample data fragments. If the file data contains
Sep 1 15:52:26 | IdentityValidationApi | | | | {"header":{"tenantId":"X03LHWE3"}, "serverTime":"2017-09-01T19:52:24.641Z"} | {"responseHeader":{"tenantID": "...", "fieldValue":"Engineer"}} | D2C CrossCore Request-Response | IdentityValidationApi.corp-dev.com | /api/Inquiry | 172.30.68.88 | | True
then
$ jq -M -Rc './"|" | .[5] | fromjson' data
will produce just the json fragment from column 5:
{"header":{"tenantId":"X03LHWE3"},"serverTime":"2017-09-01T19:52:24.641Z"}
This filter
$ jq -M -Rc './"|" | (.[5]|fromjson) + (.[6]|fromjson)' data
will combine the objects in columns 5 and 6 into one object:
{"header":{"tenantId":"X03LHWE3"},"serverTime":"2017-09-01T19:52:24.641Z","responseHeader":{"tenantID":"...","fieldValue":"Engineer"}}
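For the sed route raised in the question's edit, the two trims can go into a single sed call as two expressions. This is only a minimal sketch: strip_outer is a made-up helper name, the sample line is shortened, and it assumes the text before the request contains no { and the text after the response contains no }. The pipe between the request and the response is left in place, so the result is still two json objects on one line rather than one json document.

```shell
# strip_outer is a hypothetical helper name, not part of sed or jq.
# First expression: drop everything before the first "{".
# Second expression: drop everything after the last "}".
strip_outer() {
  sed -e 's/^[^{]*//' -e 's/[^}]*$//' "$@"
}

# Shortened sample line in the layout from the question.
printf '%s\n' 'Sep 1 15:52:26 | IdentityValidationApi | | | | {"header":{"tenantId":"X03LHWE3"}} | {"responseHeader":{"tenantID":"..."}} | D2C CrossCore Request-Response | /api/Inquiry | | True' | strip_outer
```

A line with no { at all comes out empty, because the first expression then matches the whole line.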
Upvotes: 0
Reputation: 134521
jq is designed to work with json data. Your input is not pure json. If you can make certain assumptions about your input, then you can probably process the json parts. Any deviation in any of the inputs will break things.
The pipe (|) is only used as a delimiter throughout the file, kind of like a "pipe separated values" file (a la csv but with no escape sequences).
If these assumptions hold true, you could probably use something like this:
$ jq -R 'split("|") | {request:.[5]|fromjson,response:.[6]|fromjson}' input.psv
This should give you one object per input line, with the parsed request and response under the request and response keys. Then you can operate on these.
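As a rough sketch of operating on the parsed request and response: the commands below write a shortened, made-up sample line to sample.psv (a hypothetical file name), print each line as one compact object, and then pull a single field out of the request. They rely on the same assumptions as above, namely json in fields 5 and 6 and no pipes inside the json itself.

```shell
# Hypothetical, shortened sample line in the layout from the question.
cat > sample.psv <<'EOF'
Sep 1 15:52:26 | IdentityValidationApi | | | | {"header":{"tenantId":"X03LHWE3"}} | {"responseHeader":{"tenantID":"..."}} | D2C CrossCore Request-Response | IdentityValidationApi.corp-dev.com | /api/Inquiry | 172.30.68.88 | | True
EOF

# -R reads each line as a raw string; -c prints one compact object per line.
jq -Rc 'split("|") | {request: (.[5] | fromjson), response: (.[6] | fromjson)}' sample.psv

# Drill into the parsed request; -r prints the string without quotes.
jq -Rr 'split("|") | .[5] | fromjson | .header.tenantId' sample.psv
# prints X03LHWE3
```

The -c output keeps one event per line, which is probably the easier shape to hand to Splunk.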
Upvotes: 1