Parsing out awkward JSON in Logstash

Afternoon,

I've been trying to sort this out for the past few weeks and cannot find a solution. We receive some logs via a third party, and so far I've used grok to pull the value below into the details field. Annoyingly, this would be extremely simple if it weren't for all the backslashes.

Is there an easy way to parse this data out as JSON in Logstash?

{\"CreationTime\":\"2021-05-11T06:42:44\",\"Id\":\"xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx\",\"Operation\":\"SearchMtpBatch\",\"OrganizationId\":\"xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx\",\"RecordType\":52,\"UserKey\":\"[email protected]\",\"UserType\":5,\"Version\":1,\"Workload\":\"SecurityComplianceCenter\",\"UserId\":\"[email protected]\",\"AadAppId\":\"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx\",\"DataType\":\"MtpBatch\",\"DatabaseType\":\"DataInsights\",\"RelativeUrl\":\"/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx\",\"ResultCount\":\"1\"}

Answers (2)

tomr

If your source data actually contains those backslashes, then you need to somehow remove them before Logstash can recognise the message as valid JSON.

You could do that before it hits Logstash, in which case the json codec will probably work as expected. Or, if you want Logstash to handle it, you can use the mutate filter's gsub option, followed by the json filter to parse the result:

filter {
  mutate {
    gsub => ["message", "[\\]", "" ]
  }
  json {
    source => "message"
  }
}

A couple of things to note: this blindly strips out all backslashes. If your strings might ever contain legitimate backslashes, you need to do something a little more sophisticated. I've had trouble escaping backslashes in gsub before and found that wrapping the backslash in a regex character class ([\\]) is safer.

Here's a docker one-liner to run that config. The stdin input and stdout output are the default when using -e to specify config on the command line, so I've omitted them here for readability:

docker run --rm -it docker.elastic.co/logstash/logstash:7.12.1 -e 'filter { mutate { gsub => ["message", "[\\]", "" ]} json { source => "message" } }'

Pasting your example in and hitting return results in this output:

{
        "@timestamp" => 2021-05-13T01:57:40.736Z,
       "RelativeUrl" => "/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx",
    "OrganizationId" => "xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx",
           "UserKey" => "[email protected]",
          "DataType" => "MtpBatch",
           "message" => "{\"CreationTime\":\"2021-05-11T06:42:44\",\"Id\":\"xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx\",\"Operation\":\"SearchMtpBatch\",\"OrganizationId\":\"xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx\",\"RecordType\":52,\"UserKey\":\"[email protected]\",\"UserType\":5,\"Version\":1,\"Workload\":\"SecurityComplianceCenter\",\"UserId\":\"[email protected]\",\"AadAppId\":\"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx\",\"DataType\":\"MtpBatch\",\"DatabaseType\":\"DataInsights\",\"RelativeUrl\":\"/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx\",\"ResultCount\":\"1\"}",
          "UserType" => 5,
            "UserId" => "[email protected]",
              "type" => "stdin",
              "host" => "de2c988c09c7",
          "@version" => "1",
         "Operation" => "SearchMtpBatch",
          "AadAppId" => "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx",
       "ResultCount" => "1",
      "DatabaseType" => "DataInsights",
           "Version" => 1,
        "RecordType" => 52,
      "CreationTime" => "2021-05-11T06:42:44",
                "Id" => "xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",
          "Workload" => "SecurityComplianceCenter"
}
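
As a follow-up to the note above about blindly stripping backslashes, here is a minimal sketch of a slightly narrower variant that only rewrites a backslash immediately followed by a double quote, leaving any other backslashes alone. It assumes config.support_escapes is at its default (false), so the single-quoted pattern reaches the regex engine unchanged. I haven't run it against your real data, and it will still go wrong on doubly escaped content (for example a quote nested inside a string value), so treat it as a starting point only:

```
filter {
  mutate {
    # Pattern \\" is a regex for a literal backslash followed by a double quote.
    # Each match is replaced with a bare double quote; other backslashes are kept.
    gsub => ["message", '\\"', '"']
  }
  json {
    # Parse the cleaned-up text as JSON; keys land at the event root.
    source => "message"
  }
}
```

It can be tried the same way as the config above, by passing it to -e in the docker one-liner.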

Val

You can achieve this easily with the json filter:

filter {
  json {
    source => "message"
  }
}
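
Since the question mentions that grok already places the value in a details field, a variant of the same filter pointed at that field might look like the sketch below. The target name details_parsed is a hypothetical choice for illustration, and the tag shown is simply the json filter's default failure tag written out explicitly. If events come through carrying that tag, the backslashes are probably really present in the data rather than just being how the output is displayed:

```
filter {
  json {
    # Field that holds the JSON text (the question's grok pattern fills "details").
    source => "details"
    # Hypothetical field name; omit target to write the parsed keys to the event root.
    target => "details_parsed"
    # Default failure tag, stated explicitly so unparseable events are easy to spot.
    tag_on_failure => ["_jsonparsefailure"]
  }
}
```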
