Cyril Duchon-Doris
Cyril Duchon-Doris

Reputation: 13929

Amazon Cloudwatch Logs Insights with JSON fields

I am trying to use Logs Insights with data containing JSON in one of the fields, and to parse the JSON fields

My data looks like the following when I put it in insights with the starter code

fields @timestamp, @message
| sort @timestamp desc
| limit 25

How can I easily extract the path variable in my nested JSON to perform aggregations on it ? By looking at some documentation, I thought @message.path would work but it does not seem so. Has anyone successfully interpreted JSON logs in Insights

enter image description here

EDIT : Sample of what my data looks like

#
@timestamp
@message
1
2018-12-19 23:42:52.000
I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/ID","format":"json","controller":"API::V1::Public::ProfessionalsController","action":"show","status":200,"duration":285.27,"view":222.36,"time":"2018-12-19T23:42:52.344+00:00","params":{"include":"user,tags,promotions,company_sector,similar_professionals.tags,similar_professionals.user","format":"json","compress":false,"id":"ID"},"@timestamp":"2018-12-19T23:42:52.629Z","@version":"1","message":"[200] GET /api/v1/professionals/ID (API::V1::Public::ProfessionalsController#show)"}
@logStream  i-05d1d61ab853517a0
@message  I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/ID","format":"json","controller":"API::V1::Public::ProfessionalsController","action":"show","status":200,"duration":285.27,"view":222.36,"time":"2018-12-19T23:42:52.344+00:00","params":{"include":"xxx","format":"json","compress":false,"id":"ID"},"@timestamp":"2018-12-19T23:42:52.629Z","@version":"1","message":"[200] GET /api/v1/professionals/ID (API::V1::Public::ProfessionalsController#show)"}
@timestamp  1545262972000
2
2018-12-19 23:42:16.000
I, [2018-12-19T23:42:16.723472 #851] INFO -- : [ea712503-eb86-4a6e-ab38-ddbcd6c2b4d0] {"method":"GET","path":"/api/v1/heartbeats/new","format":"json","controller":"API::V1::Public::HeartbeatsController","action":"new","status":201,"duration":9.97,"view":3.2,"time":"2018-12-19T23:42:16.712+00:00","params":{"format":"json","compress":false},"@timestamp":"2018-12-19T23:42:16.722Z","@version":"1","message":"[201] GET /api/v1/heartbeats/new (API::V1::Public::HeartbeatsController#new)"}

Upvotes: 74

Views: 153169

Answers (6)

user1658846
user1658846

Reputation: 11

CloudWatch Logs now supports a jsonParse function allowing you to extract and parse the json string into a map/list, and then access attributes. For example, you can do

parse @message /Request details: (?<json_string>\{.*\})/
| fields jsonparse(json_string) as json_message
| stats count_distinct(json_message.`user.id`)

Refer to: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax-operations-functions.html#CWL_QuerySyntax-structure-types

example queries: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax-examples.html#CWL_QuerySyntax-examples-structure

Upvotes: 0

Neyma
Neyma

Reputation: 313

Also another parse with help of regex

Assume your @message is:

I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/"}

You can extract method by

fields @timestamp, @message
| parse @message /\"method\":\"(?<method_type>.*?)\"/

Upvotes: 9

R0B0T-B0BA
R0B0T-B0BA

Reputation: 411

Would this regex query within the parse command help you?

filter @message like / \"path\":\"/
| parse @message /(?<@endpt>((\/[a-zA-Z0-9_{}()-?]+){1,}))/

Good luck!

Upvotes: 1

pyb
pyb

Reputation: 5269

CloudWatch Insights Logs automatically discovers fields for the following log types:

Lambda logs

CloudWatch Logs Insights automatically discovers log fields in Lambda logs, but only for the first embedded JSON fragment in each log event (Note: emphasis mine). If a Lambda log event contains multiple JSON fragments, you can parse and extract the log fields by using the parse command. For more information, see Fields in JSON Logs.

CloudTrail logs

See fields in JSON logs.

Source: Supported Logs and Discovered Fields

If @message is I, [2018-12-11T13:20:27] INFO -- : {"method":"GET"}

Then you can select and filter the fields like so:

fields @timestamp, @message, method
| filter method = "GET"
| sort @timestamp desc

It works with nested fields too, i.e. params.format = "json" or results.0.firstName = "Paul".

Upvotes: 74

GusGold
GusGold

Reputation: 570

Building on @pyb insights, I was able to use parse @message '"path":"*"' as path to extract the path from any place in the @message.

You can go on to get your method by piping another parse @message '"method":"*"' as method without concern for ordering as it is a second global plain text search on @message

In the case where your @message is:

I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/ID","format":"json","controller":"API::V1::Public::ProfessionalsController","action":"show","status":200,"duration":285.27,"view":222.36,"time":"2018-12-19T23:42:52.344+00:00","params":{"include":"xxx","format":"json","compress":false,"id":"ID"},"@timestamp":"2018-12-19T23:42:52.629Z","@version":"1","message":"[200] GET /api/v1/professionals/ID (API::V1::Public::ProfessionalsController#show)"}

Using:

parse @message '"path":"*"' as path | parse @message '"method":"*"' as method

will result in the fields: path = '/api/v1/professionals/ID' and method = 'GET'

Note that this is still simply string parsing and as such, it has no concept of nested keys like params.format would not find json, however using just format would, so long as there wasn't another format string anywhere in your @message.

Also note that this is for the case where Insights is not discovering your JSON in the message. I belive this is the case that @pyb was referring to in this answer. My logs aren't being discovered either using the following format

info - Request: {"method":"POST","path":"/auth/login/","body":{"login":{"email":"[email protected]","password":"********"}},"uuid":"36d76df2-aec4-4549-8b73-f237e8f14e23","ip":"*.*.*.*"}

Upvotes: 28

pyb
pyb

Reputation: 5269

You can use the parse command to extract fields.

If @message is

I, [2018-12-11T13:20:27] INFO -- : {"method":"GET"}

Then you extract the fields like so:

fields @timestamp, @message
| parse "I, [*T*] INFO -- : {"method":"*"}" as @date, @time, @method
| filter method=GET
| sort @timestamp desc
| limit 20

The documentation is rather light for now. I am able to get results by replacing the wildcard * with a regular expression, but then the parsing fails.

Upvotes: 23

Related Questions