Reputation: 431
I need to query Lambda log data using AWS CloudWatch Logs Insights. The query syntax provided by AWS doesn't have a distinct keyword.
It only supports count_distinct(fieldname).
ref. https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html
Example data
# @timestamp @message
1 2020-02-17T13:33:29.049+07:00 [INFO] 2020 Partition key: ABC12345_A_
2 2020-02-17T11:32:29.049+07:00 [INFO] 2020 Partition key: ABC12345_B_
3 2020-02-17T11:31:29.049+07:00 [INFO] 2020 Partition key: ABC12345_B_
4 2020-02-17T11:30:29.049+07:00 [INFO] 2020 Partition key: ABC12345_C_
5 2020-02-17T11:29:29.049+07:00 [INFO] 2020 Partition key: ABC12345_A_
Expected result
1 2020-02-17T13:33:29.049+07:00 [INFO] 2020 Partition key: ABC12345_A_
2 2020-02-17T11:32:29.049+07:00 [INFO] 2020 Partition key: ABC12345_B_
4 2020-02-17T11:30:29.049+07:00 [INFO] 2020 Partition key: ABC12345_C_
In ordinary SQL, the query would look like this:
select distinct(uuid) as uuid, max(time) as time from table_name group by uuid order by time desc
Upvotes: 43
Views: 68097
Reputation: 2611
You can now use the dedup command to remove duplicates.
fields @timestamp, @message, order_id
| sort @timestamp desc
| dedup order_id
From what I can tell, it needs to be the last operator in the query.
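To make the effect of sort followed by dedup concrete, here is a small Python sketch that imitates it on the sample rows from the question. It is only a simulation of the behaviour as I understand it (after sorting, the first row for each key is kept), not how CloudWatch implements it. The function name sort_desc_then_dedup and the (timestamp, key) tuples are made up for illustration. Note that in the question the key is embedded in @message, so in Logs Insights it would first have to be extracted into its own field, for example with the parse command.

```python
# Sample rows from the question as (timestamp, partition_key).
# All timestamps share the same UTC offset, so plain string comparison
# orders them chronologically.
rows = [
    ("2020-02-17T13:33:29.049+07:00", "ABC12345_A_"),
    ("2020-02-17T11:32:29.049+07:00", "ABC12345_B_"),
    ("2020-02-17T11:31:29.049+07:00", "ABC12345_B_"),
    ("2020-02-17T11:30:29.049+07:00", "ABC12345_C_"),
    ("2020-02-17T11:29:29.049+07:00", "ABC12345_A_"),
]


def sort_desc_then_dedup(rows):
    """Sort newest-first, then keep only the first row seen for each key."""
    seen = set()
    kept = []
    for timestamp, key in sorted(rows, key=lambda r: r[0], reverse=True):
        if key not in seen:
            seen.add(key)
            kept.append((timestamp, key))
    return kept


result = sort_desc_then_dedup(rows)
for timestamp, key in result:
    print(timestamp, key)
```

With the sample data this keeps rows 1, 2 and 4, which is the expected result in the question.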
Upvotes: 22
Reputation: 488
You can use
| stats count(*) by fieldname
This lists the distinct values of fieldname, along with the number of events for each value.
Upvotes: 36
Reputation: 1153
You can use non-aggregation functions in the stats command, as shown below:
stats latest(@timestamp) as @latestTimestamp by @message
| display @latestTimestamp, @message
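As a rough illustration of what grouping with latest(@timestamp) produces, the Python sketch below groups the question's sample rows by key and keeps the newest timestamp per group. This is an assumption-laden simulation, not CloudWatch code: latest_by_key is a hypothetical helper, and the messages are reduced to their partition key for brevity. Grouping by @message works for the question's data only because duplicate events carry an identical message.

```python
# Sample rows from the question as (timestamp, partition_key).
# All timestamps share the same UTC offset, so string comparison is enough.
rows = [
    ("2020-02-17T13:33:29.049+07:00", "ABC12345_A_"),
    ("2020-02-17T11:32:29.049+07:00", "ABC12345_B_"),
    ("2020-02-17T11:31:29.049+07:00", "ABC12345_B_"),
    ("2020-02-17T11:30:29.049+07:00", "ABC12345_C_"),
    ("2020-02-17T11:29:29.049+07:00", "ABC12345_A_"),
]


def latest_by_key(rows):
    """Return a dict mapping each key to the newest timestamp seen for it."""
    latest = {}
    for timestamp, key in rows:
        if key not in latest or timestamp > latest[key]:
            latest[key] = timestamp
    return latest


latest = latest_by_key(rows)
for key, timestamp in latest.items():
    print(timestamp, key)
```

Unlike the dedup approach, this returns only the grouping field and the aggregated timestamp, not the other fields of the original event.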
Upvotes: 19