PKS

Reputation: 431

How to query distinct values in AWS CloudWatch Logs Insights

I need to query Lambda log data using AWS CloudWatch Logs Insights. The query syntax provided by AWS doesn't have a distinct keyword.

It only supports count_distinct(fieldname).

ref. https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html

Example data

# @timestamp @message

1 2020-02-17T13:33:29.049+07:00 [INFO] 2020 Partition key: ABC12345_A_

2 2020-02-17T11:32:29.049+07:00 [INFO] 2020 Partition key: ABC12345_B_

3 2020-02-17T11:31:29.049+07:00 [INFO] 2020 Partition key: ABC12345_B_

4 2020-02-17T11:30:29.049+07:00 [INFO] 2020 Partition key: ABC12345_C_

5 2020-02-17T11:29:29.049+07:00 [INFO] 2020 Partition key: ABC12345_A_

Expected result

1 2020-02-17T13:33:29.049+07:00 [INFO] 2020 Partition key: ABC12345_A_

2 2020-02-17T11:32:29.049+07:00 [INFO] 2020 Partition key: ABC12345_B_

4 2020-02-17T11:30:29.049+07:00 [INFO] 2020 Partition key: ABC12345_C_

In normal SQL syntax, it would look like this:

select distinct(uuid) as uuid, max(time) as time from table_name group by uuid order by time desc

Upvotes: 43

Views: 68097

Answers (3)

Tom

Reputation: 2611

You can now use the dedup keyword to remove duplicates.

fields @timestamp, @message, order_id
| sort @timestamp desc 
| dedup order_id

From what I can tell, it needs to be the last operator in the query.
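To make the sort-then-dedup behaviour concrete, here is a small local sketch in Python. It does not call AWS at all; it only emulates the idea of "sort newest first, then keep the first row per key" on the sample rows from the question. The function name `sort_desc_then_dedup` is made up for illustration, and the partition key is assumed to be already extracted into its own field.

```python
from datetime import datetime

# Sample rows taken from the question: (timestamp, partition key).
rows = [
    ("2020-02-17T13:33:29.049+07:00", "ABC12345_A_"),
    ("2020-02-17T11:32:29.049+07:00", "ABC12345_B_"),
    ("2020-02-17T11:31:29.049+07:00", "ABC12345_B_"),
    ("2020-02-17T11:30:29.049+07:00", "ABC12345_C_"),
    ("2020-02-17T11:29:29.049+07:00", "ABC12345_A_"),
]

def sort_desc_then_dedup(records):
    """Hypothetical helper: sort newest first, keep the first row per key."""
    ordered = sorted(
        records,
        key=lambda r: datetime.fromisoformat(r[0]),
        reverse=True,
    )
    seen = set()
    kept = []
    for timestamp, key in ordered:
        if key not in seen:
            seen.add(key)
            kept.append((timestamp, key))
    return kept

deduped = sort_desc_then_dedup(rows)
for timestamp, key in deduped:
    print(timestamp, key)
```

On this data the kept rows are 1, 2 and 4, which matches the expected result in the question.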

Upvotes: 22

sunvenka

Reputation: 488

You can use

| stats count(*) by fieldname

This lists each distinct value of fieldname, together with the number of log events that have that value.
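As a rough local analogy (plain Python, no AWS calls), grouping with a count behaves like a `Counter`: the keys are the distinct values and the counts come along as a side effect. The key list below is copied from the question's sample data and is assumed to be already extracted from @message.

```python
from collections import Counter

# Partition keys from the question's five sample rows, in row order.
keys = [
    "ABC12345_A_",
    "ABC12345_B_",
    "ABC12345_B_",
    "ABC12345_C_",
    "ABC12345_A_",
]

# Counting per key is the local equivalent of grouping by a field.
counts = Counter(keys)

# The distinct values are simply the keys of the counter.
distinct_values = list(counts)

for value in distinct_values:
    print(value, counts[value])
```

If you only need the distinct values, you can ignore the count column in the output.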

Upvotes: 36

easywaru

Reputation: 1153

You can use non-aggregation functions in the stats command, like this:

stats latest(@timestamp) as @latestTimestamp by @message
| display @latestTimestamp, @message
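For intuition, the grouping above works like the following local Python sketch (no AWS calls): one entry per distinct message, holding the newest timestamp seen for it. The event list is built from the question's sample data, and `latest_by_message` is just an illustrative name. Note that grouping by the whole message only collapses rows whose message text is exactly identical.

```python
# (timestamp, message) pairs based on the question's sample data.
events = [
    ("2020-02-17T13:33:29.049+07:00", "[INFO] 2020 Partition key: ABC12345_A_"),
    ("2020-02-17T11:32:29.049+07:00", "[INFO] 2020 Partition key: ABC12345_B_"),
    ("2020-02-17T11:31:29.049+07:00", "[INFO] 2020 Partition key: ABC12345_B_"),
    ("2020-02-17T11:30:29.049+07:00", "[INFO] 2020 Partition key: ABC12345_C_"),
    ("2020-02-17T11:29:29.049+07:00", "[INFO] 2020 Partition key: ABC12345_A_"),
]

latest_by_message = {}
for timestamp, message in events:
    # All timestamps here share the same format and UTC offset,
    # so plain string comparison orders them chronologically.
    current = latest_by_message.get(message)
    if current is None or timestamp > current:
        latest_by_message[message] = timestamp

for message, timestamp in latest_by_message.items():
    print(timestamp, message)
```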

Upvotes: 19
