Reputation:
I have some log data structured like:
{
timestamp: mm/dd/yyy hh:mm:ss
customDimensions : {
location: "locationA",
subLocation: "subLocationB",
dataName: "dataNameB",
data: "value"
}
}
I need to return the latest record for each "dataName" by location/subLocation.
e.g.
locationA sublocationB dataName value
locationA sublocationC dataName value
locationA sublocationB dataname2 value
Basically perform a "group by" on fields: location,sublocation, and dataName where max(timestamp)
How can I achieve this with Kusto as there is no group by support and the summarize function seems to require an aggregate function?
Upvotes: 2
Views: 9043
Reputation: 51
One possible solution is:
<name of the table>
| summarize arg_max(customDimensions.dataName) by location, subLocation
you are supposed to have extracted a column that contains the "customDimensions" data and that is dynamic.
P.S. Since the column is dynamic, before you can run arg_max() you must cast the reference data type for that field.
Upvotes: 0
Reputation: 25895
You could try extending the dynamic properties into calculated columns, then summarizing using arg_max()
to get the latest (according to timestamp
) by location
, subLocation
and dataName
.
For example:
datatable(d:dynamic)[
dynamic({
"timestamp": "2020-05-12 20:32:08",
"customDimensions" : {
"location": "locationA",
"subLocation": "subLocationB",
"dataName": "dataNameB",
"data": "value"
}
}), dynamic({
"timestamp": "2020-05-12 20:35:08",
"customDimensions" : {
"location": "locationA",
"subLocation": "subLocationB",
"dataName": "dataNameB",
"data": "value"
}
}), dynamic({
"timestamp": "2020-05-12 20:25:08",
"customDimensions" : {
"location": "locationB",
"subLocation": "subLocationB",
"dataName": "dataNameC",
"data": "value"
}
})
]
| project timestamp = todatetime(d.timestamp),
location = tostring(d.customDimensions.location),
subLocation = tostring(d.customDimensions.subLocation),
dataName = tostring(d.customDimensions.dataName), data = tostring(d.customDimensions.data)
| summarize arg_max(timestamp, *) by location, subLocation, dataName
Upvotes: 2