Reputation: 96
We are ingesting simple device readings on regular intervals into adx. However, 0.28% of the ingested readings may be later updated with new reading values. This can be rectified with a simple Materialized View and arg_max. However, we also want to optimize summaries of readings per day, month and year. The problem being, I cannot make a materialized view on top of a materialized view. It is also not possible to make a materialized view with two summary clauses.
Example:
.create table Readings (Timestamp:datetime, DeviceName:string, IngestTime:datetime, Reading:decimal)
.ingest inline into table Readings <|
"2022-10-31 23:00:00.0000000", "EX", "2022-11-06 11:02:29.5690000",0.733
"2022-10-31 22:00:00.0000000", "EX", "2022-11-05 11:05:36.4230000",0.763
"2022-10-31 22:00:00.0000000", "EX", "2022-11-01 07:08:55.9580000",0.5
"2022-10-31 22:00:00.0000000", "EX", "2022-11-02 11:04:42.7050000",0.5
"2022-10-31 21:00:00.0000000", "EX", "2022-11-05 11:05:36.4230000",0.856
"2022-10-31 20:00:00.0000000", "EX", "2022-11-05 11:05:36.4230000",0.827
"2022-10-31 20:00:00.0000000", "EX", "2022-11-02 11:04:42.7050000",0
"2022-10-31 20:00:00.0000000", "EX", "2022-11-01 07:08:55.9580000",0
"2022-10-31 19:00:00.0000000", "EX", "2022-11-05 11:05:36.4230000",0.935
.create materialized-view with (backfill=true, DocString="Only latest measures by arg_max",
effectiveDateTime=datetime(2019-01-01),
MaxSourceRecordsForSingleIngest=10000000,
Concurrency=5
) ReadingsLatest on table Readings { Readings
| summarize arg_max(IngestTime, *) by DeviceName, Reading }
gives output
2022-10-31T23:00:00Z "EX" 2022-11-06T11:02:29.569Z 0.733
2022-10-31T22:00:00Z "EX" 2022-11-05T11:05:36.423Z 0.763
2022-10-31T21:00:00Z "EX" 2022-11-05T11:05:36.423Z 0.856
2022-10-31T20:00:00Z "EX" 2022-11-05T11:05:36.423Z 0.827
2022-10-31T19:00:00Z "EX" 2022-11-05T11:05:36.423Z 0.935
The problem is performance when we perform aggregate queries:
ReadingsLatest
| summarize Reading_Day = sum(Reading) by Day = startofday(datetime_utc_to_local(Timestamp, 'Europe/Oslo')), DeviceName
So we would like a Materialized view on that, but we cannot figure out how:
.create materialized-view with (backfill=true, DocString="Only latest measures by arg_max",
effectiveDateTime=datetime(2019-01-01),
MaxSourceRecordsForSingleIngest=10000000,
Concurrency=5
) ReadingsLatestDay on materialized-view ReadingsLatest { ReadingsLatest
| summarize Reading_Day = sum(Reading) by Day = startofday(datetime_utc_to_local(Timestamp, 'Europe/Oslo')), DeviceName }
//fails with:
Cannot create materialized view 'ReadingsLatestDay': Materialized view can only be created on top of another materialized view which includes a single any()/anyif()/take_any()/take_anyif() aggregation.
Trying both at the same time:
.create materialized-view with (backfill=true, DocString="Only latest measures by arg_max",
effectiveDateTime=datetime(2019-01-01),
MaxSourceRecordsForSingleIngest=10000000,
Concurrency=5
) ReadingsLatestDay on table Readings { Readings
| summarize arg_max(IngestTime, *) by DeviceName, Reading
| summarize Reading_Day = sum(Reading) by Day = startofday(datetime_utc_to_local(Timestamp, 'Europe/Oslo')), DeviceName }
//fails with:
Cannot create materialized view 'ReadingsLatestDay': Materialized views query can only include a single summarize operator over the source table.
Another option we have considered is deleting the Readings that have been updated in a later ingest, but this also seems to be difficult. We cannot figure out the syntax.
Upvotes: 3
Views: 286
Reputation: 2744
The errors you are getting are expected - materialized view over materialized view is only supported when the first view is of type take_any(*)
- see docs here. More than a single aggregation in the same materialized view also isn't supported. There is no build in way to pre-compute both of these aggregates in ADX. You can create the first materialized view (arg_max()
) and then orchestrate your own pipeline, using any of the orchestration tools, that periodically queries the materialized view and persists the daily aggregates, using ingest from query commands.
Upvotes: 2