Redevil

Reputation: 21

How to validate and exclude data at ingestion time for Azure Data Explorer

I am sending data from a web client to an event hub and then ingesting it into Azure Data Explorer. Each event generated by the web client has a timestamp field, and when the event hub receives the event, it adds an EventEnqueuedUtcTime field. Both are UTC timestamps.

Is there a way to compare the two timestamps at Data Explorer ingestion and exclude the data if the time difference is more than a certain value?

For example, if EventEnqueuedUtcTime - timestamp > x minutes, then we don't ingest this event into Data Explorer.

Upvotes: 1

Views: 570

Answers (3)

Here is an example of how ingestion time can be stamped on records with an update policy.

Say you have an events table with a source field, and an orders table with source and LastModifiedDate fields. You want to copy source from the events table into the orders table, and you want LastModifiedDate in the orders table to be set to the ingestion time.

Steps:

  1. Create the events table: .create table events (source: string)

  2. Create the orders table: .create table orders (source: string, LastModifiedDate: datetime)

  3. Create a function that extends the events rows with LastModifiedDate: .create function eventRecordsExpand() { events | extend LastModifiedDate = ingestion_time() }

  4. Set the update policy on the orders table: .alter table orders policy update @'[{"Source": "events", "Query": "eventRecordsExpand()", "IsEnabled": true}]'

Each record ingested into the events table is then copied into the orders table, with LastModifiedDate set to the ingestion time.

Upvotes: 0

Yoni L.

Reputation: 25895

you can implement that kind of filtering logic as part of an update policy: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/updatepolicy

  1. ingest your raw data into SourceTable.
  2. create TargetTable, which has the exact same schema as SourceTable.
  3. set the update policy on TargetTable to have SourceTable as its source table, and define the filtering logic as part of the Query property of the policy (you can use a stored function here too)
  4. configure SourceTable to have a "zero" (00:00:00) soft delete period as part of its retention policy, so that the raw data is never made queryable and isn't retained.
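
The four steps above could be sketched roughly as follows. This is only an illustration under assumptions: the payload column, the function name FilterLateEvents, and the 5-minute threshold are made up for the example, and only the timestamp and EventEnqueuedUtcTime columns come from the question. Each management command is run on its own.

```kusto
// Steps 1 and 2: source and target tables with identical schemas.
// The payload column is hypothetical; use the real event schema.
.create table SourceTable (timestamp: datetime, EventEnqueuedUtcTime: datetime, payload: dynamic)

.create table TargetTable (timestamp: datetime, EventEnqueuedUtcTime: datetime, payload: dynamic)

// Step 3a: a stored function holding the filtering logic.
// FilterLateEvents is a hypothetical name; 5m is an example threshold.
.create-or-alter function FilterLateEvents() {
    SourceTable
    | where EventEnqueuedUtcTime - timestamp <= 5m
}

// Step 3b: the update policy on TargetTable runs the function
// over each batch of records newly ingested into SourceTable.
.alter table TargetTable policy update
@'[{"IsEnabled": true, "Source": "SourceTable", "Query": "FilterLateEvents()", "IsTransactional": true}]'

// Step 4: zero soft-delete period so raw data is not retained in SourceTable.
.alter-merge table SourceTable policy retention softdelete = 0s
```

With "IsTransactional" set to true, a failure in the policy query also fails the ingestion into SourceTable, which matters here because the raw data is not kept around for a retry.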

Upvotes: 2

Avnera

Reputation: 7608

Yes, you can do it with an update policy. The update policy query can have a condition like this:

T
| where EventEnqueuedUtcTime - timestamp < 5m  // 5m is an example value for the "x minutes" threshold
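
Before putting a condition like this into a policy, it can be tried against a couple of inline rows. The sample values and the 5-minute threshold below are invented for illustration:

```kusto
// Two made-up events: one enqueued 1 minute after its client timestamp,
// one enqueued 30 minutes after.
datatable(timestamp: datetime, EventEnqueuedUtcTime: datetime)
[
    datetime(2023-01-01 10:00:00), datetime(2023-01-01 10:01:00),
    datetime(2023-01-01 10:00:00), datetime(2023-01-01 10:30:00),
]
// Subtracting two datetime values yields a timespan, compared here to 5m.
| where EventEnqueuedUtcTime - timestamp < 5m
```

Only the first row passes the filter, since its difference is 1 minute and the second row's is 30 minutes.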

Upvotes: 3
