PorgtheEaten
PorgtheEaten

Reputation: 194

Join when a date is within a date range ( Kusto / KQL / Azure Data Explorer )

In KQL, how can you add criteria for a join? For example, the query below shows a join. I only want to join rows when the 'code' column is equal and when 'date' is between StartDate and EndDate. I know this is possible in SQL but have not seen a working example in KQL.

Please keep in mind that the example below is not the actual dataset. I know I can achieve the desired result by filtering post join; however, this is not possible with the actual dataset due to the size of table B. When I try to filter post join, I run out of memory (runaway query).

let A = datatable(Code:string, StartDate:string, EndDate:string, Rate:string)[
'EUR', '2020-01-01', '2020-01-15', '1.5',
'EUR', '2020-01-15', '2020-01-31', '1.7',
'YEN', '2020-01-01', '2020-01-15', '2.1',
'YEN', '2020-01-15', '2020-01-31', '1.9'];
let B = datatable(Key:string, Code:string, Date:string)[
'1', 'EUR', '2020-01-02', 
'2', 'EUR', '2020-01-16', 
'3', 'EUR', '2020-01-13', 
'4', 'EUR', '2020-01-27', 
'5', 'YEN', '2020-01-03', 
'6', 'YEN', '2020-01-03', ];
A | join kind=inner B on Code 
| project-away *1, StartDate, EndDate
| summarize by Key, Date, Code, Rate

Current Result

Key Date        Code    Rate
1   2020-01-02  EUR     1.7
1   2020-01-02  EUR     1.5
2   2020-01-16  EUR     1.7
2   2020-01-16  EUR     1.5
3   2020-01-13  EUR     1.7
3   2020-01-13  EUR     1.5
4   2020-01-27  EUR     1.7
4   2020-01-27  EUR     1.5
5   2020-01-03  YEN     1.9
5   2020-01-03  YEN     2.1
6   2020-01-03  YEN     1.9
6   2020-01-03  YEN     2.1

Expected Result

Key Date        Code    Rate
1   2020-01-02  EUR     1.5
2   2020-01-16  EUR     1.7
3   2020-01-13  EUR     1.5
4   2020-01-27  EUR     1.7
5   2020-01-03  YEN     2.1
6   2020-01-03  YEN     2.1

Upvotes: 5

Views: 3546

Answers (1)

Avnera
Avnera

Reputation: 7608

You can add a filter after the join and use the Time window join technique

let lookupWindow = 1d;
let A = datatable(Code:string, StartDate:datetime , EndDate:datetime , Rate:string)[
'EUR', '2020-01-01', '2020-01-15', '1.5',
'EUR', '2020-01-15', '2020-01-31', '1.7',
'YEN', '2020-01-01', '2020-01-15', '2.1',
'YEN', '2020-01-15', '2020-01-31', '1.9'];
let B = datatable(Key:string, Code:string, Date:datetime )[
'1', 'EUR', '2020-01-02', 
'2', 'EUR', '2020-01-16', 
'3', 'EUR', '2020-01-13', 
'4', 'EUR', '2020-01-27', 
'5', 'YEN', '2020-01-03', 
'6', 'YEN', '2020-01-03', ];
A 
| extend TimeKey = range(bin(StartDate, lookupWindow), bin(EndDate,lookupWindow), lookupWindow)
| mv-expand TimeKey to typeof(datetime)
| join kind=inner (B 
    | extend TimeKey = bin(Date, 1d)
) on Code,TimeKey
| where Date between (StartDate .. EndDate)
| project-away  StartDate, EndDate, TimeKey*, Code1
Code Rate Key Date
EUR 1.5 1 2020-01-02 00:00:00.0000000
EUR 1.5 3 2020-01-13 00:00:00.0000000
EUR 1.7 2 2020-01-16 00:00:00.0000000
EUR 1.7 4 2020-01-27 00:00:00.0000000
YEN 2.1 5 2020-01-03 00:00:00.0000000
YEN 2.1 6 2020-01-03 00:00:00.0000000

Upvotes: 3

Related Questions