Reputation: 11
When I use Kusto Explorer to query a external table pointing to partitioned log files in Azure blob storage, I always get 0 results even though it seems to be set up correctly.
I have an Azure App Gateway's WAF logs being written to blob storage and I'd like to query the data. We have been using Log Analytics to do this, so the move to blob storage is part of an effort to see if we can store the data with a retention policy since we don't need to keep it for very long, but have it in a way that we can still query if we need to, even if it's not super convenient, since we're not looking at it all the time.
It seemed that the easiest way to do this was to set up an external table in Azure Data Explorer that pointed to the blobs, but I didn't want to set up a cloud cluster, so I followed the instructions for setting up the Kusto emulator and fired up the emulator image locally (Windows 11) using Docker Engine and installed Kusto Explorer, which I'm using to create the external table in the local cluster DB.
https://learn.microsoft.com/en-us/azure/data-explorer/kusto-emulator-install
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/tools/kusto-explorer
When I point the external table to a single file in blob storage, it works fine and it can read the rows out of the log JSON, but since the WAF logs are partitioned by year, month, day, and hour, it's not very useful since I can only see 1 hour's logs at a time this way. I tried to follow the documentation to set the table up with partition parameters and the path format properties, and when I create the table, I get the correct sample URIs and files preview, using validateNotEmpty=true, and it all looks good. Even .show external table T artifacts
shows me the expected files in the container. But querying on the table always gives 0 results.
Here's my table definition:
.create external table ExternalTable (timestamp:string, resourceId:string, operationName:string, category: string, properties: string)
kind=storage
partition by (Year: string, Month: string, Day: string, Hour: string)
pathformat=("resourceid=/SUBSCRIPTIONS/<guid>/RESOURCEGROUPS/RG/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/App_GW/y=" Year "/m=" Month "/d=" Day "/h=" Hour "/m=00")
dataformat=json
(
'https://xxxx.blob.core.windows.net/insights-logs-applicationgatewayfirewalllog;xxxxaccesskeyxxxx'
) with (sampleUris=true, filesPreview=true, validateNotEmpty=true)
Files Preview output:
Uri | Size | Partition |
---|---|---|
https://xxxx.blob.core.windows.net/insights-logs-applicationgatewayfirewalllog/resourceId=/SUBSCRIPTIONS/\<guid>/RESOURCEGROUPS/RG/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/App_GW/y=2024/m=04/d=05/h=14/m=00/PT1H.json | 7293014 | { "Year": "2024", "Month": "04", "Day": "05", "Hour": "14" } |
Query (returns 0 records):
external_table("ExternalTable")
Am I missing something?
Upvotes: 0
Views: 176