kauschan

Reputation: 486

Kusto: ingest from a query

Hello all,

I am trying to ingest data using a batch operation. My query is written as follows:

.set-or-append tableName with (folder = "rocky")<|
let _Scope = () {
    let N = 4;
    range p from 0 to N-1 step 1
    | partition by p
    {
        functionName((list_of_ids()
        | where hash(something, N) == toscalar(p)), datetime(2020-05-03))
        | extend
            batch_num = toscalar(p)
    }
};
union (_Scope())

I want to understand whether this runs in parallel for each partition or sequentially. If it runs in parallel, how can I optimize it further? Any help is much appreciated.

Upvotes: 1

Views: 1429

Answers (1)

Yoni L.

Reputation: 25995

The partition operator (which you use in your function) allows you to provide hints to control concurrency:

https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/partitionoperator
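
As a rough sketch of what this could look like for the query in the question: the partition operator accepts hint.concurrency (how many partitions to run in parallel) and hint.spread (how many nodes to distribute them across). The value 4 below is only an illustrative choice matching N; tableName, functionName, list_of_ids and something are taken from the question as-is.

```kusto
.set-or-append tableName with (folder = "rocky") <|
// Hint values are illustrative assumptions, not tuned recommendations.
let N = 4;
range p from 0 to N - 1 step 1
| partition hint.concurrency=4 hint.spread=4 by p
{
    // Each partition handles the ids whose hash bucket equals p.
    functionName((list_of_ids()
    | where hash(something, N) == toscalar(p)), datetime(2020-05-03))
    | extend batch_num = toscalar(p)
}
```

Depending on the service version, the curly-brace subquery form may also need hint.strategy=legacy; the linked page describes which strategies accept which hints.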


Regardless, depending on what functionName() does (it's not mentioned in the original question), you could consider using the distributed option:

https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/data-ingestion/ingest-from-query

Setting the distributed flag to true is useful when the amount of data being produced by the query is large (exceeds 1GB of data) and the query doesn't require serialization (so that multiple nodes can produce output in parallel). When the query results are small it's not recommended to use this flag, as it might generate a lot of small data shards needlessly.
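
For illustration, the flag goes in the with (...) property list of the ingest command. This is only a sketch reusing the query from the question; whether it helps depends on how much data functionName() produces and on whether the query needs serialization.

```kusto
.set-or-append tableName with (folder = "rocky", distributed = true) <|
// distributed = true lets multiple nodes produce output in parallel.
// Per the guidance quoted above, avoid it when the result set is small.
let _Scope = () {
    let N = 4;
    range p from 0 to N - 1 step 1
    | partition by p
    {
        functionName((list_of_ids()
        | where hash(something, N) == toscalar(p)), datetime(2020-05-03))
        | extend batch_num = toscalar(p)
    }
};
union (_Scope())
```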

Upvotes: 1
