Reputation: 486
Hello all,
I am currently trying to ingest data using a batch operation. My query is written as follows:
.set-or-append tableName with (folder = "rocky") <|
let _Scope = () {
    let N = 4;
    range p from 0 to N-1 step 1
    | partition by p
    {
        functionName((list_of_ids()
            | where hash(something, N) == toscalar(p)), datetime(2020-05-03))
        | extend batch_num = toscalar(p)
    }
};
union (_Scope())
I want to understand whether this runs in parallel for each partition or sequentially. If it runs in parallel, how can I optimize it further? Any help is much appreciated.
Upvotes: 1
Views: 1429
Reputation: 25995
The partition operator (which you use in your function) allows you to provide hints to control concurrency:
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/partitionoperator
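As a rough sketch of what that could look like with the query from the question: the legacy form of partition (the one with the braces) accepts hint.concurrency (how many partitions to run in parallel) and hint.spread (how many nodes the partitions are distributed over). The values 4 and 2 below are placeholders for illustration, not recommendations, and functionName(), list_of_ids() and something are taken from the question as-is.

```kusto
.set-or-append tableName with (folder = "rocky") <|
let _Scope = () {
    let N = 4;
    range p from 0 to N-1 step 1
    // hint.concurrency: number of partitions to run in parallel (illustrative value)
    // hint.spread: number of nodes to spread the partitions over (illustrative value)
    | partition hint.concurrency=4 hint.spread=2 by p
    {
        functionName((list_of_ids()
            | where hash(something, N) == toscalar(p)), datetime(2020-05-03))
        | extend batch_num = toscalar(p)
    }
};
union (_Scope())
```

If I recall the documentation correctly, the default for hint.concurrency is 16 and for hint.spread is 1, so with only 4 partitions you may see more effect from changing the spread than the concurrency. Check the linked page for the current defaults.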
Regardless, depending on what functionName() does (it's not mentioned in the original question), you could consider using the distributed option:
Setting the distributed flag to true is useful when the amount of data being produced by the query is large (exceeds 1GB of data) and the query doesn't require serialization (so that multiple nodes can produce output in parallel). When the query results are small it's not recommended to use this flag, as it might generate a lot of small data shards needlessly.
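For reference, the flag goes in the with (...) clause of the ingest-from-query command, next to the folder property you already set. A minimal sketch, assuming the result of your query is actually large enough to justify it (per the quote above):

```kusto
// distributed = true lets multiple nodes produce output in parallel;
// only worthwhile when the query result is large and needs no serialization.
.set-or-append tableName with (folder = "rocky", distributed = true) <|
let _Scope = () {
    let N = 4;
    range p from 0 to N-1 step 1
    | partition by p
    {
        functionName((list_of_ids()
            | where hash(something, N) == toscalar(p)), datetime(2020-05-03))
        | extend batch_num = toscalar(p)
    }
};
union (_Scope())
```

Whether this helps depends on what functionName() returns; for small results it may just create many small data shards.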
Upvotes: 1