shani klein
shani klein

Reputation: 344

KQL aggregation function product

I'm having the following table:

productName feature probability_for _feature
A w 0.2
A z 0.8
B w 0.2
B z 0.8
B x 0.3

I want to get for each productName the product of probability for it's feature meaning productName A have both feature w with probability 0.2 and feature z with probability 0.8 hence the product for feature A will bee 0.2*0.8= 0.16. Thus , in the table above I will get:

productName features probability_for _feature
A [w,z] 0.16
B [w,z,x] 0.048

Or just :

productName probability
A 0.16
B 0.048

I didn't find any product or multiply function as aggregation function and would like to get some help . Thank you !

Upvotes: 1

Views: 453

Answers (2)

Steffen Zeidler
Steffen Zeidler

Reputation: 291

For aggregation with any function you can use the scan operator.
scan operator example cumulative sum
Example with multiply:

datatable (productName: string, feature: string, probability_for_feature: double)
[
        'A', 'w', 0.2,
        'A', 'z', 0.8,
        'B', 'w', 0.2,
        'B', 'z', 0.8,
        'B', 'x', 0.3,
]
| sort by productName asc
| partition by productName
(
    // for every productName scan all rows
    scan declare (probability: double= 1.0) with 
    (
        // multiply probability for every row and return last result
        step s1 output=last: true => probability = probability_for_feature * s1.probability;
    )
)
| project productName, probability

Upvotes: 1

Yoni L.
Yoni L.

Reputation: 25895

Here's an option, using a cumulative sum and leveraging the fact that log(x1) + log(x2) … + log(xN) == log(x1 * x2 * … * xN):

datatable(productName:string, feature:string, probability_for_feature:double)
[
    'A', 'w', 0.2,
    'A', 'z', 0.8,
    'B', 'w', 0.2,
    'B', 'z', 0.8,
    'B', 'x', 0.3,
]
| order by productName asc 
| extend l = log10(probability_for_feature), rn = row_number()
| extend cumsum = row_cumsum(l, productName != prev(productName))
| summarize arg_max(rn, *), features = make_list(feature) by productName
| project productName, features, product = exp10(cumsum)
productName features product
A [
"w",
"z"
]
0.16
B [
"w",
"z",
"x"
]
0.048

Upvotes: 1

Related Questions