Reputation: 191
I'd like to aggregate across several properties using a Function. For example, my Function takes a start date and an end date as input, and I have a Schedule Object Type with "date", "shift_type", "department", and "hours worked" properties.
I'd like my output to be the sum of hours worked for each date/shift type/department combo.
Upvotes: 4
Views: 1230
Reputation: 37137
I don't think you can do this natively in Functions; you would have to load the objects into your Function and code the aggregation logic manually. However, you could create a column at the dataset level, index it into the Ontology, and query that.
In your pipeline (PySpark example):
from pyspark.sql import functions as F
df = df.withColumn("shift_id", F.concat_ws("-", "date", "shift_type", "department"))
Then, in your Function, you can aggregate on the shift ID:
Objects.search()
    .employees() // use the API name of your own object type here
    .groupBy(e => e.shiftId.topValues())
    .sum(e => e.hoursWorked) // sum hours worked within each shift ID bucket
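To get back to one row per date/shift type/department, the per-key sums still need to be split apart. The following is a minimal sketch of that step in plain TypeScript, not Foundry API code: `KeyedSum`, `ShiftSum`, and `splitShiftKeys` are hypothetical names, and the `|` separator is an assumption. With the `-` separator above, an ISO date such as 2022-01-01 would itself be split into three pieces, so a separator that cannot occur in any of the values keeps the split unambiguous.

```typescript
// Hypothetical shape of one aggregated bucket: composite key plus summed hours.
interface KeyedSum {
    key: string;   // e.g. "2022-01-01|night|emergency_room"
    value: number; // sum of hours worked for that key
}

// One output row per date / shift type / department combination.
interface ShiftSum {
    date: string;
    shiftType: string;
    department: string;
    hours: number;
}

// Assumed separator; it must never occur inside any of the three values.
const SEPARATOR = "|";

function splitShiftKeys(buckets: KeyedSum[]): ShiftSum[] {
    return buckets.map(({ key, value }) => {
        const parts = key.split(SEPARATOR);
        if (parts.length !== 3) {
            // Fail loudly instead of silently mislabeling a bucket.
            throw new Error(`Unexpected composite key: ${key}`);
        }
        const [date, shiftType, department] = parts;
        return { date, shiftType, department, hours: value };
    });
}

// Example data is made up for illustration.
const example = splitShiftKeys([
    { key: "2022-01-01|night|emergency_room", value: 36 },
    { key: "2022-01-01|day|cardiology", value: 24 },
]);
```

If the separator in the pipeline is changed to match, the same constant should be used on both sides so the key format stays in sync.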
Upvotes: 0
Reputation: 63
In the current Functions aggregations API, you can only create 2D and 3D aggregations directly from an ObjectSet via the groupBy and segmentBy functions.
If you want to aggregate on more than two properties (which would be a 4+D aggregation), you have two options:
Convert the ObjectSet to a list of Objects (via calling .allAsync()), and then write TypeScript logic to convert that list into a data structure that aggregates over the object properties. Note that this may not perform well if you have a large number (thousands or more) of Objects in your Object Set.
Add a column to the Object (and the backing dataset) which is a composite key of the columns you want to group on. In your example, this could look like date.2022-01-01.shift.1200.department.emergency_room. Then, in your Functions code you could do a groupBy on this composite key. Next, you could convert this 2D aggregation into a multi-dimensional aggregation where you split the composite key into its individual parts.
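The first option above (aggregating in memory) could look roughly like the sketch below. It assumes the objects have already been loaded and copied into plain records; `ScheduleRow`, `HoursByCombo`, and `sumHoursByCombo` are hypothetical names for illustration, and the loading step itself is left out because it depends on your Ontology.

```typescript
// Hypothetical plain shape of a Schedule object after loading it into memory.
interface ScheduleRow {
    date: string;
    shiftType: string;
    department: string;
    hoursWorked: number;
}

// Nested result: date -> shift type -> department -> total hours.
type HoursByCombo = Record<string, Record<string, Record<string, number>>>;

function sumHoursByCombo(rows: ScheduleRow[]): HoursByCombo {
    const result: HoursByCombo = {};
    for (const row of rows) {
        // Create the nested levels on first use, then accumulate the hours.
        const byShift = result[row.date] ?? (result[row.date] = {});
        const byDept = byShift[row.shiftType] ?? (byShift[row.shiftType] = {});
        byDept[row.department] = (byDept[row.department] ?? 0) + row.hoursWorked;
    }
    return result;
}

// Example data is made up for illustration.
const totals = sumHoursByCombo([
    { date: "2022-01-01", shiftType: "night", department: "emergency_room", hoursWorked: 8 },
    { date: "2022-01-01", shiftType: "night", department: "emergency_room", hoursWorked: 12 },
    { date: "2022-01-01", shiftType: "day", department: "cardiology", hoursWorked: 6 },
]);
```

This is a single pass over the list, so the cost is dominated by loading the objects rather than by the grouping, which is why the performance caveat above applies mostly to large Object Sets.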
Depending on where you want to use this aggregated data, there may be some additional steps required. Here are some examples:
If you have a Slate or custom application which calls the Function directly and handles the response on the frontend, then you could just return the aggregation as long as it conforms to the allowed Functions return types.
If you want to display this data in a table in Workshop (effectively as a Function-backed pivot table), then you will want to use an Object Table with Function-backed columns. You will need an Object which is at the desired level of granularity (where the primary key is the composite key from above, for example). This could be a very simple Object where the only property is this key (and maybe the components of the key, if that is useful for filtering purposes).
Upvotes: 2