Will Ayd

Reputation: 7164

How to return a StructArray from Multiple Scalar Functions

I have a scenario where I am working with temporal data in Apache Arrow and am using compute functions to extract date/time components like so:

auto year = arrow::compute::CallFunction("year", {array});
auto month = arrow::compute::CallFunction("month", {array});
auto day = arrow::compute::CallFunction("day", {array});
...

While this works, I have to manage three separate Datums. Ideally I would like one function that returns a StructArray containing year/month/day fields, and that can also scale out to more detailed time components. Is there a simple way of registering such a function with the current API?

Upvotes: 1

Views: 157

Answers (1)

0x26res

Reputation: 13902

Is there a simple way of registering such a function with the current API?

I don't think so; your use case looks too specific. On the other hand, if you do this often, you can write a small helper that calls each function and assembles the results into a StructArray for you:


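#include <arrow/api.h>
#include <arrow/compute/api.h>

#include <memory>
#include <string>
#include <vector>

// Calls each named compute function on the same arguments and combines the
// results into a StructArray whose field names are the function names.
// Note: ValueOrDie() aborts the process if any call fails.
std::shared_ptr<arrow::Array> CallFunctions(std::vector<std::string> const& functions,
                                            std::vector<arrow::Datum> const& args) {
  std::vector<std::shared_ptr<arrow::Array>> results;
  for (std::string const& function : functions) {
    results.push_back(arrow::compute::CallFunction(function, args).ValueOrDie().make_array());
  }
  return arrow::StructArray::Make(results, functions).ValueOrDie();
}

void test() {
  auto array = ....  // the temporal array from the question
  auto structArray = CallFunctions({"year", "month", "day"}, {array});
}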

Upvotes: 1
