Reputation: 3270
I'm trying to use Spark SQL to recursively query a hierarchical dataset and identify the root parent of all the nested children.
I've tried using a self-join, but it only works for one level.
Any ideas or pointers?
Thanks
Upvotes: 4
Views: 12502
Reputation: 79
You can use a GraphX-based solution to perform a recursive query (a parent/child or hierarchical query). Many databases provide this functionality through recursive common table expressions (CTEs) or the CONNECT BY SQL clause.
See this article for more information: https://www.qubole.com/blog/processing-hierarchical-data-using-spark-graphx-pregel-api/
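To show the idea behind a recursive hierarchy query, here is a minimal plain-Python sketch, not Spark code and not the method from the linked article. It repeats the "look up the parent of my current ancestor" step until nothing changes, which is what a single self-join does once and what a recursive CTE or a Pregel-style iteration does repeatedly. The function name `resolve_roots`, the `(child, parent)` edge layout, and the `max_rounds` guard are all assumptions made for illustration.

```python
def resolve_roots(edges, max_rounds=100):
    """Map every node to its root ancestor.

    edges: iterable of (child, parent) pairs; parent is None for a root.
    max_rounds: hypothetical safety limit so a cyclic hierarchy cannot loop forever.
    """
    parent_of = {child: parent for child, parent in edges}

    # Round 0: every node starts out pointing at itself.
    current = {child: child for child in parent_of}

    for _ in range(max_rounds):
        # One round is the equivalent of one self-join:
        # replace each node's current ancestor with that ancestor's parent.
        nxt = {}
        for node, ancestor in current.items():
            parent = parent_of.get(ancestor)
            # Keep the ancestor when it has no parent (it is a root).
            nxt[node] = ancestor if parent is None else parent

        # Fixed point: no node moved up, so every node points at its root.
        if nxt == current:
            return current
        current = nxt

    raise ValueError("no fixed point reached; the hierarchy may contain a cycle")


if __name__ == "__main__":
    sample = [(1, None), (2, 1), (3, 2), (4, 3), (5, None), (6, 5)]
    print(resolve_roots(sample))
```

In Spark, each round would roughly correspond to one join of the intermediate result against the edge table, stopping when a round changes no rows. The number of rounds grows with the depth of the hierarchy, which is why a single self-join only resolves one level.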
Upvotes: 4