pcbzmani

Reputation: 27

Recursive search in Spark DataFrame

I have an employee table that contains an employee ID and a supervisor ID. I want to find each employee's hierarchy up to five levels.

Example: Employee 1 reports to 2, 2 reports to 4, 4 reports to 17, and 17 reports to 20. We are not able to find a supervisor for 20, so we set the supervisor of 20 to 20 itself.

EmployeeID SupervisiorID
1 2
2 4
8 6
9 5
6 3
5 10
4 17
3 15
10 20
15 20
17 20
16 21
15 13
14 12
13 11

Expected output

EmployeeID SupervisiorID_1 SupervisiorID_2 SupervisiorID_3 SupervisiorID_4 SupervisiorID_5
1 2 4 17 20 20
2 4 17 20 20 20
8 6 3 15 20 20
9 5 10 20 20 20
6 3 15 20 20 20
5 10 20 20 20 20
4 17 20 20 20 20
3 15 20 20 20 20
10 20 20 20 20 20
15 20 20 20 20 20
17 20 20 20 20 20
16 21 21 21 21 21
15 13 11 11 11 11
14 12 12 12 12 12
13 11 11 11 11 11

How can we achieve this in Spark using a DataFrame recursively?

Upvotes: 1

Views: 1464

Answers (2)

Ged

Reputation: 18013

Although this has been asked many times, someone has solved it here: https://dwgeek.com/spark-sql-recursive-dataframe-pyspark-and-scala.html/

Upvotes: 0

Young

Reputation: 584

If you only have 5 levels, then it is better to use 4 joins to do the job. In my view, Spark does not natively support recursive solutions for this scenario. If you really want to do it recursively, you may need to collect the data you need and process it locally on the driver.
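
To illustrate the driver-side option, here is a minimal sketch in plain Python, not a definitive implementation. It walks the chain iteratively rather than recursively, and the function name `build_hierarchy` is hypothetical. It assumes the (EmployeeID, SupervisiorID) pairs are small enough to hold in memory on the driver. Because the sample data lists employee 15 twice, it also assumes that the first row seen for an employee is the one used when looking up higher levels.

```python
from typing import Dict, Iterable, List, Tuple


def build_hierarchy(pairs: Iterable[Tuple[int, int]],
                    levels: int = 5) -> List[Tuple[int, ...]]:
    """Return one (employee, sup_1, ..., sup_levels) tuple per input row."""
    rows = list(pairs)

    # Lookup table employee -> supervisor.
    # Assumption: if an employee appears more than once, the first row wins.
    supervisor_of: Dict[int, int] = {}
    for emp, sup in rows:
        supervisor_of.setdefault(emp, sup)

    result = []
    for emp, sup in rows:
        chain = [sup]  # level 1 comes from the row itself
        while len(chain) < levels:
            last = chain[-1]
            # Unknown supervisor: repeat the last known one, as in the question.
            chain.append(supervisor_of.get(last, last))
        result.append((emp, *chain))
    return result


# Sample rows from the question: (EmployeeID, SupervisiorID).
pairs = [(1, 2), (2, 4), (8, 6), (9, 5), (6, 3), (5, 10), (4, 17), (3, 15),
         (10, 20), (15, 20), (17, 20), (16, 21), (15, 13), (14, 12), (13, 11)]

for row in build_hierarchy(pairs):
    print(*row)
```

In Spark, the pairs could be obtained with something like `[(r["EmployeeID"], r["SupervisiorID"]) for r in df.collect()]`, assuming `df` is the employee DataFrame, and the result could be turned back into a DataFrame with `spark.createDataFrame`. This only makes sense when the table fits comfortably on the driver. For larger data, the fixed number of joins suggested above is the safer route.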

Upvotes: 1
