Reputation: 2514
If you have access to the Spark history server, you can view running and completed Spark jobs. On the SQL tab you can see each query in full, including its execution plan(s). It seems that this could give me access to code or business logic that I perhaps should not have access to. I am wondering whether one can either (i) access the entire Spark job (i.e. the code that runs it) or (ii) programmatically reconstruct the Spark job (so that I could run it myself).
I realize that data access is a separate issue.
Upvotes: 0
Views: 81
Reputation: 7028
You cannot reconstruct the exact code that was executed just by looking at the execution plan. The execution plan is created by Spark's Catalyst optimizer, which takes your code and optimizes it. Multiple scripts can lead to the same execution plan. But you could potentially reverse engineer a script that does essentially the same thing.
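To make concrete what can be pulled out programmatically, here is a minimal sketch that reads plan text from a Spark event log, which is the data the history server renders. It assumes the log is uncompressed JSON lines, that SQL executions are recorded under the event name in `SQL_START_EVENT`, and that the plan sits in a `physicalPlanDescription` field; check these against the logs of your Spark version. `extract_plans` and `sample_log` are hypothetical names, and the sample data is synthetic.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple

# Assumed event name; verify it against the event logs of your Spark version.
SQL_START_EVENT = "org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart"


def extract_plans(lines: Iterable[str]) -> Iterator[Tuple[Optional[int], str]]:
    """Yield (executionId, physical plan text) for each SQL execution start event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        if event.get("Event") == SQL_START_EVENT:
            # Assumed field names: "executionId" and "physicalPlanDescription".
            yield event.get("executionId"), event.get("physicalPlanDescription", "")


# Synthetic two-event log, made up for illustration only.
sample_log = [
    json.dumps({"Event": "SparkListenerApplicationStart", "App Name": "demo"}),
    json.dumps({
        "Event": SQL_START_EVENT,
        "executionId": 0,
        "description": "count at demo.py:10",
        "physicalPlanDescription": "== Physical Plan ==\n*(1) Filter (id#0L > 1)\n+- *(1) Range (0, 10, step=1, splits=8)",
    }),
]

plans = list(extract_plans(sample_log))
for execution_id, plan in plans:
    print(execution_id)
    print(plan)
```

With a real log you would pass an open file handle instead of `sample_log`. What comes back is the optimized plan text, so it is a starting point for writing an equivalent script by hand, not the original source.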
Upvotes: 1