Reputation: 153
After running a PySpark job for a while, I encounter a "Task Lease Expired" error; when I re-submit the job, it fails with "Task not acquired" and the log field is empty.
What could be the reason, and how should I diagnose this issue?
1 Master node: n1-standard-4 (4 vCPUs, 15 GB memory)
4 Worker nodes: n1-standard-1 (1 vCPU, 3.75 GB memory)
Edit:
The cluster appears healthy in the GCP console, but it won't "acquire" any jobs anymore. I have to recreate the cluster to run the same job, which has worked OK so far.
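For reference, this is roughly how I check the cluster and job state from the gcloud CLI (the cluster name, region, and job file are placeholders for my own values):

    # Check the cluster's reported status (RUNNING, ERROR, ...)
    gcloud dataproc clusters describe my-cluster --region=us-central1

    # List recent jobs on this cluster and their states
    gcloud dataproc jobs list --cluster=my-cluster --region=us-central1

    # Re-submit the PySpark job that fails with "Task not acquired"
    gcloud dataproc jobs submit pyspark job.py \
        --cluster=my-cluster --region=us-central1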
Upvotes: 3
Views: 4665
Reputation: 2099
This question is old, but my answer would be:
Check your cluster health in the YARN UI rather than in the GCP console. If something is wrong, it should show up there, for example workers not being available. A quick command-line version of that check is sketched below.
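A minimal sketch of that check, assuming you can SSH into the master node; `my-cluster-m` is the default master hostname for a cluster named `my-cluster`, and the zone is a placeholder:

    # SSH into the Dataproc master node
    gcloud compute ssh my-cluster-m --zone=us-central1-a

    # On the master: list NodeManagers and their state;
    # all 4 workers should show as RUNNING
    yarn node -list -all

    # Check whether submitted applications are reaching YARN at all
    yarn application -list -appStates ALL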
If the YARN UI looks fine and you submitted the job with gcloud, it may be that some internal GCP process was lost, so you could try restarting it first. If that doesn't help, recreating the cluster is the option, as you mention. A sketch of both steps follows.
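A sketch of those two recovery steps. The Dataproc agent service name (`google-dataproc-agent`) is my assumption based on recent Dataproc images, so verify it on your cluster before relying on it; machine types match the cluster described in the question:

    # On the master node: restart the Dataproc agent that picks up
    # ("acquires") submitted jobs -- service name is an assumption
    sudo systemctl restart google-dataproc-agent

    # If that doesn't help, recreate the cluster
    gcloud dataproc clusters delete my-cluster --region=us-central1
    gcloud dataproc clusters create my-cluster \
        --region=us-central1 \
        --master-machine-type=n1-standard-4 \
        --worker-machine-type=n1-standard-1 \
        --num-workers=4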
Upvotes: 1