Charles

Reputation: 590

MapReduce jobs running forever

I'm using Michael Manoochehri's example (http://stackoverflow.com/a/10969900/1387380) to pipe data out of Datastore to Google Cloud Storage using the Pipeline and MapReduce APIs, but my jobs run forever and never complete. Some jobs have been running for the past 7 days, and I can't even stop them from the MapperPipeline console interface.

How can I stop them manually or programmatically?

Upvotes: 1

Views: 1070

Answers (1)

Michael Manoochehri

Reputation: 7897

I think this behavior is due to a bug in how the current version of the App Engine MapReduce library handles Cloud Storage output writer errors. If this happens, as I mention in the linked answer, check the GAE logs for permission or API errors involving Cloud Storage (or whichever output writer you are currently using).

There should be improvements in the next iteration of the library, but for now, if you hit issues like this, the quick workaround is to purge your task queue, correct the problem causing the errors, and kick off the pipeline again.
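For the "programmatically" part of the question, purging can be done from code as well as from the admin console. Here is a minimal sketch, assuming the Python runtime and that the stuck MapReduce tasks sit on the queue named "default" (the queue name and the `purge_queues` helper are illustrative assumptions, not something from the original example). It relies on `taskqueue.Queue(name).purge()` from the App Engine SDK, which deletes every task in that queue, including tasks unrelated to the stuck job.

```python
def purge_queues(queue_names, queue_factory=None):
    """Purge each named push queue and return the names that were purged.

    queue_factory is a hypothetical hook so the helper can be exercised
    outside App Engine; by default it is taskqueue.Queue from the SDK.
    """
    if queue_factory is None:
        # Imported lazily so this module also loads outside App Engine.
        from google.appengine.api import taskqueue
        queue_factory = taskqueue.Queue

    purged = []
    for name in queue_names:
        # Deletes ALL tasks in the queue, not only the MapReduce ones.
        queue_factory(name).purge()
        purged.append(name)
    return purged
```

Calling `purge_queues(["default"])` from a request handler or the remote shell would be the typical use. Since the purge is indiscriminate, it is safer to run it only after checking what else uses the same queue.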

Upvotes: 1
