Reputation: 576
The official documentation and all sorts of books and articles repeat the recommendation that Spark in local mode should not be used for production purposes. Why not? Why is it a bad idea to run a Spark application on one machine for production purposes? Is it simply because Spark is designed for distributed computing and if you only have one machine there are much easier ways to proceed?
Upvotes: 3
Views: 1281
Reputation: 11
I agree that the official documentation largely ignores this, but there are actually some benefits to running Spark even in local mode (e.g. instead of pure Python, Scala, etc.). There's a great resource and benchmark with more details here.
In summary, the main advantages:
Upvotes: 1
Reputation: 1492
Local mode in Apache Spark is intended for development and testing purposes, and should not be used in production because:
Therefore, it is recommended to use a cluster manager for production environments to ensure scalability, resource management, fault tolerance, and security.
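To make the distinction concrete, here is a minimal sketch (plain Python, no Spark dependency) of how a deployment script might tell a local-mode master URL apart from a cluster manager URL. The patterns follow the master URL forms listed in the Spark docs (`local`, `local[K]`, `local[K,F]`, `local[*]`, `local[*,F]`). The function names `is_local_master` and `check_master`, and the `"production"` environment label, are my own assumptions for illustration and are not part of any Spark API.

```python
import re

# Local-mode master URL forms: local, local[K], local[K,F], local[*], local[*,F]
_LOCAL_MASTER = re.compile(r"^local(\[(\*|\d+)(,\s*\d+)?\])?$")


def is_local_master(master: str) -> bool:
    """Return True if the master URL selects Spark local mode."""
    return bool(_LOCAL_MASTER.match(master.strip()))


def check_master(master: str, environment: str) -> str:
    """Hypothetical guard: reject local mode when environment is 'production'."""
    if environment == "production" and is_local_master(master):
        raise ValueError(f"local master {master!r} is not allowed in production")
    return master


if __name__ == "__main__":
    for url in ["local", "local[*]", "local[4,2]", "yarn", "spark://host:7077"]:
        print(url, is_local_master(url))
```

In practice you would usually avoid hard-coding the master in the application at all and pass it via `spark-submit --master ...`, so the same build can run locally for tests and on a cluster manager in production.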
Upvotes: 2
Reputation: 1
I have the same question. I am certainly not an authority on the subject, but because no one has answered this question, I'll try to list the reasons I've encountered while using Spark local mode in Java. So far:
Upvotes: 0