Reputation: 89
I'm participating in a Kaggle competition with 4 other people. We all met in a MOOC by edx.org.
Although we can code against the Apache Spark engine, we don't know how to set up a cluster and install the software needed to run Spark on it.
Ideally, we're looking for a free platform that lets us focus on the programming.
Do you know of any platform that is easy to use and free? If there isn't one, could you explain how to set up the infrastructure we need to participate in the challenge?
Thank you very much in advance.
Upvotes: 0
Views: 182
Reputation: 2455
It's not that hard to start a standalone cluster on Linux or OS X using the scripts bundled with Spark. That could be sufficient if you can work with a single node, or if each of you contributes a development computer to a cluster on the same LAN.
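To make that concrete, here is a minimal sketch of how your code might point at either a single machine or a standalone cluster. It assumes PySpark is installed and, for the cluster case, that a master has already been started with the bundled `sbin/start-master.sh` script and is listening on the default port 7077. The helper names, the host address, and the app name are hypothetical placeholders, not anything Spark prescribes.

```python
# Sketch: pick a Spark master URL for local or standalone-cluster use.
# Host names and the app name below are hypothetical placeholders.

DEFAULT_MASTER_PORT = 7077  # default port of a standalone Spark master


def master_url(host=None, port=DEFAULT_MASTER_PORT):
    """Return a master URL: local mode if no host is given, else standalone."""
    if host is None:
        return "local[*]"  # run on this machine, using all available cores
    return f"spark://{host}:{port}"


def build_session(master, app_name="kaggle-team"):
    """Create a SparkSession for the given master (requires pyspark)."""
    # Imported lazily so master_url() can be used without pyspark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.master(master).appName(app_name).getOrCreate()
```

For example, `build_session(master_url())` would run everything on one laptop, while `build_session(master_url("192.168.1.10"))` would connect to a master at that (made-up) LAN address. Starting in local mode and switching the URL later means the rest of your code should not need to change.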
When you need to scale, AWS EMR is pretty simple.
For a little more money, Databricks offers Spark as a managed service, which means you really don't have to think much about running the cluster.
Upvotes: 2