Reputation: 1284
I am not as much of a computer person as many others on here: I majored in math, and MATLAB is my main computing experience. I recently got involved with Apache Spark through the excellent edX course offered by Berkeley.
The course provided a great step-by-step guide for setting up Spark. It involved downloading Oracle VM VirtualBox with a 32-bit Ubuntu VM, then using Vagrant (again, I'm not hugely computer-y, so I'm not 100% sure what it is or how it worked) to connect this to an IPython notebook. This gave me access to Spark over the internet and let me code in Python with PySpark, which is exactly what I want to do.
Everything was going very well until the second lab exercise, when it became apparent that my Windows laptop (four years old, with just 3 GB of free memory) does not have enough memory: it continually froze and crashed when I tried to work with large datasets.
Apparently it is not possible to run a VM inside a VM, so I have spent most of today looking for alternative ways of setting up Spark, to no avail; the guides are all aimed at someone with more computer knowledge than I have.
My (likely naive) idea now is to rent an external machine that I can interface with through my Windows laptop exactly as before, but with the virtual machine running outside my laptop's memory, i.e. in the cloud (using any of Ubuntu, Windows, etc.). Essentially, I want to move the Oracle VM VirtualBox setup to an outside machine to relieve my computer of the memory burden, and to keep using the IPython notebook as before.
How can I set up a virtual machine to use for the computational side of Spark in an IPython notebook?
Or is there an alternate method that would be simple to follow?
Upvotes: 0
Views: 1004
Reputation: 27455
Don't run VMs. Instead, run Spark directly on Windows using the launcher script that ships in the Spark download:
bin/pyspark.cmd
It's not an IPython Notebook, but you can run Python code against a local Spark instance.
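As a rough illustration of running Python code against a local Spark instance, here is a minimal sketch. It assumes you are inside the PySpark shell, where a SparkContext named sc is already defined; the helper name sum_of_squares is made up for this example.

```python
# Sketch only: meant to be pasted into the PySpark shell, where the
# SparkContext `sc` already exists. `sum_of_squares` is a hypothetical helper.

def sum_of_squares(sc, n):
    """Sum x*x for x in 0..n-1 using a Spark RDD."""
    numbers = sc.parallelize(range(n))      # distribute the numbers as an RDD
    squares = numbers.map(lambda x: x * x)  # transformation (evaluated lazily)
    return squares.sum()                    # action: triggers the computation

# In the PySpark shell you would then call:
# sum_of_squares(sc, 1000)
```

The same function works unchanged on a larger machine, since only the SparkContext it is given differs.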
If you want a beefier instance, do the same on a beefy remote machine. For example, an EC2 m4.2xlarge costs about $0.50 per hour and has 8 cores and 32 GB of RAM.
Upvotes: 1