Reputation: 3067
I have an Apache Spark full stack + Apache Zeppelin running on a machine with very little memory (512 MB), and it is crashing.
Spark Command: /usr/lib/jvm/java/bin/java -cp /home/ec2-user/spark-1.4.1-bin-hadoop2.6/sbin/../conf/:/home/ec2-user/spark-1.4.1-bin-hadoop2.6/lib/spark-assembly-1.4.1-hadoop2.6.0.jar:/home/ec2-user/spark-1.4.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/home/ec2-user/spark-1.4.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/home/ec2-user/spark-1.4.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar -Xms512m -Xmx512m -XX:MaxPermSize=256m org.apache.spark.deploy.master.Master --ip ip-172-31-24-107 --port 7077 --webui-port 8080
========================================
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000daaa0000, 357957632, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 357957632 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/jvm-17290/hs_error.log
I know this is a bad idea, but I don't have anywhere else to test it, and I would like to be able to learn some Scala + Apache Spark...
Is there a way I can reduce the memory footprint of Spark so I can do my tests?
thanks
Upvotes: 1
Views: 1128
Reputation: 1659
Apache Zeppelin is a great tool, but I have seen the same thing: it takes up a lot of RAM. You could use the command line instead; running bin/spark-shell from the Spark home folder will give you a Spark Scala shell, though it's not as pretty or intuitive to use.
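If memory is the main constraint, spark-shell can also be launched in local mode with a smaller driver heap. A rough example (these are standard spark-submit flags; the 256m figure is just a guess for a 512 MB box, adjust as needed):
bin/spark-shell --master local[1] --driver-memory 256m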
You can use Eclipse (Scala IDE) or IntelliJ (which has a Scala plugin) for Spark Scala development; you just need to add the dependencies with Maven or sbt.
You can do your prototyping in the Scala shell and copy and paste into the IDE.
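For the sbt route, the build definition only needs a couple of lines. A minimal sketch, assuming the prebuilt Spark 1.4.1 binaries and their default Scala 2.10:
// build.sbt (sketch)
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"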
Also check out https://github.com/andypetrella/spark-notebook; it has a smaller RAM footprint. Spark by itself takes less, but Zeppelin takes a lot of memory, from what I have seen.
There is also a Scala notebook: https://github.com/alexarchambault/jupyter-scala. You can add the Spark jars to the environment, create a SparkContext object, and use it.
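Creating a low-footprint SparkContext yourself is only a few lines. A minimal sketch (local mode with one core; the app name and the disabled UI are just example choices):
import org.apache.spark.{SparkConf, SparkContext}

// run Spark inside the notebook JVM, single local core
val conf = new SparkConf()
  .setAppName("scratchpad")          // any name works
  .setMaster("local[1]")             // single-threaded local mode
  .set("spark.ui.enabled", "false")  // skip the web UI to save a little memory
val sc = new SparkContext(conf)

// quick smoke test
sc.parallelize(1 to 10).sum()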
Upvotes: 5