Reputation: 1
I have a Spark project running on a 4-core, 16 GB instance (used as both master and worker). Can anyone tell me what I should keep monitoring so that my cluster and jobs do not go down?
I have put together a small list, which includes the following items; please extend it if you know of more:
Upvotes: 0
Views: 646
Reputation: 1808
That's a good list. In addition to those, I would monitor the status of the receivers of the streaming application (assuming you are using some non-HDFS source of data), that is, whether they are connected or not. To be honest, this was tricky to do with older versions of Spark Streaming, as the instrumentation to get the receiver status did not quite exist. However, with Spark 1.0 (to be released very soon), you can use the org.apache.spark.streaming.scheduler.StreamingListener interface to receive events about the status of the receivers.
A sneak peek at the to-be-released Spark 1.0 docs is at http://people.apache.org/~tdas/spark-1.0.0-rc10-docs/streaming-programming-guide.html
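As a rough illustration of the listener approach, here is a minimal Scala sketch. It assumes the Spark 1.0 streaming API, where the trait and its receiver events live in the `org.apache.spark.streaming.scheduler` package. `ReceiverStatusTracker` and `ReceiverStatusListener` are hypothetical names made up for this example, not part of Spark, and the state strings are arbitrary.

```scala
import java.util.concurrent.ConcurrentHashMap

import org.apache.spark.streaming.scheduler.{
  StreamingListener,
  StreamingListenerReceiverError,
  StreamingListenerReceiverStarted,
  StreamingListenerReceiverStopped
}

// Hypothetical helper (not part of Spark): remembers the last reported
// state of each receiver, keyed by the receiver's stream id.
class ReceiverStatusTracker {
  private val states = new ConcurrentHashMap[Int, String]()

  def update(streamId: Int, state: String): Unit = states.put(streamId, state)

  // None means no event has been seen for this receiver yet.
  def stateOf(streamId: Int): Option[String] = Option(states.get(streamId))

  def isHealthy(streamId: Int): Boolean = stateOf(streamId) == Some("started")
}

// Hypothetical listener: forwards receiver lifecycle events to the tracker.
class ReceiverStatusListener(tracker: ReceiverStatusTracker) extends StreamingListener {
  override def onReceiverStarted(receiverStarted: StreamingListenerReceiverStarted): Unit =
    tracker.update(receiverStarted.receiverInfo.streamId, "started")

  override def onReceiverError(receiverError: StreamingListenerReceiverError): Unit =
    tracker.update(receiverError.receiverInfo.streamId, "error")

  override def onReceiverStopped(receiverStopped: StreamingListenerReceiverStopped): Unit =
    tracker.update(receiverStopped.receiverInfo.streamId, "stopped")
}
```

You would register it on your `StreamingContext` before starting it, e.g. `ssc.addStreamingListener(new ReceiverStatusListener(tracker))`, and then have whatever monitoring you already run poll the tracker and alert when a receiver is no longer in the "started" state.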
Upvotes: 1