Amit

Reputation: 9

Spark Performance Monitoring

I have a requirement to show management/clients that the configuration properties used to run our Spark job (executor memory, number of cores, default parallelism, number of shuffle partitions, and so on) are not excessive or larger than required. I need a monitoring tool with visualization that can justify the job's memory usage, and that can surface information such as memory not being used effectively, or a particular job needing more memory.

Please suggest some application or tool.
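Independently of any dedicated tool, a quick sanity check is possible against Spark's own monitoring REST API, which reports per-executor memory figures at `/api/v1/applications/{app-id}/executors` on the driver UI. The sketch below is a minimal illustration, not a full solution: the `memoryUsed` and `maxMemory` field names follow Spark's `ExecutorSummary` response, and the sample records are made up for demonstration rather than captured from a real job.

```python
# Sketch: summarize executor memory utilization from records shaped like
# Spark's monitoring REST API response
# (GET http://<driver>:4040/api/v1/applications/<app-id>/executors).
# Field names (id, memoryUsed, maxMemory) follow Spark's ExecutorSummary;
# the sample records below are illustrative, not real output.

def memory_utilization(executors):
    """Return {executor_id: used/allocated ratio} for executors
    that report a nonzero maxMemory (values are in bytes)."""
    return {
        e["id"]: e["memoryUsed"] / e["maxMemory"]
        for e in executors
        if e.get("maxMemory")  # skip entries without a memory figure
    }

# Illustrative records (bytes); in practice, fetch and parse the JSON
# from the REST endpoint above instead.
sample = [
    {"id": "driver", "memoryUsed": 100_000_000, "maxMemory": 1_000_000_000},
    {"id": "1", "memoryUsed": 250_000_000, "maxMemory": 1_000_000_000},
]

for exec_id, ratio in memory_utilization(sample).items():
    print(f"executor {exec_id}: {ratio:.0%} of allocated storage memory used")
```

Consistently low ratios across executors would support the argument that the allocated executor memory is more than the job needs.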

Upvotes: 0

Views: 164

Answers (1)

Tagar

Reputation: 14939

LinkedIn has created a tool that sounds very similar to what you're looking for.

See this presentation for an overview of the product: https://youtu.be/7KjnjwgZN7A?t=480

The LinkedIn team has open-sourced Dr. Elephant here: https://github.com/linkedin/dr-elephant

Give it a try. Note that the initial integration may require manually tweaking the Spark History Server setup so that it exposes the information Dr. Elephant requires.
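Since Dr. Elephant reads the event logs served by the Spark History Server, the job has to be submitted with event logging enabled. A minimal `spark-defaults.conf` fragment might look like the following; the log directory is an assumption and should point wherever your History Server is configured to read.

```
# spark-defaults.conf — enable event logging so the Spark History Server
# (and Dr. Elephant behind it) can see finished applications.
spark.eventLog.enabled          true
# Assumed path; use the directory your Spark History Server reads from.
spark.eventLog.dir              hdfs:///spark-history
spark.history.fs.logDirectory   hdfs:///spark-history
```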

Upvotes: 1
