atbug

Reputation: 838

Monitor memory usage of each node in a Slurm job

My Slurm job uses several nodes, and I want to know the maximum memory usage on each node while the job is running. What can I do?

Right now, I can SSH into each node and run free -h -s 30 > memory_usage, but I think there must be a better way to do this.

Upvotes: 0

Views: 925

Answers (1)

damienfrancois

Reputation: 59072

Slurm accounting directly reports the maximum memory usage over time across all tasks. If that information is not sufficient, you can set up profiling following this documentation, and Slurm will then record the full memory usage of each process as a time series for the duration of the job. You can then aggregate per node, find the maximum, etc.
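As a rough illustration of the accounting route, here is a minimal Python sketch that reads the MaxRSS and MaxRSSNode fields from sstat (for a running job) and keeps the largest value seen per node. The helper names (to_bytes, peak_rss_by_node, query_running_job) are made up for this example, and it assumes sizes are printed with K/M/G/T suffixes in powers of 1024. Note that MaxRSSNode only names the node where the peak task of each step ran, so nodes that never held a step's peak will not appear; for a true per-node picture you still need profiling.

```python
import subprocess

# Assumption: Slurm prints sizes with binary suffixes (K = 1024 bytes, etc.).
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(value):
    """Convert a size string such as '2048K' or '1.5G' to bytes."""
    value = value.strip()
    if not value:
        return 0
    suffix = value[-1].upper()
    if suffix in _UNITS:
        return int(float(value[:-1]) * _UNITS[suffix])
    return int(float(value))  # no suffix: treat as a plain byte count


def peak_rss_by_node(output):
    """Map node name -> largest MaxRSS (bytes) found in 'JobID|MaxRSS|MaxRSSNode' lines."""
    peaks = {}
    for line in output.splitlines():
        fields = line.split("|")
        if len(fields) != 3:
            continue  # skip lines that do not have exactly three fields
        _step, max_rss, node = fields
        if not max_rss or not node:
            continue  # some rows carry no memory data
        peaks[node] = max(peaks.get(node, 0), to_bytes(max_rss))
    return peaks


def query_running_job(job_id):
    """Hypothetical wrapper: ask sstat about all steps of a running job."""
    cmd = [
        "sstat", "--allsteps", "-j", str(job_id),
        "--format=JobID,MaxRSS,MaxRSSNode",
        "--parsable2", "--noheader",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return peak_rss_by_node(result.stdout)
```

For a finished job, the same parsing should work on the output of sacct with the same --format, --parsable2 and --noheader options.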

Upvotes: 1
