Reputation: 11
ISSUE: mpirun hangs and does not display any error message, even with I_MPI_DEBUG set to 100.
Example: this happens with any of the IMB-* benchmarks, and even with a simple task such as displaying the hostname:
mpirun -n 2 hostname
It just hangs and never returns any output or error.
Any idea what I should check, or where to look for more information?
OS info: Rocky Linux release 8.5 (Green Obsidian)
MPI version: Intel(R) MPI Library for Linux* OS, Version 2019 Update 12 Copyright 2003-2021, Intel Corporation.
strace hangs at:
[pid 19786] sched_setaffinity(0, 8, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) = 0
[pid 19786] nanosleep({tv_sec=0, tv_nsec=0}, 0x7ffc04672b50) = 0
[pid 19786] openat(AT_FDCWD, "/sys/devices/system/node/node0/cpulist", O_RDONLY) = 6
[pid 19786] fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
[pid 19786] fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
[pid 19786] lseek(6, 0, SEEK_SET) = 0
[pid 19786] lseek(6, 0, SEEK_SET) = 0
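Since the trace stops right after the process opens /sys/devices/system/node/node0/cpulist, one way to narrow this down is to run each step under a time limit and see which one stalls. The following is only a rough sketch: run_with_limit is a hypothetical helper name (not part of Intel MPI), and it assumes GNU coreutils timeout, which exits with status 124 when the limit is reached.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command under a time limit and report
# whether it finished on its own or had to be killed by timeout.
run_with_limit() {
    local limit="$1"
    shift
    timeout "$limit" "$@"
    local rc=$?
    if [ "$rc" -eq 124 ]; then
        # GNU timeout returns 124 when the time limit was hit
        echo "HUNG after ${limit}s: $*"
    else
        echo "exit ${rc}: $*"
    fi
    return "$rc"
}

# Example invocations (commented out so the file can be sourced safely):
#   run_with_limit 5 cat /sys/devices/system/node/node0/cpulist
#   run_with_limit 30 env I_MPI_DEBUG=100 mpirun -n 2 hostname
```

If the cat of the sysfs file returns immediately but mpirun still reports HUNG, the stall is more likely in how the launcher processes the topology data than in the kernel interface itself.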
Reputation: 1
I had the very same issue with Rocky 8.6 on a server with an AMD EPYC 74F3 CPU. Unfortunately, I don't know the root cause either, but a simple yum update solved the issue for me.
Best regards,
Sebastian