Reputation: 704
I want to create a SLURM cluster on GCP, so I followed the guide here: https://cloud.google.com/blog/products/gcp/easy-hpc-clusters-on-gcp-with-slurm
Everything seems to work until the step where I ssh into the controller or login nodes. Logging in itself works:
~/slurm-slurm-17.11/contribs/gcp$ gcloud deployment-manager deployments create slurm --config slurm-cluster.yaml
~/slurm-slurm-17.11/contribs/gcp$ gcloud compute ssh controller --zone=us-east1-b
Last login: Tue Apr 30 09:50:50 2019 from 194....
[Slurm ASCII-art logo banner]
*** Slurm is currently being installed/configured in the background. ***
A terminal broadcast will announce when installation and configuration is
complete.
But the installation never finishes, even if I wait for hours or a day. How can this be solved, or how can one find out what the problem is?
Here is the file slurm-cluster.yaml:
# Copyright 2017 SchedMD LLC.
# Modified for use with the Slurm Resource Manager.
#
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# [START cluster_yaml]
imports:
- path: slurm.jinja

resources:
- name: slurm-cluster
  type: slurm.jinja
  properties:
    cluster_name            : google1
    static_node_count       : 2
    max_node_count          : 10
    zone                    : us-east1-b
    region                  : us-east1
    cidr                    : 10.10.0.0/16
    controller_machine_type : n1-standard-2
    compute_machine_type    : n1-standard-2
    login_machine_type      : n1-standard-1
    slurm_version           : 17.11.5
    default_account         : default
    default_users           : ********
    munge_key               : 80bc8a12336e6094ced0cb3b2cb1e9c315d6276350207fecd7c293d4623a87bdba11e6eb38a1856fb78ff8dbd027860600f7df0f0d2c5fd960b4f16a0d3fc567f1
# [END cluster_yaml]
Upvotes: 0
Views: 476
Reputation: 704
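The solution is to use a SLURM version that is currently available on the SchedMD download page:
https://www.schedmd.com/downloads.php
Then set the slurm_version parameter in slurm-cluster.yaml to that version. The file in the question requests 17.11.5, which was apparently no longer offered for download, so the background installation could never complete.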
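As a rough sketch of that fix, the Python helpers below check whether a tarball for a given version can be fetched and rewrite the slurm_version line in the YAML text. The download URL pattern in TARBALL_URL is an assumption on my part (it is not taken from the guide), and the helper names tarball_url, is_downloadable and set_slurm_version are made up for illustration, so verify the real file names on the download page first.

```python
import re
import urllib.error
import urllib.request

# ASSUMPTION: release tarballs follow this naming scheme on SchedMD's
# download server. Check the download page before relying on it.
TARBALL_URL = "https://download.schedmd.com/slurm/slurm-{version}.tar.bz2"


def tarball_url(version):
    """Return the assumed download URL for a given Slurm version."""
    return TARBALL_URL.format(version=version)


def is_downloadable(version, opener=urllib.request.urlopen):
    """Return True if a HEAD request for the tarball answers with HTTP 200.

    `opener` is injectable so the check can be exercised without network access.
    """
    request = urllib.request.Request(tarball_url(version), method="HEAD")
    try:
        with opener(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Covers HTTP errors such as 404 as well as connection failures.
        return False


def set_slurm_version(yaml_text, version):
    """Replace the value of the slurm_version key, keeping indentation and spacing."""
    pattern = re.compile(r"^([ \t]*slurm_version[ \t]*:[ \t]*)\S+", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + version, yaml_text)
    if count != 1:
        raise ValueError("expected exactly one slurm_version line, found %d" % count)
    return new_text
```

To use it, read slurm-cluster.yaml, pass its text and a version taken from the download page to set_slurm_version, write the result back, and then recreate the deployment. The regular expression only touches the one line, so the rest of the file, including comments, is left unchanged.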
Upvotes: 1