Reputation: 578
I am trying to add GPU resources to a Nomad job, and I am getting the error shown below.
My job description is:
job "test" {
  datacenters = ["dc1"]

  group "echo" {
    count = 1

    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo:latest"
      }

      resources {
        device "gpu" {
          count = 1
        }
      }
    }
  }
}
Nomad does not seem to recognize the device block; everything works when I remove it.
With the device block in place, I am getting:
Constraint missing devices filtered 1 node
I am using Nomad 1.3.1.
Upvotes: 1
Views: 2898
Reputation: 519
This means the driver you are using to run the job is not present or not running on the machine. If you are deploying a containerized application, Docker should be started first, and then the job.
Upvotes: 0
Reputation: 51
During fingerprinting, a device plugin reports the number of detected devices, general information about each device (vendor, type, and model), and device-specific attributes (e.g., available memory, hardware features). The information returned by the plugin passes from the client to the server and is made available for use in scheduling jobs, using the device stanza in the task’s resource stanza, for example:
resources {
  device "vendor/type/model" {
    count = 2
    constraint { ... }
    affinity { ... }
  }
}
device Parameters
name (string: "") - Specifies the device required. The following inputs are valid:
<device_type>: If a single value is given, it is assumed to be the device type, such as "gpu" or "fpga".
<vendor>/<device_type>: If two values are given separated by a /, the given device type will be selected, constraining on the provided vendor. Examples include "nvidia/gpu" or "amd/gpu".
<vendor>/<device_type>/<model>: If three values are given separated by a /, the given device type will be selected, constraining on the provided vendor and model name. Examples include "nvidia/gpu/1080ti" or "nvidia/gpu/2080ti".
count (int: 1) - Specifies the number of instances of the given device that are required.
constraint (Constraint: nil) - Constraints to restrict which devices are eligible. This can be provided multiple times to define additional constraints. See below for available attributes.
affinity (Affinity: nil) - Affinity to specify a preference for which devices get selected. This can be provided multiple times to define additional affinities. See below for available attributes.
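As a sketch of how the constraint and affinity parameters above fit together, the block below asks for two NVIDIA GPUs with at least 2 GiB of memory each and prefers ones with 4 GiB or more. The memory thresholds and the weight are illustrative assumptions, and the device.attr.memory attribute is only available if the device plugin on the client reports it.

```hcl
resources {
  device "nvidia/gpu" {
    count = 2

    # Hard requirement: only devices with at least 2 GiB of memory are eligible
    # (threshold chosen for illustration).
    constraint {
      attribute = "${device.attr.memory}"
      operator  = ">="
      value     = "2 GiB"
    }

    # Soft preference: favor devices with 4 GiB or more when available
    # (threshold and weight chosen for illustration).
    affinity {
      attribute = "${device.attr.memory}"
      operator  = ">="
      value     = "4 GiB"
      weight    = 75
    }
  }
}
```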
The following example shows only the device stanza; replace the one in your own job with it:
resources {
  device "nvidia/gpu" {
    count = 1
  }
}
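A device block can only be satisfied if a device plugin on the client has fingerprinted a matching device; otherwise the scheduler filters the node out with a "missing devices" constraint. The fragment below is a minimal sketch of a client agent configuration that enables the NVIDIA device plugin. It assumes the external nomad-device-nvidia plugin binary has been installed, and the plugin_dir path is a hypothetical location that should be adjusted to your setup.

```hcl
# Client agent configuration (sketch).
# Assumption: the nomad-device-nvidia binary has been placed in this directory.
plugin_dir = "/opt/nomad/plugins"

plugin "nomad-device-nvidia" {
  config {
    enabled            = true
    fingerprint_period = "1m"
  }
}
```

After restarting the client, the output of nomad node status -verbose for that node should list the detected devices if fingerprinting succeeded.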
check source here
Upvotes: 0