Rabah DevOps

Reputation: 87

Nomad and Waypoint cannot launch more than 2 jobs

I'm currently trying to deploy several databases on a Nomad cluster, one per environment (test, dev, qa, ppd). I'm using Waypoint with var files to automate the deploys. I have a strange issue: I cannot launch more than 2 db jobs. When I launch a new db job, the older jobs disappear and are replaced by the newly launched one.

Waypoint file

# waypoint up -var-file=/opt/waypoint/xx/xx-api/dev/dev.wpvars
project = "xx-db"

# An application to deploy.
app "xx-db" {
    build {
        use "docker" {
            dockerfile = "${path.app}/${var.dockerfile_path}"
        }

        # Push built images to the remote Docker registry.
        registry {
            use "docker" {
                image = "${var.registry_path}/xx-db-${var.env}"
                tag   = "${var.version}"
            }
        }
    }



    # Deploy to Nomad using a templated jobspec
    deploy {
        use "nomad-jobspec" {
            jobspec = templatefile("${path.app}/finess-db.hcl", {
                datacenter = var.datacenter
                env        = var.env
            })
        }
    }
}




variable "env" {
    type = string
    default = ""
}

variable "dockerfile_path" {
    type = string
    default = "Dockerfile"
}

variable "registry_path" {
    type = string
    default = "registry.repo.proxy-xx-xx.xx.xx.xx.net"
               
}

variable "datacenter" {
    type = string
    default = "xx"
}

variable "version" {
  type    = string
  default = gitrefpretty()
  env     = ["gitrefpretty()"]
               
}
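For reference, a var file like the dev.wpvars referenced in the comment at the top of this file might look like the following (the values here are placeholders I'm assuming, not the real ones):

# /opt/waypoint/xx/xx-api/dev/dev.wpvars (assumed contents)
env        = "dev"
datacenter = "xx"
version    = "1.0.0"

The version var can presumably also be left unset so the gitrefpretty() default applies.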

Only 2 jobs stay running: after launching a new job, the older test and formation jobs disappear, replaced by the new one.

The Nomad jobspec template (finess-db.hcl):

job "xxx-psqldb-${env}" {
        datacenters = ["xxx"]
        type = "service"
          vault {
          policies = ["xxx"]
          change_mode = "noop"
          }
        update {
                stagger = "30s"
                max_parallel = 1
        }

        group "xxx-psqldb-${env}" {
                count = "1"
                restart {
                        attempts = 3
                        delay = "60s"
                        interval = "1h"
                        mode = "fail"
                }
                network {
                        mode = "host"
                        port "pgsqldb" { to = 5432 }
                }
                task "xxx-psqldb-${env}" {
                        driver = "docker"
                        config {
                                image = "${artifact.image}:${artifact.tag}"
                                ports = [
                                        "pgsqldb"
                                        ]
                                volumes = [
                                    "name=xxxpsqldb${env},io_priority=high,size=5,repl=1:/var/lib/postgresql/data"
                                ]
                                volume_driver = "pxd"

                        }
                        template {
                                data = <<EOH
POSTGRES_USER="{{ with secret "app/xxx/db/admin" }}{{ .Data.data.user }}{{end}}"
POSTGRES_PASSWORD="{{ with secret "app/xxx/db/admin" }}{{ .Data.data.password }}{{end}}"

EOH
                                destination = "secrets/db"
                                env = true
                        }
                        resources {
                                cpu = 256
                                memory = 256
                        }
                        service {
                                name = "xxx-psql-svc-${env}"
                                tags = ["urlprefix-xxx-psql-${env} proto=tcp"]
                                port = "pgsqldb"
                                 check {
                                         name         = "alive"
                                         type         = "tcp"
                                         interval     = "10s"
                                         timeout      = "5s"
                                         port         = "pgsqldb"
                                }

                        }

                }
        }
}

I have the same issue when I launch other jobs for the front or back app.

Should I configure something in the cluster?

Thx for help

Upvotes: 0

Views: 115

Answers (2)

Onigiri

Reputation: 26

I've encountered a similar issue.

TL;DR: use -prune=false

Explanation

As the waypoint docs mention:

... if -prune=false is not set, Waypoint may delete your job via "pruning" a previous version

Furthermore, that currently locks you into using the CLI:

CLI flags are the only way to customize this today

As described here.

The issue can also be found on HashiCorp Discuss.
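Applied to the command from the question, that would look something like this (a sketch; assuming -prune is accepted by waypoint up the same way it is documented for waypoint deploy):

waypoint up -prune=false -var-file=/opt/waypoint/xx/xx-api/dev/dev.wpvars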

Upvotes: 1

Rabah DevOps

Reputation: 87

Just deleting the files data.db and waypoint-restore.db.lock resolved the issue.
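The exact location of those files depends on how the Waypoint server/runner was set up, so if someone hits the same problem it's safest to locate them first before deleting (a sketch; paths vary by install):

# stop waypoint first, then find the state files (locations are setup-specific)
find / -name 'data.db' -path '*waypoint*' 2>/dev/null
find / -name 'waypoint-restore.db.lock' 2>/dev/null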

Thx

Upvotes: 0
