Mithril

Reputation: 13738

How to remove orphaned tasks in Apache Mesos?

The problem may be caused by Mesos and Marathon being out of sync, but the solution mentioned on GitHub doesn't work for me.

When I find orphaned tasks, this is what I do:

  1. Restart Marathon.

  2. Marathon does not re-adopt the orphaned tasks; it starts new tasks instead.

  3. The orphaned tasks still hold resources, so I have to delete them.

  4. All the orphaned tasks run under framework ef169d8a-24fc-41d1-8b0d-c67718937a48-0000. Querying the master with

    curl -XGET http://c196:5050/master/frameworks
    

    shows that this framework is listed under unregistered_frameworks:

    {
        "frameworks": [
            .....
        ],
        "completed_frameworks": [ ],
        "unregistered_frameworks": [
            "ef169d8a-24fc-41d1-8b0d-c67718937a48-0000",
            "ef169d8a-24fc-41d1-8b0d-c67718937a48-0000",
            "ef169d8a-24fc-41d1-8b0d-c67718937a48-0000"
        ]
    }
    
  5. I try to tear down the framework by its ID (so that the tasks under it are deleted too):

    curl -XPOST http://c196:5050/master/teardown -d 'frameworkId=ef169d8a-24fc-41d1-8b0d-c67718937a48-0000'
    

    but I get: No framework found with specified ID
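The check in step 4 can be scripted. This is a minimal Python sketch, assuming the response has the same shape as the JSON shown in that step. The function names `unregistered_framework_ids` and `fetch_frameworks` are hypothetical helpers, not part of Mesos, and the sketch only lists IDs (deduplicated); it does not delete anything.

```python
import json
import urllib.request


def unregistered_framework_ids(state: dict) -> list:
    """Return the unique IDs under "unregistered_frameworks", in first-seen order."""
    unique_ids = []
    for framework_id in state.get("unregistered_frameworks", []):
        if framework_id not in unique_ids:
            unique_ids.append(framework_id)
    return unique_ids


def fetch_frameworks(master_url: str) -> dict:
    """GET <master_url>/master/frameworks and decode the JSON body."""
    with urllib.request.urlopen(master_url + "/master/frameworks") as response:
        return json.load(response)


# Usage (assumes the master is reachable at the address used in the curl call):
#   state = fetch_frameworks("http://c196:5050")
#   print(unregistered_framework_ids(state))
```

With the output above, this would report the single ID that appears three times in the list.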

So, how can I delete the orphaned tasks?

Upvotes: 4

Views: 4816

Answers (1)

janisz

Reputation: 6371

There are two options:

  1. Register a framework with the same framework ID, run reconciliation, and kill all the tasks you receive. For example, you can do it as follows:

    • Download the code: git clone https://github.com/janisz/mesos-cookbook.git
    • Change directory: cd mesos-cookbook/4_understanding_frameworks
    • In scheduler.go, change the master to your URL.
    • If you want to mimic some other framework, create /tmp/framework.json and fill it with FrameworkInfo data:

      {
        "id": "<mesos-framework-id>",
        "user": "<framework-user>",
        "name": "<framework-name>",
        "failover_timeout": 3600,
        "checkpoint": true,
        "hostname": "<hostname>",
        "webui_url": "<framework-web-ui>"
      }
      
    • Run it: go run scheduler.go scheduler.pb.go mesos.pb.go

    • Get the list of all tasks: curl localhost:9090
    • Delete a task with: curl -X DELETE "http://10.10.10.10:9090/?id=task_id"
  2. Wait until failover_timeout expires, so that Mesos deletes these tasks for you.
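If there are many tasks, the delete step of option 1 can be looped. This is a minimal Python sketch, assuming the scheduler's HTTP endpoint accepts `DELETE /?id=<task_id>` exactly as in the curl call above. The function names `delete_url` and `kill_tasks` are hypothetical, and you supply the task IDs yourself, since the sketch makes no assumption about the format returned by `curl localhost:9090`.

```python
import urllib.parse
import urllib.request


def delete_url(base_url: str, task_id: str) -> str:
    """Build the DELETE URL for one task, URL-encoding the task ID."""
    query = urllib.parse.urlencode({"id": task_id})
    return base_url.rstrip("/") + "/?" + query


def kill_tasks(base_url: str, task_ids: list) -> None:
    """Send one DELETE request per task ID and print the HTTP status."""
    for task_id in task_ids:
        request = urllib.request.Request(delete_url(base_url, task_id), method="DELETE")
        with urllib.request.urlopen(request) as response:
            print(task_id, response.status)


# Usage (assumes the scheduler from option 1 is running on this address):
#   kill_tasks("http://10.10.10.10:9090", ["task_id"])
```

Keeping the URL construction in its own function makes it easy to print the URLs first and check them before sending any request.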

Upvotes: 1
