How to reduce the time before a task can be killed when its state is TASK_LOST?

Question

I have been running Marathon, Mesos, and Docker together without trouble, but I recently ran into a problem. When mesos-slave encounters an exception, the state of the task in Marathon changes to TASK_LOST, and the task cannot be killed until about 15 minutes have passed.

I tested this by manually rebooting the operating system of the machine that runs the mesos-slave service and Docker, and on which the task was running. The task state shown in the Marathon UI then became "Unscheduled(100%)", and the task could not be killed, either automatically or manually, until about 15 minutes had passed. My question is how to reduce this time. I tried adding the following Marathon startup command-line arguments:

task_launch_confirm_timeout=30000
scale_apps_interval = 30000
task_lost_expunge_initial_delay = 30000
task_launch_timeout = 30000

and the following mesos-slave startup command-line argument:

recovery_timeout=1mins

but this did not help.
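
For concreteness, the options above correspond to command-line flags of roughly the following shape. This is only a sketch: the variable names `MARATHON_EXTRA_ARGS` and `MESOS_SLAVE_EXTRA_ARGS` are hypothetical placeholders, not part of either tool, and the other required options (master address, ZooKeeper URL, work directory) are assumed to be set elsewhere. The Marathon values are in milliseconds.

```shell
# Hypothetical holder variables; their contents would be appended to the
# existing marathon and mesos-slave launch commands.

# Marathon flags, values in milliseconds (30000 ms = 30 s).
MARATHON_EXTRA_ARGS="--task_launch_confirm_timeout 30000 \
--scale_apps_interval 30000 \
--task_lost_expunge_initial_delay 30000 \
--task_launch_timeout 30000"

# Mesos agent flag, given as a Mesos duration string.
MESOS_SLAVE_EXTRA_ARGS="--recovery_timeout=1mins"

# Print the flags so they can be checked before restarting the services.
echo "marathon ${MARATHON_EXTRA_ARGS}"
echo "mesos-slave ${MESOS_SLAVE_EXTRA_ARGS}"
```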

