gcpman

Reputation: 129

drain of dataflow streaming job does not end

I executed a "drain" on a streaming job with this command (the drain subcommand also takes the job ID, omitted here):

gcloud alpha dataflow jobs --project=xxxxxx drain

but it has not finished after three days! This is the log of the streaming job:

21:14:36.000
http: TLS handshake error from 172.17.0.2:40277: EOF
21:14:36.000
http: TLS handshake error from 172.17.0.2:36255: EOF
21:14:36.000
Kubelet is healthy?: true
21:14:42.000
http: TLS handshake error from 172.17.0.2:55731: EOF
21:14:42.000
Kubelet is healthy?: true
21:14:47.000
http: TLS handshake error from 172.17.0.2:60835: EOF
21:14:47.000
Kubelet is healthy?: true
21:14:48.208
Memory is used/total/max = 71/207/1801 MB, GC last/max = 0.00/0.00 %, #pushbacks=0, gc thrashing=false
21:14:48.403
Memory is used/total/max = 454/852/1801 MB, GC last/max = 0.00/27.00 %, #pushbacks=0, gc thrashing=false
21:14:49.020
Memory is used/total/max = 38/117/1801 MB, GC last/max = 0.00/0.00 %, #pushbacks=0, gc thrashing=false
21:14:49.245
Memory is used/total/max = 457/1092/1801 MB, GC last/max = 0.00/21.00 %, #pushbacks=0, gc thrashing=false
21:15:06.000
Kubelet is healthy?: true
21:15:06.000
http: TLS handshake error from 172.17.0.2:36348: EOF
21:15:06.000
Kubelet is healthy?: true
21:15:06.000

I canceled the job, but I'm concerned about data loss.

I want to use "drain" instead of "cancel". How can I drain a streaming job?
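For reference, a sketch of the usual drain workflow with gcloud; JOB_ID, the project, and the region are placeholders, not values from this question:

```shell
# Request a drain: the job stops pulling new input but finishes in-flight work.
gcloud dataflow jobs drain JOB_ID --project=my-project --region=us-central1

# Poll the job state; a healthy drain moves from JOB_STATE_DRAINING
# to JOB_STATE_DRAINED, after which the job terminates.
gcloud dataflow jobs describe JOB_ID \
    --project=my-project --region=us-central1 \
    --format="value(currentState)"
```

If the state stays in JOB_STATE_DRAINING indefinitely, as here, the stall is on the service side rather than in the command.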

Upvotes: 2

Views: 638

Answers (1)

Sam McVeety

Reputation: 3214

Summarizing the comment thread: this was a defect in the managed Dataflow service that has since been fixed by a service release. The underlying issue was a priority inversion that caused a deadlock in message handling.

Upvotes: 1
