galix85

Reputation: 167

HandleHttpRequest Failed with SERVICE_UNAVAILABLE when "rate of the dataflow is exceeding the provenance recording rate"

I have a batch test in JMeter that sends several HTTP GET requests to a NiFi HandleHttpRequest processor, which then forwards the data to a Kafka topic.

The problem is that StandardHttpContextMap returns a SERVICE_UNAVAILABLE error. It seems to happen when the rate of the dataflow exceeds the provenance recording rate, but I'm not sure.

Does anyone have an idea? Here is a partial log:

2016-05-05 15:12:14,064 WARN [Timer-Driven Process Thread-7] 
o.a.n.p.PersistentProvenanceRepository The rate of the dataflow is exceeding the provenance recording rate. Slowing down flow to accommodate. Currently, there are 96 journal files (533328812 bytes) and threshold for blocking is 80 (1181116006 bytes)

2016-05-05 15:12:20,310 INFO [Provenance Repository Rollover Thread-2] 
o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal files (46096 records) into single Provenance Log File ./provenance_repository/8913710.prov in 43254 milliseconds

2016-05-05 15:12:20,314 INFO [Provenance Repository Rollover Thread-2] o.a.n.p.PersistentProvenanceRepository Successfully Rolled over Provenance Event file containing 65422 records

2016-05-05 15:12:20,398 INFO [Timer-Driven Process Thread-7]   
o.a.n.p.PersistentProvenanceRepository Provenance Repository has now caught up with rolling over journal files. Current number of journal files to be rolled over is 80

2016-05-05 15:12:20,399 INFO [Timer-Driven Process Thread-7] 
o.a.n.p.PersistentProvenanceRepository Created new Provenance Event Writers for events starting with ID 9190418

2016-05-05 15:12:21,422 INFO [qtp1693512967-121] 
o.a.n.p.standard.HandleHttpRequest HandleHttpRequest[id=3858f0ad-b165-427b-a460-67fbf7cff0d8] Sending back a SERVICE_UNAVAILABLE response to 172.26.60.27; request was GET 172.26.60.27

Upvotes: 1

Views: 1554

Answers (1)

JDP10101

Reputation: 1852

You are correct in your analysis that the HTTP response you see is coming from the HttpContextMap [1], specifically from its 'Request Expiration' property. When a request has been sitting in the map for longer than the configured amount of time, the service automatically replies with SERVICE_UNAVAILABLE.
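To make the mechanism concrete, here is a minimal sketch of that expiration behavior in Python. This is not NiFi's actual implementation; the names (register, sweep_expired, REQUEST_EXPIRATION_SECS) and the 60-second value are hypothetical, and the real expiration is whatever your StandardHttpContextMap's 'Request Expiration' property is set to.

    import time
    from http import HTTPStatus

    # Hypothetical value; check the 'Request Expiration' property of your
    # StandardHttpContextMap controller service for the real setting.
    REQUEST_EXPIRATION_SECS = 60

    # Pending requests, keyed by an identifier, remembering when they arrived
    # and how to answer them.
    pending = {}  # request_id -> (registered_at, respond_callback)

    def register(request_id, respond):
        # Called when a request is handed off to the context map.
        pending[request_id] = (time.monotonic(), respond)

    def sweep_expired():
        # Anything that has waited longer than the expiration gets a 503,
        # which is the SERVICE_UNAVAILABLE response seen in the log above.
        now = time.monotonic()
        for request_id, (registered_at, respond) in list(pending.items()):
            if now - registered_at > REQUEST_EXPIRATION_SECS:
                respond(HTTPStatus.SERVICE_UNAVAILABLE)
                del pending[request_id]

The point is that the 503 is not an error from your flow logic; it is the context map giving up on a request that waited too long for a response.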

My guess is that NiFi is taking too long to process all the requests you are submitting, causing the provenance repository to force a rollover, which is a "stop the world" event. So NiFi stopped processing any data for about 6 seconds (the gap between the 15:12:14 warning and the 15:12:20 "caught up" message in your log), which caused the waiting requests to expire.

Assuming you don't want to just accept random 6-second "stop the world" events, and without knowing anything about your flow or configuration, you essentially need to either scale up or adjust your flow. A few options (see the config sketch after this list):

  • Scale to bigger nodes or more nodes
  • Process larger FlowFiles instead of many small FlowFiles (this helps a lot to speed up provenance)
  • Put the provenance repository on its own disk, or spread it across multiple disks
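For the last option, the provenance repository location is set in nifi.properties. The snippet below is a sketch only: the paths are hypothetical, and while the nifi.provenance.repository.directory.* property names come from the NiFi Admin Guide, you should verify them against the guide for your specific version before changing anything.

    # Move the provenance repository onto its own volume, away from the
    # flowfile and content repositories:
    nifi.provenance.repository.directory.default=/disk1/provenance_repository

    # If your version supports multiple provenance directories (the Admin Guide
    # documents extra nifi.provenance.repository.directory.* properties with
    # unique suffixes), you can stripe it across several disks:
    nifi.provenance.repository.directory.disk2=/disk2/provenance_repository
    nifi.provenance.repository.directory.disk3=/disk3/provenance_repository

Giving provenance its own spindle(s) keeps its journaling and rollover I/O from competing with the rest of the flow, which is what triggers the "exceeding the provenance recording rate" back-pressure in your log.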

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.http.StandardHttpContextMap/index.html

Upvotes: 4
