irrelevantUser

Reputation: 1322

Apache NiFi: Recovering from a FlowFile repository issue

I am trying to recover my flows from the exception below.

failed to process session due to Cannot update journal file /data/disk1/nifi/flowfile_repository/journals/90620570.journal because no header has been written yet.; Processor Administratively Yielded for 1 sec: java.lang.IllegalStateException: Cannot update journal file /data/disk1/nifi/flowfile_repository/journals/90620570.journal because no header has been written yet.

I have seen some answers on best practices for handling large files in NiFi, but my question is about how to recover from this exception. My observation is that once the exception is seen, it begins to appear in several processors across all the flows in our NiFi instance. How do we recover without a restart?

Upvotes: 8

Views: 7546

Answers (2)

Sarah Messer

Reputation: 4083

We were running into a similar combination of problems (the same error message as the OP, while processing a series of large input files). We had to restart frequently, and occasionally we would also get messages like: { "cause0":"java.lang.OutOfMemoryError: Java heap space", "message":"java.lang.OutOfMemoryError: Java heap space", "url":"/nifi-api/flow/process-groups/0186103d-1e3a-1c62-2e53-1f3b8e52b416", "status":"500" }

It turned out we were using half the recommended number of CPUs (4 instead of 8), so we doubled them. The JVM memory was also low, so we followed the advice on this site to increase the memory allocated to the JVM:

# JVM memory settings
java.arg.2=-Xms2048m
java.arg.3=-Xmx2048m
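
To confirm which heap values are actually in effect before and after editing, something like the following may help. It is only a sketch: it assumes the settings live in conf/bootstrap.conf under the NiFi install directory, and the NIFI_HOME default below is a guess based on the path in the question's error message. The helper name heap_settings is made up for illustration.

```shell
# Assumption: NiFi is installed under NIFI_HOME and keeps its JVM
# arguments in conf/bootstrap.conf. Adjust the path for your install.
NIFI_HOME="${NIFI_HOME:-/data/disk1/nifi}"

# Print only the -Xms / -Xmx lines from a bootstrap.conf-style file.
# (heap_settings is a hypothetical helper, not part of NiFi.)
heap_settings() {
    grep -E '^java\.arg\.[0-9]+=-Xm[sx]' "$1"
}

conf="$NIFI_HOME/conf/bootstrap.conf"
if [ -f "$conf" ]; then
    heap_settings "$conf" || echo "no -Xms/-Xmx lines found in $conf"
fi
```

NiFi reads these values at startup, so a restart is needed after changing them.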

The JVM changes seem to have been more effective than the CPU increase, but the system may have needed both. In any case, after those changes the mean time between failures jumped from under about 1 hour to 1-2 days, and the system can now reset itself rather than requiring manual intervention, so the real mean time to failure is probably a lot longer. (We didn't add anything specific to automate resets; NiFi just figures it out on its own.)

Upvotes: 0

mythic

Reputation: 635

It seems like your disk is full, which is preventing the processors from updating or modifying the data.

You can either increase your disk space or delete the contents of your NiFi repositories.
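
Before deleting anything, it may be worth confirming that the disk really is full and seeing which directory is responsible. A rough sketch, assuming the repositories and logs sit under one directory (the NIFI_HOME default is a guess from the path in the error message; the function names are made up for illustration):

```shell
# Assumption: logs and repositories live under NIFI_HOME.
NIFI_HOME="${NIFI_HOME:-/data/disk1/nifi}"

# Print the used-space percentage of the filesystem holding the given path.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Print the size of the logs directory and each repository that exists.
repo_sizes() {
    for d in logs content_repository provenance_repository \
             flowfile_repository database_repository; do
        [ -d "$1/$d" ] && du -sh "$1/$d"
    done
    return 0
}

if [ -d "$NIFI_HOME" ]; then
    echo "Disk used: $(disk_used_pct "$NIFI_HOME")%"
    repo_sizes "$NIFI_HOME"
fi
```

If the percentage is nowhere near 100, the disk is probably not the cause and deleting repositories is unlikely to help.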

First, check the logs folder. If it's the logs folder that's taking up the space, you can simply run

rm -rf logs/*

Otherwise, delete the contents of the logs folder and all the repositories:

rm -rf logs/* content_repository/* provenance_repository/* flowfile_repository/* database_repository/*

PS: Deleting the repository contents will also delete all your data on the canvas, so make sure you're not deleting data that can't be reproduced.

Most likely, it is the logs that are eating up the space. Also, check your log rotation interval!
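
If the logs are the culprit, a gentler alternative to wiping the whole folder is to look only at old files first. A minimal sketch, assuming the logs are in a directory called logs relative to where you run it and that 7 days is a reasonable cutoff (both are assumptions; old_logs is a made-up helper name):

```shell
# Assumption: log files live in LOG_DIR; 7 days is an arbitrary cutoff.
LOG_DIR="${LOG_DIR:-logs}"

# List regular files under a directory that were last modified more than
# N days ago (N defaults to 7). Nothing is deleted.
old_logs() {
    find "$1" -type f -mtime +"${2:-7}" -print
}

if [ -d "$LOG_DIR" ]; then
    old_logs "$LOG_DIR" 7
fi
```

Once the list looks right, the same find expression with -delete in place of -print removes those files.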

Let me know if you need further assistance!

Upvotes: 6
