Reputation: 813
The ListFile processor is not detecting changes to a previously processed file, so it does not reprocess it. FYI, I have already tried the following options for reprocessing, and only the last-mentioned hack works. This is on a single-node NiFi that I am running in my development environment.
Running touch -c on the file changes its timestamp, but this does not trigger the ListFile processor either.
Switching the file-name pattern from .*test.*\.csv to test.*\.csv and vice versa later (i.e., going back and forth like this for repeated reprocessing).
Reprocessing of files with the same old names but modified data is a requirement for us. Please help!
And sometimes forced reprocessing of even an unmodified file could be required in case of unanticipated data issues upstream/downstream. Please help!
Still facing this sporadic behavior! Only a restart of NiFi helps when the ListFile processor fails to respond to a file change.
Upvotes: 2
Views: 4330
Reputation: 869
This is probably a late answer. The older List processors, such as ListFile, ListFTP, and ListSFTP, used only a timestamp-tracking strategy to identify changed files. The processor cached the last-seen timestamp in its processor state and used it to list only files with a greater timestamp. However, this approach was very buggy, so a much better strategy called Entity Tracking was introduced. This approach monitors a broader range of file changes. It keeps track of a few key parameters of each file in the specified directory.
Any change to a file is reflected in these key parameters. Since they are cached, any difference is treated as a change, so changed files appear in the success connection.
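To make the difference between the two strategies concrete, here is a minimal Python sketch. It is not NiFi's actual implementation. The names `FileInfo`, `list_by_timestamp`, and `list_by_entity` are hypothetical, and it assumes the tracked parameters are the file's last-modified timestamp and size, since the exact parameter list is not shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical snapshot of one file in the listed directory."""
    name: str
    mtime: int  # last-modified timestamp (assumed tracked parameter)
    size: int   # size in bytes (assumed tracked parameter)


def list_by_timestamp(files, state):
    """Timestamp tracking: emit only files newer than the last seen timestamp."""
    last_seen = state.get("last_seen", -1)
    changed = [f for f in files if f.mtime > last_seen]
    if changed:
        # Remember the newest timestamp so older or equal ones are skipped next time.
        state["last_seen"] = max(f.mtime for f in changed)
    return changed


def list_by_entity(files, cache):
    """Entity tracking (simplified): emit files whose cached parameters differ."""
    changed = []
    for f in files:
        # A missing cache entry or any difference in (mtime, size) counts as a change.
        if cache.get(f.name) != (f.mtime, f.size):
            changed.append(f)
            cache[f.name] = (f.mtime, f.size)
    return changed


state, cache = {}, {}
first = [FileInfo("test.csv", mtime=100, size=10)]
list_by_timestamp(first, state)
list_by_entity(first, cache)

# Same name and timestamp, but the content (and so the size) has changed.
second = [FileInfo("test.csv", mtime=100, size=25)]
missed = list_by_timestamp(second, state)  # nothing is listed
caught = list_by_entity(second, cache)     # the file is listed again
```

In this sketch, a file rewritten without a newer timestamp slips past timestamp tracking, while the per-file cache still notices the size difference.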
Upvotes: 5