Reputation: 29
Can the Apache NiFi "ExecuteSQL" processor stream a large SELECT result set in chunks of, say, 'x' MB?
Upvotes: 2
Views: 883
Reputation: 1483
You can also specify a LIMIT clause in the SQL itself (along with an ORDER BY on the ID): pull one batch, record the last ID, pull all rows where id > max(id), and repeat until done, e.g.
Start
  |
UpdateAttribute: maxid -----> ExecuteSQL: SELECT ... ${maxid:isEmpty():ifElse('', 'WHERE id > maxid')} ORDER BY id LIMIT n
  |___________________________________|
  |
process each batch
This chunks by number of records rather than by size, but if you know the approximate size per record you can still hit a target chunk size.
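The loop above can be sketched outside NiFi as well. Here is a minimal keyset-pagination example using `sqlite3`; the table name, column names, and batch size are all made up for illustration, and the `max_id` variable plays the role of the `maxid` attribute:

```python
# Hypothetical sketch of the keyset-pagination loop described above,
# using sqlite3 in place of ExecuteSQL. Table/column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO t (id, payload) VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 11)])

def fetch_batches(conn, batch_size):
    """Pull rows in batches; each query resumes after the last seen id."""
    max_id = None  # empty on the first pass, like the maxid attribute
    while True:
        where = "" if max_id is None else f"WHERE id > {max_id}"
        rows = conn.execute(
            f"SELECT id, payload FROM t {where} ORDER BY id LIMIT {batch_size}"
        ).fetchall()
        if not rows:
            break
        max_id = rows[-1][0]  # remember the last id for the next iteration
        yield rows

batches = list(fetch_batches(conn, 4))
```

With ten rows and a batch size of four this yields three batches (4, 4, 2 rows), mirroring the NiFi loop that feeds the last ID back into the next query.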
Upvotes: 0
Reputation: 2972
You can now use the QueryDatabaseTable processor, which supports chunking via its "Max Rows Per Flow File" property.
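A minimal sketch of the relevant QueryDatabaseTable configuration (property names per the processor; the table and column values are assumed examples):

```
Table Name:              my_table     # assumed example table
Maximum-value Columns:   id           # tracks the high-water mark between runs
Max Rows Per Flow File:  10000        # each FlowFile carries at most this many rows
```

With "Max Rows Per Flow File" set, one query's results are split across multiple FlowFiles instead of landing in a single large one.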
Upvotes: 0
Reputation: 191
The ExecuteSQL processor can "stream" large numbers of rows in the sense that it writes the data directly to FlowFile content (which is not held in memory/heap), so it is very memory efficient. It does not, however, chunk the results at this time. There is a ticket, https://issues.apache.org/jira/browse/NIFI-1251, to add that capability.
Upvotes: 6