Reputation: 115
I am running a Camel FTP route to download around 10,000 files from a directory on a remote Linux server to a directory on my local machine. I get a heap out-of-memory error once it has downloaded around 2,000 files. Other forum threads suggest using maxMessagesPerPoll, but if I set it to 1000, the route downloads only 1,000 files and stops. My code is pretty simple and is taken from the Camel FTP example:
from("sftp://xxxxx:22//tmp/serverfolder/?stepwise=false&include=ABC*.txt}}&username=XXXX&password=XXXXX&maximumReconnectAttempts=0&delay=5s&maxMessagesPerPoll=1000")
.to("file:/tmp/localfolder/");
Upvotes: 0
Views: 2008
Reputation: 3870
I believe you are on the right track with the advice from the forums, but it looks like part of their message was missed. Camel FTP is a polling consumer, which means it calls the endpoint repeatedly in a loop. This is not hard to configure, and it lets you pull a few files, wait a bit, and then pull more. Ideally this lets you keep up with files as they are placed in the directory, so you won't normally face a batch of 10,000 files to move; you can move a few hundred every couple of seconds.
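To make the "pull a few, wait, pull more" idea concrete, here is a minimal plain-Java simulation of a polling loop. It is not Camel code and makes no claim about how Camel implements its consumer; the class name `PollSimulation`, the helper `backlog`, and the fake file names are all made up for illustration. It only shows that a per-poll cap drains a large backlog across several polls instead of all at once.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Plain-Java simulation of a polling consumer. NOT Camel code;
// all names here are hypothetical and exist only for illustration.
class PollSimulation {

    // Drains the backlog in batches of at most maxMessagesPerPoll
    // and returns the number of polls that were needed.
    static int drain(Deque<String> backlog, int maxMessagesPerPoll) {
        if (maxMessagesPerPoll <= 0) {
            throw new IllegalArgumentException("maxMessagesPerPoll must be positive");
        }
        int polls = 0;
        while (!backlog.isEmpty()) {
            polls++;
            for (int i = 0; i < maxMessagesPerPoll && !backlog.isEmpty(); i++) {
                backlog.poll(); // stands in for "download one file"
            }
            // A real polling consumer would wait for its configured delay
            // here before polling again; the simulation skips the wait.
        }
        return polls;
    }

    // Builds a fake backlog of file names (assumed naming, for the demo only).
    static Deque<String> backlog(int count) {
        Deque<String> files = new ArrayDeque<>();
        for (int i = 0; i < count; i++) {
            files.add("ABC" + i + ".txt");
        }
        return files;
    }

    public static void main(String[] args) {
        // 10,000 files at 1,000 per poll
        System.out.println(drain(backlog(10_000), 1_000)); // prints 10
    }
}
```

The point of the sketch is that the cap bounds how much work a single poll takes on, while the loop is what eventually clears the whole directory, so the route has to keep running for later polls to happen.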
Documentation:
http://camel.apache.org/polling-consumer.html
http://camel.apache.org/ftp.html
IMPORTANT NOTE FROM FTP DOCS: See File2, as all the options there also apply to this component.
http://camel.apache.org/file2.html
Information from the File2 documentation:
Property | Default | Description
delay | 500 | Milliseconds before the next poll of the file/directory.
This simply means that you can pick up a few files every few seconds or minutes instead of trying to move all 10k at once. Also remember that with this option you would not normally see a 10k build-up unless you start your Camel route only once in a while.
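As a rough sketch of how delay and maxMessagesPerPoll combine, the plain-Java helper below assembles an endpoint URI from the option names used in the question and estimates the waiting time needed to work through a backlog. The class `SftpUriSketch`, both method names, and the values 200 and 2000 are assumptions chosen for illustration, not recommendations. The estimate counts only the pauses between polls and ignores transfer time, so treat it as a lower bound.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Camel: it only builds a URI string
// and does some arithmetic on the polling options.
class SftpUriSketch {

    // Joins the options into a query string in insertion order.
    static String buildUri(String base, Map<String, String> options) {
        StringJoiner query = new StringJoiner("&");
        options.forEach((key, value) -> query.add(key + "=" + value));
        return base + "?" + query;
    }

    // Lower bound on seconds spent waiting between polls
    // (transfer time is deliberately ignored).
    static long minSecondsToDrain(int files, int maxPerPoll, int delaySeconds) {
        int polls = (files + maxPerPoll - 1) / maxPerPoll; // ceiling division
        return (long) Math.max(0, polls - 1) * delaySeconds;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("stepwise", "false");
        options.put("delay", "2000");             // milliseconds, illustrative value
        options.put("maxMessagesPerPoll", "200"); // illustrative value
        System.out.println(buildUri("sftp://xxxxx:22//tmp/serverfolder/", options));
        System.out.println(minSecondsToDrain(10_000, 200, 2));
    }
}
```

With these assumed numbers, 10,000 files take 50 polls, so there are 49 pauses of 2 seconds between them.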
IMPORTANT NOTE 2: The FTP consumer does not support concurrency.
Keep this in mind: you won't be able to just add a bunch of threads to increase performance, so it's important to keep your route running constantly if you want to handle lots of files. Consuming files continuously is preferable to a once-a-day batch load.
Upvotes: 0