Reputation: 105
A colleague and I were discussing the architecture of an application we are building on WebLogic. The gist of the application is this: files are placed on a network drive, some processing is done, and files go out. The files fall into categories called transactions. The debate is whether it is best to have the files of all transactions arrive in one folder with a single inbound file adapter watching it, or to separate the folders by transaction and have one inbound file adapter per transaction.
The system can have a few hundred transactions, so with a 1:1 ratio there would be hundreds of pollers. It may also be possible to group them, but we'd probably still have 50+ directories.
Not all transactions have the same throughput requirements. Some need to be picked up in near real time; for others, it is enough to check the folder once a day. Some transactions could have tens of thousands of files per day.
From a high level, the first component obtains the filename from a directory, moves the file to the next folder and places a message on the queue alerting the next downstream component to work on the file.
Advantage for 1 directory:
Disadvantage:
Advantage for many directories:
Disadvantage: Many inbound adapter threads polling (though not always actively).
My question to the community: as far as Spring Integration is concerned, how bad is it to have potentially hundreds of inbound file adapters started in the app? What issues may arise? I assume that when an inbound file adapter is not listing its directory, it is essentially idle and consumes no resources?
We are using WebLogic as the app server, and my coworker also suggests using the Work Manager to manage thread resources in other parts of the system. Could that also be used to handle hundreds of inbound adapters?
Thanks!
Upvotes: 0
Views: 1122
Reputation: 174729
Pollers share a single task scheduler. The default pool has 10 threads, but that can be increased, so that's not really an issue. And yes, no resources are consumed between polls.
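To illustrate the idea of many pollers sharing one small pool, here is a minimal plain-Java sketch. It uses only `java.util.concurrent` and is not Spring Integration's own code; the class `SharedSchedulerSketch`, its methods, and the 100 ms delay are hypothetical names and values chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

class SharedSchedulerSketch {

    // Hypothetical helper: registers one periodic polling task per directory,
    // all of them running on the same fixed-size scheduler pool.
    static ScheduledExecutorService startPollers(List<Path> dirs, int poolSize,
                                                 long delayMillis, Set<Path> polled) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(poolSize);
        for (Path dir : dirs) {
            scheduler.scheduleWithFixedDelay(() -> {
                try (Stream<Path> files = Files.list(dir)) {
                    files.count();   // a real poller would hand the files downstream
                    polled.add(dir); // record that this directory was visited
                } catch (IOException | UncheckedIOException e) {
                    // Swallowed so the task stays scheduled; a real poller would log this.
                }
            }, 0, delayMillis, TimeUnit.MILLISECONDS);
        }
        return scheduler;
    }

    // Creates dirCount temporary directories, polls them with poolSize threads,
    // and returns how many distinct directories were polled within about 10 seconds.
    static int pollAll(int dirCount, int poolSize) {
        Set<Path> polled = ConcurrentHashMap.newKeySet();
        List<Path> dirs = new ArrayList<>();
        ScheduledExecutorService scheduler = null;
        try {
            for (int i = 0; i < dirCount; i++) {
                dirs.add(Files.createTempDirectory("txn-" + i + "-"));
            }
            scheduler = startPollers(dirs, poolSize, 100, polled);
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
            while (polled.size() < dirCount && System.nanoTime() < deadline) {
                Thread.sleep(20);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            if (scheduler != null) {
                scheduler.shutdownNow();
            }
            for (Path dir : dirs) {
                try {
                    Files.deleteIfExists(dir); // best-effort cleanup of the temp directories
                } catch (IOException ignored) {
                    // Leftover temp directories are harmless for this sketch.
                }
            }
        }
        return polled.size();
    }
}
```

Calling `SharedSchedulerSketch.pollAll(200, 10)` schedules 200 directory pollers on 10 threads. Between runs each task is just an entry in the scheduler's delay queue, which is why the number of pollers can far exceed the number of threads.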
> From a high level, the first component obtains the filename from a directory, moves the file to the next folder and places a message on the queue alerting the next downstream component to work on the file.
Since the poller does so little work (moving the file and sending a message to a queue), I don't think a single instance (perhaps with a warm standby) will be a limiting factor.
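A rough sketch of how little one polling pass has to do, in plain Java. This is only an illustration under assumptions: an in-memory `BlockingQueue` stands in for the real message queue, and `MoveAndNotifySketch`, `pollOnce`, and `runOnTempDirs` are hypothetical names, not part of Spring Integration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MoveAndNotifySketch {

    // One polling pass: move each regular file from inbound to work, then
    // enqueue its new location. Returns the number of files handed off.
    static int pollOnce(Path inbound, Path work, BlockingQueue<Path> queue) throws IOException {
        List<Path> found;
        try (Stream<Path> listing = Files.list(inbound)) {
            found = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int moved = 0;
        for (Path file : found) {
            Path target = work.resolve(file.getFileName());
            try {
                Files.move(file, target);
            } catch (NoSuchFileException e) {
                continue; // the file vanished, e.g. another instance picked it up first
            }
            // Enqueue only after the move, so a consumer never sees a message
            // for a file that is not yet in the work folder.
            queue.add(target);
            moved++;
        }
        return moved;
    }

    // Hypothetical demo: writes fileCount small files into a temp inbound folder,
    // runs one pass, and returns how many queued paths exist in the work folder.
    static int runOnTempDirs(int fileCount) {
        try {
            Path inbound = Files.createTempDirectory("inbound-");
            Path work = Files.createTempDirectory("work-");
            for (int i = 0; i < fileCount; i++) {
                Files.write(inbound.resolve("txn-" + i + ".dat"), new byte[] {1});
            }
            BlockingQueue<Path> queue = new LinkedBlockingQueue<>();
            pollOnce(inbound, work, queue);
            int ready = 0;
            for (Path queued : queue) {
                if (Files.isRegularFile(queued)) {
                    ready++;
                }
            }
            return ready;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The per-file cost is one rename and one enqueue, so the heavy processing stays with the downstream consumers rather than the poller.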
> my colleague wants to build a monolith...while I want to break out each component into its own deployable
I concur with your approach. Using middleware (JMS, RabbitMQ) to distribute the work gives you the most flexibility: you can increase the consumer threads in each instance and add more instances as needed.
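The scaling idea can be sketched in plain Java with several consumer threads draining one queue. This is an assumption-laden stand-in: a `LinkedBlockingQueue` plays the role of the JMS or RabbitMQ queue, incrementing a counter stands in for real file processing, and `CompetingConsumersSketch.consumeAll` is a hypothetical name.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CompetingConsumersSketch {

    // Starts consumerThreads workers that compete for messages on one queue
    // and returns how many messages were processed in total.
    static int consumeAll(List<String> messages, int consumerThreads) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(messages);
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumerThreads);
        for (int i = 0; i < consumerThreads; i++) {
            pool.execute(() -> {
                String message;
                // poll() removes a message atomically, so each one is handled
                // by exactly one consumer.
                while ((message = queue.poll()) != null) {
                    processed.incrementAndGet(); // stand-in for processing the file
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```

Raising `consumerThreads` corresponds to adding consumer threads within one instance; with real middleware, additional instances would simply attach more consumers to the same queue.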
Upvotes: 2