Reputation: 12110
As stated in my question title, I am currently facing the problem of how to load balance an application that serves multiple services.
The application is a storage service that stores users' files organized in buckets. The files themselves are not actually stored on the application server, but on network storage. The application servers are used to encrypt/decrypt the data and to offer several services that enable users to access their data. These services currently include FTP, SFTP, HTTP as well as JNDI/RMI for internal use, and might be extended in the future with other proprietary or custom protocols.
A file bucket must not be accessed by two servers at the same time, so I would like to route ANY invocation of ANY service for a given bucket to the same cluster node as long as that node is still running. If it is not, another server will open the connection to the bucket and serve it to the users.
How do you cluster such an application? I looked at both the Tomcat and JBoss AS clustering guides and read some articles on Java EE clustering, but nothing has given me an idea of how to achieve my goal yet. I think one of my main problems is the load balancing, and I'll probably not be able to use any standard solution here.
Upvotes: 0
Views: 364
Reputation: 37214
I'd be tempted to create a hash from the file name, and use that to ensure load is relatively well balanced to begin with.
For a simplified example (in C):
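unsigned int hash = 0;           /* unsigned, so the shift cannot overflow */
size_t len = strlen(file_name);  /* needs <string.h>; computed once, not per iteration */
for (size_t i = 0; i < len; i++) {
    hash ^= (hash << 5) ^ (unsigned char)file_name[i];
}
server_number_for_this_file = hash % total_servers;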
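To tie this back to the "same node if it's still running, otherwise another server" requirement in the question, here is a minimal sketch that wraps the hash in a function and adds a fallback. The names `hash_name`, `server_for_name` and `live_server_for_name`, the `alive` array, and the "try the next server in order" rule are all my own assumptions for illustration. How you actually detect a dead node is up to your cluster.

```c
#include <stddef.h>
#include <string.h>

/* Same shift-and-XOR hash as in the fragment above (hypothetical helper name). */
static unsigned int hash_name(const char *name)
{
    unsigned int hash = 0;
    size_t len = strlen(name);
    for (size_t i = 0; i < len; i++) {
        hash ^= (hash << 5) ^ (unsigned char)name[i];
    }
    return hash;
}

/* Preferred server index in [0, total_servers); total_servers must be > 0. */
static unsigned int server_for_name(const char *name, unsigned int total_servers)
{
    return hash_name(name) % total_servers;
}

/*
 * Assumed failover rule: start at the preferred server and walk forward
 * (wrapping around) until a server marked alive is found.
 * alive[i] is nonzero if server i is running.
 * Returns the chosen index, or -1 if no server is alive.
 */
static int live_server_for_name(const char *name, const int *alive,
                                unsigned int total_servers)
{
    if (total_servers == 0) {
        return -1;
    }
    unsigned int first = server_for_name(name, total_servers);
    for (unsigned int step = 0; step < total_servers; step++) {
        unsigned int candidate = (first + step) % total_servers;
        if (alive[candidate]) {
            return (int)candidate;
        }
    }
    return -1;
}
```

Every front end (FTP, SFTP, HTTP, RMI) would call the same function with the same key, so they all agree on the node. In your case you would probably pass the bucket name rather than the file name, since the bucket is what must stay on one server. Note that changing `total_servers` remaps most keys with a plain modulo.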
For something like finding all files in a specific group, ask all servers and combine the replies. For example, the first server might return "hello" and "foo", and the second server might return "goodbye" and "bar", so you'd combine these partial lists to get a list of 4 files.
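Combining the replies can be as simple as concatenating them, because each file hashes to exactly one server and so the partial lists do not overlap. A rough sketch, where `struct partial_list` and `combine_replies` are made-up names and the network part (actually asking each server) is left out:

```c
#include <stddef.h>

/* One server's reply: the file names it is responsible for (hypothetical type). */
struct partial_list {
    const char **names;
    size_t count;
};

/*
 * Concatenates all replies into out, which has room for out_cap entries.
 * Returns the total number of names across all replies; if that exceeds
 * out_cap, only the first out_cap names were stored.
 */
static size_t combine_replies(const struct partial_list *replies,
                              size_t reply_count,
                              const char **out, size_t out_cap)
{
    size_t total = 0;
    for (size_t r = 0; r < reply_count; r++) {
        for (size_t i = 0; i < replies[r].count; i++) {
            if (total < out_cap) {
                out[total] = replies[r].names[i];
            }
            total++;
        }
    }
    return total;
}
```

With the two replies from the example, this gives a combined list of 4 names in the order the servers were asked.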
Note: I'd assume the application servers cache directory contents to avoid hassling the network storage all the time, so this improves cache efficiency too (as with 10 application servers, each application server only needs to cache 10% of the directory contents data instead of 100% of it).
Of course I'd also be tempted to do encryption/decryption on the client so that file data being transferred between the client and the application server (over an untrusted internet?) is encrypted, rather than having "plain text" over the untrusted network and only encrypting data that's transferred between the application server and the network storage (on a trusted network).
Upvotes: 1