ggkmath

Reputation: 4246

Approaches to speed up scientific computations on a server accessible by internet users

I'm interested in any conventional wisdom on how to approach the following problem. Note that I'm a hardware guy, so be careful with software industry knowledge/terminology/acronyms.

I'm providing an online application that includes very complex math computations, such as fast Fourier transforms (FFTs), involving nested for-loops and very large data arrays (1.6 GB each). Users on the internet will access this application, enter some custom parameters, and submit a job that calls these math computations. To keep each user's wait to a minimum, and to allow multiple independent sessions for multiple simultaneous users (each user having a separate thread), I'm wondering how I can speed up the math computations, which I anticipate will be the bottleneck.

I'm not so much looking for advice on how to structure the program (e.g. use integer data types instead of floating point whenever possible, use smaller arrays, etc.); rather, I'm interested in what can be done to speed things up further once the program is complete.

For example, how do I ensure that multiple CPU cores are automatically used based on demand? (Is this done by default, or do I need to manage the process somehow?)

Or, how do I do parallel processing (breaking a for-loop up among multiple cores and/or machines)?

Any practical advice is greatly appreciated. I'm sure I'm not the first to need this, so I'm hoping there are industry best practice approaches available that scale with demand.

Thanks in advance!

Upvotes: 2

Views: 131

Answers (2)

Alexandre C.

Reputation: 56976

FFT methods are highly parallelizable, especially in multiple dimensions.

Classical implementations are FFTW and Intel MKL.
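To give a feel for what those libraries handle for you, here is a minimal sketch of a multi-threaded complex FFT using FFTW 3 (the transform size and thread count are arbitrary example values, and FFTW must be built with thread support):

    /* Multi-threaded 1-D complex FFT with FFTW 3.
       Link with: -lfftw3_threads -lfftw3 -lpthread -lm */
    #include <fftw3.h>

    int main(void)
    {
        const int n = 1 << 20;       /* transform size (example value) */

        fftw_init_threads();         /* call once, before creating any plan */
        fftw_plan_with_nthreads(4);  /* plans made after this use 4 threads */

        fftw_complex *in  = fftw_malloc(sizeof(fftw_complex) * n);
        fftw_complex *out = fftw_malloc(sizeof(fftw_complex) * n);

        /* ... fill in[] with the input data ... */

        fftw_plan p = fftw_plan_dft_1d(n, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
        fftw_execute(p);             /* the transform runs across the threads */

        fftw_destroy_plan(p);
        fftw_free(in);
        fftw_free(out);
        fftw_cleanup_threads();
        return 0;
    }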

One approach (depending on the available hardware and configuration) is a pool of worker threads or processes.

At my job, we have had much success with a pool of PCs and as-simple-as-possible data packets, which get queued, computed (in multicore) by one PC, and sent back to the user.

Don't try to micro-optimize the math itself; instead, use one of the above libraries. Focus on designing the packets, queuing the computations (don't forget some kind of quota/priority scheme), and making sure computed data is reliably sent back to the thread that has to join the packets together.
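As a rough illustration of the queuing side (not our production code; the job structure and function names are made up), a minimal worker-pool sketch with POSIX threads might look like this:

    /* Minimal sketch of a worker-thread pool with a job queue (POSIX threads).
       Compile with: cc -pthread pool.c */
    #include <pthread.h>
    #include <stdlib.h>

    typedef struct job {
        void (*run)(void *arg);   /* the computation to perform */
        void *arg;                /* e.g. user parameters + data pointer */
        struct job *next;
    } job;

    static job *head = NULL, *tail = NULL;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    void submit(void (*run)(void *), void *arg)
    {
        job *j = malloc(sizeof *j);
        j->run = run; j->arg = arg; j->next = NULL;
        pthread_mutex_lock(&lock);
        if (tail) tail->next = j; else head = j;
        tail = j;
        pthread_cond_signal(&nonempty);   /* wake one idle worker */
        pthread_mutex_unlock(&lock);
    }

    static void *worker(void *unused)
    {
        (void)unused;
        for (;;) {
            pthread_mutex_lock(&lock);
            while (!head)
                pthread_cond_wait(&nonempty, &lock);
            job *j = head;
            head = j->next;
            if (!head) tail = NULL;
            pthread_mutex_unlock(&lock);
            j->run(j->arg);               /* heavy math runs outside the lock */
            free(j);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[4];
        for (int i = 0; i < 4; i++)       /* one worker per core, say */
            pthread_create(&tid[i], NULL, worker, NULL);
        /* ... accept user requests and call submit(...) for each job ... */
        pthread_exit(NULL);               /* keep workers alive after main exits */
    }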

Depending on the hardware (one enormous SMP machine versus a farm of PCs), the problems are different.

(If you have the choice, go for PC farms.)

Edit: You may want to consider OpenMP to automatically parallelize loops. As for PC farms, they offer flexibility advantages over big SMP machines: they scale well, they are not that expensive, and they can be bought/sold/reused efficiently. Linux is probably a good choice, but it depends on which environment you're comfortable with.
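For example, OpenMP spreads the iterations of a loop across cores with a single pragma (a minimal sketch with a made-up function; each iteration must be independent of the others for this to be safe):

    /* Parallelizing a hot loop with OpenMP.
       Compile with -fopenmp (GCC/Clang) or /openmp (MSVC);
       without the flag, the pragma is ignored and the loop runs serially. */
    #include <stddef.h>

    void scale(double *data, size_t n, double factor)
    {
        #pragma omp parallel for
        for (long i = 0; i < (long)n; i++)   /* iterations split across cores */
            data[i] *= factor;
    }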

Sadly, I must say there are (to my knowledge) no good libraries for reliably and efficiently distributing computational requests over PC farms. The problem is quite hard, since you must account for breakdowns, network communication, congestion, process distribution...

Upvotes: 4

Jaydee

Reputation: 4158

You don't state what your setup is (Java, PHP, .NET? Will you be hosting the system yourself, or will it be hosted somewhere?), so these are just some off-the-cuff thoughts:

As far as I know, most modern platforms you are likely to be using will spread jobs over the available processor cores.

Spreading the workload over a number of servers can be done relatively easily with load balancing: http://www.loadbalancing.org/

You could also look at "cloud computing", where your application would be hosted by somebody like Amazon and you pay for what you use (more or less):

http://aws.amazon.com/ec2/

Other providers are available.

I'm fairly sure that if you provide more details you'll get some more specific answers.

Upvotes: 1
