Karan
Karan

Reputation: 1676

MySQL vs Web Server for processing data

I was wondering if it's faster to process data in MySQL or a server language like PHP or Python. I'm sure native functions like ORDER will be faster in MySQL due to indexing, caching, etc, but actually calculating the rank (including ties returning multiple entries as having the same rank):

Sample SQL

SELECT TORCH_ID,
    distance AS thisscore,
    (SELECT COUNT(distinct(distance))+1 FROM torch_info WHERE distance > thisscore) AS rank
    FROM torch_info ORDER BY rank

Server

...as opposed to just doing a SELECT TORCH_ID FROM torch_info ORDER BY score DESC and then figure out rank in PHP on the web server.

Upvotes: 4

Views: 1442

Answers (6)

Sasha Chedygov
Sasha Chedygov

Reputation: 130837

Edit: Since posting this, my answer has changed completely, partly due to the experience I've gained since then and partly because relational database systems have gotten significantly better since 2009. Today, 9 times out of 10, I would recommend doing as much of your data crunching in-database as possible. There are three reasons for this:

  1. Databases are highly optimized for crunching data—that's their entire job! With few exceptions, replicating what the database is doing at the application level is going to be slower unless you invest a lot of engineering effort into implementing the same optimizations that the DB provides to you for free—especially with a relatively slow language like PHP, Python, or Ruby.

  2. As the size of your table grows, pulling it into the application layer and operating on it there becomes prohibitively expensive simply due to the sheer amount of data transferred. Many applications will never reach this scale, but if you do, it's best to reduce the transfer overhead and keep the data operations as close to the DB as possible.

  3. In my experience, you're far more likely to introduce consistency bugs in your application than in your RDBMS, since the DB can enforce consistency on your data at a low level but the application cannot. If you don't have that safety net built-in, so you have to be more careful to not make mistakes.


Original answer: MySQL will probably be faster with most non-complex calculations. However, 90% of the time database server is the bottleneck, so do you really want to add to that by bogging down your database with these calculations? I myself would rather put them on the web/application server to even out the load, but that's your decision.

Upvotes: 3

Frank Farmer
Frank Farmer

Reputation: 39356

In general, the answer to the "Should I process data in the database, or on the web server question" is, "It depends".

  1. It's easy to add another web server. It's harder to add another database server. If you can take load off the database, that can be good.
  2. If the output of your data processing is much smaller than the required input, you may be able to avoid a lot of data transfer overhead by doing the processing in the database. As a simple example, it'd be foolish to SELECT *, retrieve every row in the table, and iterate through them on the web server to pick the one where x = 3, when you can just SELECT * WHERE x = 3
  3. As you pointed out, the database is optimized for operation on its data, using indexes, etc.

Upvotes: 2

djangofan
djangofan

Reputation: 29669

If your test is running individual queries instead of posting transactions then I would recommend using a JDBC driver over the ODBC dsn because youll get 2-3 times faster performance. (im assuming your using an odbc dsn here in your tests)

Upvotes: 0

Markus
Markus

Reputation: 1557

A large part of your question is dependent on the primary keys and indexes you have set up. Assuming that torchID is indexed properly...

You will find that mySQL is faster than server side code.

Another consideration you might want to make is how often this SQL will be called. You may find it easier to create a rank column and update that as each track record comes in. This will result in a lot of minor hits to your database, versus a number of "heavier" hits to your database.

So let's say you have 10,000 records, 1000 users who hit this query once a day, and 100 users who put in a new track record each day. I'd rather have the DB doing 100 updates in which 10% of them hit every record (9,999) then have the ranking query get hit 1,000 times a day.

My two cents.

Upvotes: 0

tom
tom

Reputation:

Ranking is based on count, order. So if you can do those functions faster, then rank will obviously be faster.

Upvotes: 0

Pro777
Pro777

Reputation: 1704

The speed of the count is going to depend on which DB storage engine you are using and the size of the table. Though I suspect that nearly every count and rank done in mySQL would be faster than pulling that same data into PHP memory and doing the same operation.

Upvotes: 1

Related Questions