Database Text Column Retrieval Speed VS Referenced File on Disk

Question

What's faster for getting large chunks of text: pulling from a database, or just pulling a reference ID and then finding that ID's text file on disk? Both need to read off the disk...

I have a database with a LOT of text columns, most contain short strings, but one field in particular requires LONGTEXT formatting, and I regularly need to group these together and pull the text out of the database for additional processing to get stats, etc. about the text. Currently, this operation takes quite a bit of time, even with proper indexing and ORDER BY NULL being thrown in to stop sorting from happening. I've built a parallel processing architecture for PHP, so this task, if done from the file system, could be performed with several worker scripts running to speed things up even more.

O. Jones · Accepted Answer

Your answer: Files on a content server.

Why?

you can exploit your php parallel processing setup.
(When you use large text strings in your DBMS for this you put much of the workload on your MySQL server, your single scarcest resource. That's a formula for creating bottlenecks.)
You've probably already discovered that the php code to fetch result rows with LONGTEXT items is a pain in the neck. By comparison, reading files is easy. Easy is good. Easy usually means reliable.
Do you have to deliver these files to web browsers ever? If so, your web server (apache or nginx) can do that. The only thing your php code needs is a way to generate the correct hyperlink.
This is all proven to work at large scale by web properties that handle millions upon millions of images and/or other media files. Plus, WordPress does it this way, which means it's very widely deployed.

Database Text Column Retrieval Speed VS Referenced File on Disk

Answers (1)

Related Questions