Reputation: 177
I am trying to store 500,000+ small "files" in a database (averaging about 3 KB each, occasionally up to about 8-10 MB). The goal is to get them off the file system and to speed up searches and user operations.
Storing them in the database (MySQL) has been fine: it handles that number of files, and searching the metadata (string, datetime, datetime) is quick with the relevant indexes.
Unfortunately, but unsurprisingly, any attempt to search within the LONGBLOBs is really slow. Here is how the data within the LONGBLOBs looks:
Would classifying this data as either text or unknown, and placing the text in a separate LONGTEXT table, improve performance for operations like LIKE "%X%" (as opposed to searching the LONGBLOB)?
Are there any other techniques I can use to improve performance when searching through the BLOBs (in a very "grep"-like style)? The searches are typically for short sequences of data held within the BLOB, and few searches are likely to be repeated (though searches are somewhat predictable; some data is more interesting than other data).
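For reference, here is roughly the kind of query I mean, sketched against a made-up table files with the content in a LONGBLOB column called data (names are for illustration only):

    -- Content stored in a LONGBLOB column; a leading-wildcard search like this
    -- cannot use an ordinary index, so MySQL scans every row (and every blob).
    SELECT id, name
    FROM files
    WHERE data LIKE '%X%';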
Upvotes: 1
Views: 1421
Reputation: 34673
Well, you had better create a full-text index (which will be of a HUGE size on that amount of data) and use MATCH ... AGAINST queries in order to search efficiently. LIKE is painfully slow on huge amounts of text; this is well known and should be avoided.
http://dev.mysql.com/doc/refman/5.5/en//fulltext-search.html
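Roughly, assuming the content is copied into a LONGTEXT column (here a made-up files table with a body column), it would look like this. Note that in MySQL 5.5 a FULLTEXT index can only be built on a MyISAM table and on CHAR/VARCHAR/TEXT columns, so it cannot be put directly on a LONGBLOB:

    -- One-time setup: build the full-text index
    -- (this can take a long time and a lot of disk space on this much data).
    ALTER TABLE files ADD FULLTEXT INDEX ft_body (body);

    -- Search with MATCH ... AGAINST instead of LIKE '%X%'.
    SELECT id, name
    FROM files
    WHERE MATCH(body) AGAINST ('interesting sequence' IN NATURAL LANGUAGE MODE);

    -- Boolean mode adds operators such as +required terms and "exact phrases".
    SELECT id, name
    FROM files
    WHERE MATCH(body) AGAINST ('+"short sequence"' IN BOOLEAN MODE);

Also keep in mind that, by default, the MyISAM full-text parser ignores words shorter than four characters (ft_min_word_len), which matters if the sequences you search for are very short.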
You could also keep the files on the file system and build command-line tools, called from your server-side language, that do "grep"-style searching and return the list of file paths that match your "query", but I'm not sure whether this would be efficient.
Upvotes: 1