Lisa Miskovsky
Lisa Miskovsky

Reputation: 926

Giving an unique cache key identifier for different SQL queries. Alternatives to md5?

I have a function like this:

function query($query)
{
    /* If query result is in cache already, pull it */
    if $cache->has(md5($query))
       return $cache->get(md5($query));

    /* Do the actual query here. mysqli_query() etc, whatever u imagine. */

    /* Set the cache with 10 minute expiration */
    $cache->set(md5($query), $queryOutput, array(10)); 
}

So basically, if I query SELECT * FROM USERS, it is cached automatically for 10 minutes.

I don't know if md5 is safe to rely on. For once, it creates 32 character string, which sounds a bit overkill. Second, md5 is known to give same character string as output on certain inputs. Is there any alternatives to md5 to identify an unique cache key? There is very little chance, that two completely different SQL queries may get the same md5 output and break some pages of the website, but it is still a chance I should predict right now and code accordingly.

One more thing is, I feel like such use of functions is considered bad practise. A new user may be inserted to USERS table after my cache, but SELECT * FROM USERS will still get the same md5 output, hence, ignore the newly inserted user. They may register with same nickname few times in 10 minute duration.

Should I pass a second parameter to my sensitive queries, like query($query, 'IGNORECACHE')? It doesn't sound logical to me. There will be too much things to keep in mind. How do you guys handle such issues?

I will appreciate it if you can reply my first question about md5 alternative for this case, and a short explanation of good use of SQL caching on my second question would be greatly appreciated.

Thank you.

Upvotes: 2

Views: 2363

Answers (2)

Kaivosukeltaja
Kaivosukeltaja

Reputation: 15735

You can somewhat safely use md5, as the odds for colliding are very low. Another solution that could fit your needs is naming the cache key by your query type and appending parameters with appropriate separators:

$result = $cache->get('userdata_'.$userId);

When used carelessly, the cache can give you lots of unexpected problems when data in the database changes but the old data remains in cache. Make sure you have a way to invalidate caches and do so for all related caches when you add/modify/delete data.

Upvotes: 2

Sjoerd
Sjoerd

Reputation: 75649

If you are worried that you get a hash collision (i.e. two queries with the same md5), simply use the query itself as a key:

if $cache->has($query)
    return $cache->get($query);
$cache->set($query, $queryOutput, array(10)); 

Alternatively, you can use sha1. It returns a longer string so the chance for collisions is lower.

Don't worry about storing 32 or 40 bytes as a cache key, this won't noticably influence the performance of your web application.

MySQL also has its own query cache. If you do the same query again, MySQL will get it from its cache. If you insert a user into the users table, MySQL will recognize that it can no longer use the cache, but this is not the case with your cache class.

Upvotes: 4

Related Questions