RDBMS caching vs disk I/O -- comparison across vendors

Question

I know little about how leading RDBMSs go about retrieving data, so these questions may seem a bit rudimentary:

Does each SELECT in commonly used RDBMSs such as Oracle, SQL Server, MySQL, PostgreSQL, etc. always mean a trip to the disk to read the data, or do they, to the extent the hardware allows, cache commonly requested data to avoid the expensive I/O operation?
How do they determine which data segments to cache?
How do they keep the cache synchronized when a different process updates some of the cached data?
Is there a comparison matrix on how different RDBMSs cache frequently requested data?

Thanks

Jonathan Leffler · Accepted Answer

The answers for Informix are pretty similar to those given for SQL Server:

Reads and writes both use the cache if at all possible. If the page needed is not already in cache, an appropriate collection of I/O operations occurs (typically, evicting some page from cache, perhaps a dirty page that must be written before a new page can be read in, and then reading the new page where the old one was).
There are various algorithms, but page size and usage are the key factors. There are LRU (least recently used) queues for each page size.
The DBMS as a whole is an ensemble of processes that share a buffer pool in shared memory (and, where possible, use direct disk I/O instead of going through the kernel cache). It uses various forms of locking (semaphores, spin-locks, mutexes, etc.) to handle concurrency and synchronization. (On Windows, Informix uses a single process with multiple threads; on Unix, it uses multiple processes.)
Probably not.
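
To make the first two points concrete, here is a minimal sketch of a page cache with LRU eviction and write-back of dirty pages. It is a toy model, not how Informix or any other DBMS is actually implemented: the `BufferPool` class, its method names, and the dict standing in for the disk are all hypothetical, there is a single LRU queue rather than one per page size, and the locking described in the third point is left out entirely.

```python
from collections import OrderedDict


class BufferPool:
    """Toy page cache: LRU eviction, dirty pages written back on eviction.

    Hypothetical illustration only; 'disk' is a plain dict standing in
    for real storage, and no locking is done.
    """

    def __init__(self, capacity, disk):
        self.capacity = capacity      # maximum number of cached pages
        self.disk = disk              # page_id -> page contents
        self.pages = OrderedDict()    # page_id -> [data, dirty], least recent first
        self.disk_reads = 0
        self.disk_writes = 0

    def _load(self, page_id):
        if page_id in self.pages:
            # Cache hit: no I/O, just mark the page as most recently used.
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        if len(self.pages) >= self.capacity:
            # Cache full: evict the least recently used page.
            victim_id, (data, dirty) = self.pages.popitem(last=False)
            if dirty:
                # A dirty page must be written before its slot is reused.
                self.disk[victim_id] = data
                self.disk_writes += 1
        # Cache miss: read the page from disk into the freed slot.
        self.disk_reads += 1
        frame = [self.disk[page_id], False]
        self.pages[page_id] = frame
        return frame

    def read(self, page_id):
        return self._load(page_id)[0]

    def write(self, page_id, data):
        # Writes go to the cached copy; disk is updated only on eviction.
        frame = self._load(page_id)
        frame[0] = data
        frame[1] = True


if __name__ == "__main__":
    disk = {1: "a", 2: "b", 3: "c"}
    pool = BufferPool(capacity=2, disk=disk)
    pool.read(1)         # miss: read from disk
    pool.read(1)         # hit: served from cache
    pool.write(2, "B")   # miss, then modified in cache only
    pool.read(3)         # evicts page 1 (clean, so no write)
    pool.read(1)         # evicts page 2 (dirty, so written back first)
    print(pool.disk_reads, pool.disk_writes, disk[2])
```

In this model a repeated SELECT on the same page costs no disk I/O after the first read, and an update reaches the disk only when the dirty page is evicted. A real DBMS also flushes dirty pages at checkpoints and protects the pool with latches or locks, which this sketch does not attempt.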

Answers (2)