amphibient
amphibient

Reputation: 31212

RDBMS caching vs disk I/O -- comparison across vendors

I know little about how leading RDBMSs go about retrieving data. So these questions may seem a bit rudimentary:

  1. Does each SELECT in commonly used RDBMSs such as Oracle, SQL Server, MySQL, PostgeSQL etc. always mean a trip to read the data from the disk or do they, to some extent allowable by the hardware, cache commonly requested data to avoid the expensive I/O operation?

  2. How do they determine which data segments to cache?

  3. How do they go about synchronizing the cache once an update of some of the cached data occurs by a different process?

  4. Is there a comparison matrix on how different RDBMSs cache frequently requested data?

Thanks

Upvotes: 1

Views: 176

Answers (2)

Jonathan Leffler
Jonathan Leffler

Reputation: 753465

The answers for Informix are pretty similar to those given for SQL Server:

  1. Reads and writes both use the cache if at all possible. If the page needed is not already in cache, an appropriate collection of I/O operations occurs (typically, evicting some page from cache, perhaps a dirty page that must be written before a new page can be read in, and then reading the new page where the old one was).
  2. There are various algorithms, but page size and usage are the key parts. There are LRU queues for each page size.
  3. The DBMS as a whole is an ensemble of processes that use a buffer pool in shared memory (and, where possible, direct disk I/O instead of going through the kernel cache), and uses various forms of locking (semaphores, spin-locks, mutexes, etc) to handle concurrency and synchronization. (On Windows, Informix uses a single process with multiple threads; on Unix, it uses multiple processes.)
  4. Probably not.

Upvotes: 1

usr
usr

Reputation: 171178

I'll answer for SQL Server:

  1. Reads are served from cache if possible. Else, an IO occurs.
  2. From what has been written and from what I observe, it is an LRU algorithm. I don't think this is documented anywhere. The LRU items are database pages of 8KB.
  3. SQL Server is the only process which has access to the database files. So no other process can cause modifications. Regarding concurrent transactions: Multiple transactions can modify the same page. Locking (mostly at row-level, sometimes page or table level) ensures that the transactions do not disturb each other.
  4. I don't know.

Upvotes: 2

Related Questions