Oracle Internals Notes

Buffered I/O

Most file system I/O is buffered by the operating system in its file system buffer cache. The idea of buffering is that if a process attempts to read data that is already in the cache, then that data can be returned immediately without waiting for a physical I/O operation. This is called a cache hit. The opposite case, in which the requested data is not in the cache, is called a cache miss. When a cache miss occurs, the data is read from disk and placed into the cache. Old data may have to be removed from the cache to make room for the new data. If so, buffers are reused according to a least recently used (LRU) algorithm in an attempt to maximize the number of cache hits.
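
The hit, miss and reuse behavior described above can be sketched with a toy cache. This is only an illustration, not the operating system's implementation: the BufferCache class, its capacity of two buffers, and the read_block callback standing in for a physical read are all assumptions made for the example.

```python
from collections import OrderedDict


class BufferCache:
    """Toy LRU buffer cache keyed by block number (illustrative only)."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block   # hypothetical stand-in for a physical read
        self.buffers = OrderedDict()   # block number -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.buffers:
            # Cache hit: no physical I/O, just mark the buffer as recently used.
            self.hits += 1
            self.buffers.move_to_end(block_no)
            return self.buffers[block_no]
        # Cache miss: read from "disk", reusing the least recently used buffer if full.
        self.misses += 1
        data = self.read_block(block_no)
        if len(self.buffers) >= self.capacity:
            self.buffers.popitem(last=False)
        self.buffers[block_no] = data
        return data


cache = BufferCache(capacity=2, read_block=lambda n: f"block-{n}")
for block_no in (1, 2, 1, 3, 2):
    cache.read(block_no)
print(cache.hits, cache.misses, list(cache.buffers))
```

In this access pattern only the second read of block 1 is a hit. Reading block 3 forces block 2 out because block 1 was touched more recently, so the final read of block 2 is a miss again.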

The file system buffer cache is also used for write operations. When a process writes data, the modified buffer goes into the cache. If the process has explicitly requested the synchronous completion of writes (synchronous writes) then the data is written to disk immediately and the process waits until the operation has completed. However, by default delayed writes are used. User processes do not wait for delayed writes to complete. The data is just copied into the buffer cache, and the operating system has a background task that periodically flushes delayed write buffers to disk. Delayed writes allow multiple changes to hot blocks to be combined into fewer physical writes, and they allow physical writes to be optimally ordered and grouped.
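
The way delayed writes combine and reorder changes can be sketched as follows. The DelayedWriteCache class and its flush method are hypothetical names invented for this example. A real operating system flushes from a background task on a timer, whereas here flush is called by hand.

```python
class DelayedWriteCache:
    """Toy model of delayed writes (illustrative only)."""

    def __init__(self):
        self.dirty = {}             # block number -> latest pending data
        self.logical_writes = 0     # writes issued by the process
        self.physical_writes = []   # (block number, data) in the order sent to disk

    def write(self, block_no, data):
        # The caller does not wait: the change is only recorded in the cache.
        # A later change to the same block replaces the pending one.
        self.logical_writes += 1
        self.dirty[block_no] = data

    def flush(self):
        # Stand-in for the background flush: one physical write per dirty
        # block, issued in block order.
        for block_no in sorted(self.dirty):
            self.physical_writes.append((block_no, self.dirty[block_no]))
        self.dirty.clear()


cache = DelayedWriteCache()
for block_no, data in [(7, "a"), (3, "b"), (7, "c"), (7, "d"), (5, "e")]:
    cache.write(block_no, data)
cache.flush()
print(cache.logical_writes, cache.physical_writes)
```

Five logical writes become three physical writes, because the three changes to hot block 7 are combined into one. Anything still in the dirty map when the system fails is lost, which is the risk discussed next.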

Delayed writes can be lost if a system failure occurs while some delayed writes are still pending. Some file systems support a write-behind mount option that minimizes the delay before the flushing of delayed write buffers begins. This reduces the risk of data loss, but it also reduces the benefit of delayed write caching, because there is less time for changes to be combined. A further effect is that it reduces the risk and severity of delayed write backlogs.

Because delayed writes involve a risk of data loss, Oracle never uses them. Oracle insists on the synchronous completion of writes for all buffered I/O to database files. This is done by using the O_DSYNC flag when opening database files. This means that the data itself must be written synchronously, but that delayed writes may be used for updates to the file access and modification times recorded by the file system.
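
As a rough illustration of what opening a file with O_DSYNC looks like from a user process, consider the sketch below. It is not Oracle's code. The helper name, the file name and the 8192-byte block size are assumptions made for the example, and the fallback to O_SYNC is there only because O_DSYNC is not defined on every platform.

```python
import os
import tempfile

# O_DSYNC is not available everywhere; fall back to O_SYNC if it is missing.
# If neither exists the flag is 0 and the write below is NOT synchronous.
SYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))


def write_block_synchronously(path, offset, data):
    """Write data at offset and return only once the data is on disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | SYNC_FLAG, 0o600)
    try:
        # With O_DSYNC set, pwrite does not return until the data has been
        # written; updates to the file times may still be delayed.
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)


path = os.path.join(tempfile.mkdtemp(), "datafile.dbf")   # hypothetical file name
written = write_block_synchronously(path, 0, b"x" * 8192)
print(written)
```

The process waits inside the write call, so every such write costs a physical I/O. That is the price paid for removing the risk of losing a delayed write.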

Do not confuse delayed writes with asynchronous writes. User processes do not wait for the completion of either type of write. However, a process is notified of, or can check for, the completion of its asynchronous writes, whereas there is no notification of the completion of delayed writes. It is just assumed that delayed writes will be completed. Thus delayed writes involve a risk of data loss, but asynchronous writes do not.
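
The difference can be sketched by emulating an asynchronous write with a worker thread. This is an assumption made to keep the example portable: real asynchronous I/O uses operating system facilities rather than threads, and the write_and_sync helper and file name are hypothetical. The point illustrated is only that the caller continues working and later learns whether the write completed.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_and_sync(path, data):
    """Write data and force it to disk before reporting completion."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written = os.write(fd, data)
        os.fsync(fd)   # completion is reported only after the data is on disk
        return written
    finally:
        os.close(fd)


path = os.path.join(tempfile.mkdtemp(), "async_demo.dat")   # hypothetical file name
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(write_and_sync, path, b"redo" * 128)
    # The caller does not wait here and is free to do other work.
    completed_bytes = future.result()   # completion, or an error, is reported back
print(completed_bytes)
```

A delayed write has no equivalent of the future: once the data has been copied into the buffer cache, the process has no way to learn whether it ever reached disk.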

Ixora Pty Ltd.   All rights reserved.
12-Oct-2007 22:22