Disk Configuration for
Sequential Scans

This tip is about disk configuration for single-threaded sequential scans only. To learn how to optimize parallel table scans, please see Disk Configuration for Parallel Table Scans. To optimize single-threaded sequential scans, you need to minimize disk head movement and maximize the transfer rate.

How do you minimize disk head movement?

Sequential reads occur mainly in connection with full table scans and fast full index scans. To obtain the benefit of sequential reads, namely only a few short seeks, there must not be any other concurrent access to the disks. If another process is using the same disks, then the disk heads will continually be moved away from the cylinder where the sequential reader is reading. This results in a significant increase in service times. Therefore, so far as possible, segments that are predominantly accessed by full scans should be separated so as to minimize the probability of concurrent access to the same disks.

It is also preferable to use raw datafiles or an extent based file system to avoid or minimize disk head movement due to file system allocation policies.

How do you maximize the transfer rate?

Striping can be used to improve the transfer rate for large reads. For example, consider performing multiblock reads of 256K from disks with a 64K average track length. If the data is all on one disk, then several full rotations of the disk will be needed to read the data. This is because only one disk head can transfer data at a time, despite that disks have multiple platters and disk heads. If, however, the data is striped evenly across four disks, then it can be retrieved in approximately one quarter of the time by the four disks working in parallel. This is illustrated in the following figure.

Striping improves the transfer rate for large I/Os

How many disks you can have working together in parallel to service a large read is probably limited more by your budget than by anything else. However, a law of diminishing returns also applies. This is particularly true if your disks do not have track buffers. In that case, there will be a rotational latency for every disk involved in servicing the read, and the probability of one disk requiring a very long rotational latency increases with the number of disks.

If your budget is severely limited, as is so often the case, you may have to choose between the following options.

Having no striping at all for sequential reads. This maintains the sequential nature of the I/O at the expense of improving the transfer rate.
Sharing striped disks with segments that will be accessed concurrently. This improves the transfer rate for multiblock reads, but sacrifices the sequential nature of the I/O.

One would expect that it would be better to have no striping at all for sequential reads, than to jeopardize the sequential nature of the I/O by sharing the disks with segments that are likely to be accessed concurrently. However, if your multiblock read size is larger than 128K, it is in fact better to opt for the improvement in the transfer rate, even if a stripe breadth of only two disks is possible.

To optimize the transfer rate from a set of striped disks, you must ensure that all of them are used, and used equally for each multiblock read. This is done by controlling the stripe element size. The stripe element size is the amount of logically consecutive data that is stored contiguously on a single disk. This is also sometimes called the stripe unit size, chunk size, or interlace size. The stripe breadth is the number of consecutive stripe elements that are stored on distinct disks. To ensure that all the disks in a stripe are used equally for multiblock reads, the stripe element size should be the multiblock read size divided by the stripe breadth, rounded up to a multiple of the database block size.

Because multiblock reads are not normally aligned to stripe elements, this stripe element size setting will result in two physical I/O requests being issued to one of the disks in the stripe. This is not of concern, because in almost all cases the two I/O requests will be serviced seamlessly.

Copyright Ixora Pty Ltd Send Email Home