From: Steve Adams
Date: 18-Apr-2001 19:20
Subject: SAME disk configuration

Some advocates of SAME include the log files, and some do not. My comment excluding the log files was in deference to those who do not.

Firstly, for those who are not familiar with it, the paper that you are talking about can be found on TechNet. Here is what it says about the log files ...

"It is generally more efficient and flexible to parallelise IO operations using parallel execution at the Oracle level than using small stripe widths at the storage level. However, online log file writes cannot be parallelised at the Oracle level. They must be parallelised at the storage system level. If an online log file is located on a single disk, then operations that make changes very rapidly such as parallel updates, parallel index creations, parallel loads, etc. may become bottlenecked on the log disk. Therefore the online log file should be spread across multiple disks using striping.

In general it is easiest and most efficient to stripe the logs across all the disks just like the data files. Sometimes people worry that placing the log files on the same disks as the data files will cause interference between data accesses and log writes. This is because the disk head may have to move to a new position when the log is written. As we discussed in the previous section, for relatively small seeks the rotational latency of the IO will dominate the seek time. So, if the log file is placed along with the other frequently accessed data on the outside half of the disk, interference will not be a significant problem. Striping across too few disks is a bigger problem in practice.

In a later section we will discuss the availability issues around striping the log files along with the data files.

The choice of stripe width for the log files is somewhat more tricky. Ideally we would like to stripe the log files using the same one megabyte stripe width as the rest of the files. However, the log files are written sequentially, and many storage systems limit the maximum size of a single write operation to one megabyte (or even less). If the maximum write size is limited, then using a one megabyte stripe width for the log files may not work well. In this case, a smaller stripe width such as 64K may work better.

Caching RAID controllers are an exception to this. If the storage subsystem can cache write operations in non-volatile RAM, then a one megabyte stripe width will work well for the log files. In this case, the write operation will be buffered in cache and the next log writes can be issued before the previous write is destaged to disk."

The suggestion that log file I/O can be parallelized with 64K or 1M striping is doubtful. Log file I/O operations are normally just a few K in size, and can only be parallelized by very fine-grained striping. Further, as the Ixora tip on Disk Configuration for Online Log Files explains, because rotational latency should dominate the service time, any striping actually degrades performance, although with 64K or 1M striping the negative effect will not be very noticeable. Anyway, the claim that there might be some benefit from such striping is plainly wrong.
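To illustrate the point about write size versus stripe width, here is a minimal sketch that counts how many disks a single write lands on under simple round-robin striping. The function name, the 4K write size, and the 8-disk set are assumptions made for illustration; they do not come from the paper or the Ixora tip.

```python
def disks_touched(offset: int, size: int, stripe_width: int, n_disks: int) -> int:
    """Count the distinct disks one write of `size` bytes at `offset` lands on,
    assuming simple round-robin striping (an illustrative model only)."""
    if size <= 0:
        return 0
    first_stripe = offset // stripe_width
    last_stripe = (offset + size - 1) // stripe_width
    # A write cannot be spread over more disks than exist in the stripe set.
    return min(last_stripe - first_stripe + 1, n_disks)


KB = 1024
WRITE_SIZE = 4 * KB  # assumed size of a typical "few K" log write
N_DISKS = 8          # assumed number of disks in the stripe set

for width in (1 * KB, 64 * KB, 1024 * KB):
    count = disks_touched(0, WRITE_SIZE, width, N_DISKS)
    print(f"stripe width {width // KB:>4}K: write touches {count} disk(s)")
```

Under these assumptions a stripe-aligned 4K write touches only one disk at 64K or 1M stripe widths, so there is nothing to parallelize; only a stripe width smaller than the write itself spreads it over several disks.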

Also, the suggestion that seek time due to I/O interference will not be a significant problem is crazy. Earlier, the paper gives specifications for a typical modern disk showing an average rotational latency of 3ms, and seek times between 1ms and 11ms depending on the distance seeked. So if the log files are striped together with other frequently accessed data on the outside half of the disks, there will be a service time degradation of between 50% and 100%, and up to 200% if no such distinction is made between frequently and infrequently accessed data.
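One way to read that arithmetic is to treat the 3ms average rotational latency as the baseline service time for a log write on a dedicated disk, and to express any added seek time as a percentage of it. The sketch below does only that. The seek values of 1.5ms, 3ms, and 6ms are assumptions back-solved from the percentages quoted above; they are not figures taken from the paper, and transfer time is ignored.

```python
ROTATIONAL_LATENCY_MS = 3.0  # average rotational latency quoted from the paper


def degradation_pct(extra_seek_ms: float,
                    base_ms: float = ROTATIONAL_LATENCY_MS) -> float:
    """Added seek time expressed as a percentage of the baseline service time."""
    return 100.0 * extra_seek_ms / base_ms


# Hypothetical seek costs, chosen to match the quoted percentages (assumptions).
scenarios = [
    ("short seek within the outside half", 1.5),
    ("longer seek within the outside half", 3.0),
    ("seek across the whole disk", 6.0),
]

for label, seek_ms in scenarios:
    print(f"{label}: +{degradation_pct(seek_ms):.0f}% service time")
```

The point of the model is simply that even seeks of a few milliseconds are the same order of magnitude as the rotational latency, so they cannot be dismissed as insignificant.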

Any way you look at it, striping the log files together with everything else makes no sense. Then again, SAME makes no sense for a data warehouse anyway, as I argued in the item in Ixora News. In particular, SAME makes no allowance for how you might add extra disk capacity in future without either introducing a hot spot or undertaking a major disk reorganization.

We are planning to implement the SAME technology for our data warehouse on a newly bought IBM ESS. I read your view on this new concept. I agree with your view that you can get much better performance if you carefully spread the load.

In the article you mention "stripe and mirror everything (except the online log files)". Is there a reason why you exclude the online log files? The paper suggests striping the log files along with the other datafiles as well. Can you please give some input on this so that I can convince my users not to go with SAME technology, or at least not to stripe the online logs with other data?