Storage management for large scale systems
Because disk storage has slow access times, storage management is crucial to the performance of many large scale computer systems. This thesis studies performance issues in buffer cache management and disk layout management, two important components of storage management.

The buffer cache stores popular disk pages in memory to speed up access to them. Buffer cache management algorithms used in real systems often have many parameters that require careful hand-tuning to get good performance. A self-tuning algorithm is proposed that automatically tunes the page cleaning activity of the buffer cache management algorithm by monitoring the I/O activities of the buffer cache. This algorithm achieves performance comparable to the best manually tuned system.

The global data structure used by the buffer cache management algorithm is protected by a lock. Contention for this lock can significantly reduce system throughput in multi-processor systems. Current solutions that eliminate lock contention decrease the hit ratio of the buffer cache, which causes poor performance when the system is I/O-bound. A new approach, called the multi-region cache, is proposed. This approach eliminates lock contention, maintains the hit ratio of the buffer cache, and incurs little overhead. Moreover, it can be applied to most buffer cache management algorithms.

Disk layout management arranges the layout of pages on disks to improve disk I/O efficiency. The typical disk layout approach, called Overwrite, is optimized for sequential I/Os from a single file. In large scale systems using Overwrite, interleaved writes from multiple users can significantly decrease system throughput. Although the Log-structured File System (LFS) is optimized for such workloads, its garbage collection overhead can be expensive. In modern and future disks, because disk transfer bandwidth improves much faster than disk positioning time, LFS performs much better than Overwrite in most workloads, unless the disk is close to full. A new disk layout approach, called HyLog, is proposed. HyLog achieves performance comparable to the best of existing disk layout approaches in most cases.
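
The effect of the disk technology trend on the two layouts can be illustrated with a back-of-envelope cost model. The sketch below is not taken from the thesis: the function names, the 8 KB page size, and the disk parameters are all hypothetical, and the log-structured estimate deliberately ignores garbage collection, which is the overhead that hurts LFS when the disk is close to full. It only shows why scattered page writes are dominated by positioning time, while one sequential log write is dominated by transfer time.

```python
def transfer_ms(page_kb: float, bandwidth_mb_s: float) -> float:
    """Time to transfer one page, in milliseconds."""
    return page_kb / 1024.0 / bandwidth_mb_s * 1000.0


def overwrite_cost_ms(pages: int, positioning_ms: float,
                      bandwidth_mb_s: float, page_kb: float = 8.0) -> float:
    """Assumed worst case for in-place updates: each page write is
    scattered, so every write pays one positioning delay."""
    return pages * (positioning_ms + transfer_ms(page_kb, bandwidth_mb_s))


def log_cost_ms(pages: int, positioning_ms: float,
                bandwidth_mb_s: float, page_kb: float = 8.0) -> float:
    """Assumed best case for log-structured writes: one positioning
    delay, then all pages written sequentially. Garbage collection
    cost is NOT modelled here."""
    return positioning_ms + pages * transfer_ms(page_kb, bandwidth_mb_s)


def log_advantage(pages: int, positioning_ms: float,
                  bandwidth_mb_s: float) -> float:
    """Ratio of in-place cost to log-structured cost (higher favours the log)."""
    return (overwrite_cost_ms(pages, positioning_ms, bandwidth_mb_s)
            / log_cost_ms(pages, positioning_ms, bandwidth_mb_s))


# Hypothetical disks: positioning time halves while bandwidth grows fivefold.
older_disk = log_advantage(pages=64, positioning_ms=10.0, bandwidth_mb_s=20.0)
newer_disk = log_advantage(pages=64, positioning_ms=5.0, bandwidth_mb_s=100.0)
print(f"older disk: {older_disk:.1f}x, newer disk: {newer_disk:.1f}x")
```

Under these assumed numbers the advantage of sequential log writes grows as bandwidth outpaces positioning time, which is consistent with the trend described in the abstract. A realistic comparison would also have to charge the log-structured layout for cleaning segments as disk utilization rises.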
Degree: Doctor of Philosophy (Ph.D.)
Supervisor: Bunt, Rick B.
Committee: Srinivasan, Raj; Neufeld, Eric; Lutfiyya, Hanan; Eager, Derek L.; Deters, Ralph
Copyright Date: December 2004