Exploiting Peak Device Throughput from Random Access Workload


In this work, we propose a new batching scheme called temporal merge, which dispatches discontiguous block requests in a single I/O operation. It overcomes the limitations of the narrow block interface and enables an OS to exploit the peak throughput of a storage device for small random requests as well as for a single large request. Temporal merge significantly improves device and channel utilization regardless of the access sequentiality of a workload, which traditional schemes have not been able to achieve.
We extended the block I/O interface of a DRAM-based SSD in cooperation with its vendor and implemented temporal merge in the I/O subsystem of Linux 2.6.32. The experimental results show that, under multi-threaded random access workloads, the proposed solution achieves 87% to 100% of the peak throughput of the SSD. We expect that the new temporal merge interface will lead to better designs of future host controller interfaces, such as NVMHCI, for next-generation storage devices.
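
As a rough illustration of the idea behind temporal merge, the following Python sketch contrasts it with conventional address-based merging. It is a toy model under stated assumptions, not the Linux implementation or the extended device interface: the names BlockRequest, spatial_merge, temporal_merge, and the max_segments limit are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockRequest:
    """A pending block request (hypothetical model, not a kernel structure)."""
    lba: int      # starting logical block address
    nblocks: int  # length in blocks


def spatial_merge(queue: List[BlockRequest]) -> List[List[BlockRequest]]:
    """Conventional merging: only address-adjacent requests share one I/O operation."""
    ops: List[List[BlockRequest]] = []
    for req in sorted(queue, key=lambda r: r.lba):
        last = ops[-1][-1] if ops else None
        if last is not None and last.lba + last.nblocks == req.lba:
            ops[-1].append(req)   # contiguous with the previous request
        else:
            ops.append([req])     # discontiguous: needs its own operation
    return ops


def temporal_merge(queue: List[BlockRequest],
                   max_segments: int = 32) -> List[List[BlockRequest]]:
    """Temporal merging: requests pending at the same time share one I/O
    operation regardless of address. max_segments is an assumed per-operation
    limit, not a value taken from the paper."""
    return [queue[i:i + max_segments] for i in range(0, len(queue), max_segments)]


# Four small random requests that are pending at the same time.
pending = [BlockRequest(100, 1), BlockRequest(7, 1),
           BlockRequest(5000, 1), BlockRequest(42, 1)]

spatial_ops = spatial_merge(pending)
temporal_ops = temporal_merge(pending)
print(len(spatial_ops), len(temporal_ops))
```

In this toy model, the four discontiguous requests need four separate operations under address-based merging but only one under temporal merging, which is the property that lets small random requests approach the throughput of a single large request.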