Application performance is rooted in speed – speed in completing the read and write requests that your applications demand from your infrastructure. Storage is responsible for how quickly I/O (input/output) requests are returned, and the method chosen to commit writes and deliver reads has a profound impact on application performance. A common approach in today’s industry is to use SSDs as a cache in front of traditional spinning-disk storage, in hybrid arrays, or in all-flash arrays. Most caching solutions accelerate reads for applications, but the real question remains, “Which write is right?”
Let’s look at why write optimization affects your application performance so drastically. Write I/O is new data that has not yet been committed to your underlying storage. In traditional SAN storage, for example, writes are committed directly to the underlying storage and then acknowledged back to the application. With applications that are constantly writing new data, primarily big database applications (SQL, etc.), traditional spinning disks can’t keep up. Caching on SSDs became a solution that allowed writes to land locally and be cached based on the frequency of application demand; however, the way the write cache interacts with the underlying storage differs between methods, and that difference has a huge impact on performance.
These are the 3 forms of I/O writing:
- Write-Around (around the cache)
- Write-Through (through the cache)
- Write-Back (from the cache)
All three forms have different benefits based primarily on the type of data being written: sequential vs. random. Sequential I/O (large files or video streams, for example) is handled well by the underlying disk, while random I/O benefits the most from the cache. Most caching appliances don’t have the dynamic intelligence to change the write method based on the type of data. Let’s look at the difference between the three forms of I/O writing.
Write-Around
Write-around, also known as read-only caching mode, is beneficial purely because it frees up the cache for reads. Incoming write I/O never hits the cache; writes go directly to permanent storage without caching any data.
What could possibly be the benefit of a cache that writes never use? It keeps the cache from being flooded with write I/O that will not subsequently be re-read, but it has the disadvantage that a read request for recently written data creates a “cache miss,” must be served from slower bulk storage, and incurs higher latency. If your application is transactional, as most mission-critical applications are, application speed will slow down and I/O queues will grow. Essentially, this mode is only valuable in rare use cases, because for most workloads it is simply too slow.
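To make the behavior concrete, here is a minimal Python sketch of a write-around cache. The class, the `cache` and `backing_store` dictionaries, and the method names are purely illustrative, not any vendor’s API.

```python
class WriteAroundCache:
    """Illustrative write-around (read-only) cache: writes bypass the cache entirely."""

    def __init__(self):
        self.cache = {}          # fast tier (SSD/RAM), populated only by reads
        self.backing_store = {}  # slow permanent storage (spinning disk)

    def write(self, key, value):
        # The write goes straight to permanent storage; the cache is never written.
        self.backing_store[key] = value
        # If the key was cached by an earlier read, drop it so later reads
        # don't return stale data.
        self.cache.pop(key, None)

    def read(self, key):
        if key in self.cache:
            return self.cache[key]            # cache hit: fast
        value = self.backing_store[key]       # cache miss: slow bulk storage
        self.cache[key] = value               # populate the cache for future reads
        return value
```

Notice that a write immediately followed by a read of the same key is guaranteed to miss the cache, which is exactly the latency penalty described above.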
Write-Through
This method is commonly used in caching and hybrid storage solutions today. Write-through is known as a read caching mode, meaning that all data is written to the cache and the underlying storage at the same time. The write is ONLY considered complete once it has been written to your storage. Sounds pretty safe actually…but there is a speed drawback.
Here’s the issue: every write operation is done twice, once in the cache and again in permanent storage. Before applications can proceed, the permanent storage must return the I/O commit back to the cache, and then back to the applications. This method is commonly implemented for failure resiliency and to avoid a failover or HA strategy for the cache, because data lives in both locations. However, Write-Through incurs latency because the I/O commit is gated by the speed of the permanent storage, which is no match for the speeds of CPU and networking. You’re only as fast as your slowest component, and Write-Through can critically hamstring application speed.
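Under the same toy model (again, all names are illustrative), a write-through policy might look like the sketch below: the acknowledgement is returned only after both the cache and the permanent storage have the data, so the slow tier gates every write.

```python
import time


class WriteThroughCache:
    """Illustrative write-through cache: every write lands in both tiers before it is acknowledged."""

    def __init__(self):
        self.cache = {}
        self.backing_store = {}

    def write(self, key, value):
        self.cache[key] = value           # fast write to the cache
        self._write_to_disk(key, value)   # slow write to permanent storage
        return "ack"                      # acknowledged only after BOTH writes complete

    def read(self, key):
        # Recently written data is always in the cache, so reads are fast.
        return self.cache.get(key, self.backing_store.get(key))

    def _write_to_disk(self, key, value):
        time.sleep(0.005)                 # stand-in for spinning-disk latency
        self.backing_store[key] = value
```

The “ack” cannot be returned until `_write_to_disk` finishes, so application write latency is bounded below by the speed of the permanent storage.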
Write-Back
Write-Back improves speed because the system doesn’t have to wait for writes to reach the underlying storage. When data comes in to be written, Write-Back puts the data into the cache, sends an “all done” acknowledgement back to the application, and holds the data to be written to the storage disk later. This solves a lot of the latency problem, because the system doesn’t have to wait for those deep writes.
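Here is a sketch of a write-back policy under the same toy model: the write is acknowledged as soon as it lands in the cache, and dirty entries are flushed to permanent storage later. The `dirty` set and `flush` method are illustrative stand-ins for a real destaging mechanism.

```python
class WriteBackCache:
    """Illustrative write-back cache: acknowledge on cache write, destage to disk later."""

    def __init__(self):
        self.cache = {}
        self.backing_store = {}
        self.dirty = set()        # keys written to the cache but not yet to permanent storage

    def write(self, key, value):
        self.cache[key] = value   # fast write to the cache
        self.dirty.add(key)       # remember that permanent storage is now behind
        return "ack"              # acknowledged immediately; no wait on the slow tier

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        return self.backing_store[key]

    def flush(self):
        # Called later (on a timer, or when the disk is idle) to destage dirty data.
        for key in list(self.dirty):
            self.backing_store[key] = self.cache[key]
            self.dirty.discard(key)
```

The acknowledgement no longer waits on the slow tier, but anything in `dirty` exists only in the cache until `flush()` runs, which is why the cache protection discussed next matters so much.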
With the right support, Write-Back can be the best method for multi-stage caching. It helps when the cache has large amounts of memory (i.e. memory measured in terabytes, not gigabytes) in order to handle large volumes of activity. Sophisticated systems will also need more than one solid state drive, which can add cost. It’s critically important to consider scenarios like power failure or other situations where critical data can be lost. But with the right “cache protection,” Write-Back can really speed up an architecture with few downsides. For example, Write-Back systems can use RAID or redundant designs to keep data safe.
Even more elaborate systems let the cache and the SAN or underlying storage disk work with each other on an “as-needed” basis, delegating writes to either the deep storage or the cache depending on the disk’s workload.
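As a rough sketch of how such delegation might work (the queue-depth threshold and all names here are hypothetical), the cache can destage immediately while the disk has headroom and defer the write when the disk is saturated:

```python
class AdaptiveWriteCache:
    """Illustrative hybrid policy: write through when the disk is idle, write back when it is busy."""

    DISK_QUEUE_LIMIT = 8   # hypothetical threshold; a real array would tune this from live telemetry

    def __init__(self):
        self.cache = {}
        self.backing_store = {}
        self.dirty = set()           # writes acknowledged from the cache but not yet destaged
        self.disk_queue_depth = 0    # stand-in for the backing disk's current workload

    def write(self, key, value):
        self.cache[key] = value
        if self.disk_queue_depth < self.DISK_QUEUE_LIMIT:
            self.backing_store[key] = value   # disk has headroom: destage immediately (write-through)
        else:
            self.dirty.add(key)               # disk is saturated: defer the destage (write-back)
        return "ack"

    def flush(self):
        # Destage deferred writes once the disk frees up.
        for key in list(self.dirty):
            self.backing_store[key] = self.cache[key]
            self.dirty.discard(key)
```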
The design philosophy of Write-Back reflects the problem-solving that today’s advanced data handling systems bring to big tasks. By creating a more complex architecture and using the cache in a more sophisticated way, Write-Back eliminates most latency problems, and although it may require more overhead, it allows for better system growth and fewer growing pains.