Original Article can be found here: http://www.smallnetbuilder.com
Breaking Down the Box
Since there are two separate problems, lack of bandwidth and the inability to use the available bandwidth, there are two different sets of optimizations included in any full-featured accelerator.
Compression: The value of compression is easy to understand. If a file is compressed by 2x, it needs half the bandwidth, or crosses the same link in half the time. Compression is so useful that many of the WAN acceleration vendors got their start as developers of compression algorithms. Because compression can reduce the amount of bandwidth needed to support a site, it’s often the most effective component for demonstrating ROI on the purchase of an accelerator.
Of course, the effectiveness of compression can vary widely depending on the traffic. A file consisting of nothing but the letter “A” repeated one billion times can be compressed down to just a few bytes, making for a very impressive demo.
But real data is more complicated, and so are the compression systems. Algorithms that work well on documents may be very different from ones that shrink already-compressed image files or streaming video, so most accelerators include multiple compression techniques and may adaptively adjust based on application.
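To see how much the traffic mix matters, here is a minimal Python sketch using the standard zlib module. It is only an illustration, not how any particular accelerator compresses data. The one-million-byte sample size and the compression_ratio helper are assumptions made for the example, and random bytes are used as a stand-in for already-compressed content.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data, 9))


# Highly repetitive data, like the "A" demo: compresses extremely well.
repetitive = b"A" * 1_000_000

# Random bytes stand in for already-compressed content (images, video, archives).
incompressible = os.urandom(1_000_000)

print(f"repetitive data ratio: {compression_ratio(repetitive):.0f}x")
print(f"random data ratio:     {compression_ratio(incompressible):.2f}x")
```

The repetitive sample shrinks by orders of magnitude, while the random sample does not shrink at all, which is why a single general-purpose algorithm is rarely enough.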
However, compression is computationally very intensive, creating a trade-off between performance and the cost of building faster boxes or adding compression chips.
Caching: Like compression, caching reduces the amount of data sent over the WAN. The basic idea is simple: maintain a copy of everything sent over the WAN, and if a file has previously been sent, simply reference a local copy instead of sending it again. Of course, implementation can vary widely.
First, there’s the question of how much data to store. More is better, but requires larger hard drives and faster CPUs to search for matches in a bigger library, raising costs. The secret sauce is in the algorithms that decide what to keep and how long to keep it.
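One simple, widely known answer to the question of what to keep is least-recently-used (LRU) eviction. The sketch below is an assumption chosen for illustration, since vendors do not publish their actual policies. The LRUCache class and its byte-based capacity are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Byte-budgeted cache that evicts the least recently used entries."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> bytes, oldest entry first

    def get(self, key: str) -> Optional[bytes]:
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        """Store a value, evicting old entries until the budget is met."""
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        if len(value) > self.capacity_bytes:
            return  # too large to cache at all
        self._items[key] = value
        self.used_bytes += len(value)
        while self.used_bytes > self.capacity_bytes:
            _, evicted = self._items.popitem(last=False)  # oldest first
            self.used_bytes -= len(evicted)


cache = LRUCache(capacity_bytes=10)
cache.put("a", b"12345")
cache.put("b", b"12345")
cache.get("a")            # "a" is now the most recently used entry
cache.put("c", b"12345")  # over budget, so "b" is evicted
```

A larger capacity_bytes raises the hit rate but, as noted above, also raises the cost of storage and lookup.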
The second question is what blocks of data to cache. Traditionally, caching was done by individual file. If the exact same document or image was sent previously, a cached copy could be used instead. But our company has a dozen versions of the corporate PowerPoint presentation on the file server for different situations that differ by a couple of slides, and I often find myself opening each before I find the one I need. Whole-file caching treats every one of those versions as a new file, even though most of the content is identical. Caching is now often done on smaller blocks of data, or by doing comparisons with cached files and transmitting a list of differences instead, but this requires considerably more powerful hardware.
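The block-level approach can be sketched as follows. This is a simplified illustration under stated assumptions, not any vendor's algorithm: it uses fixed 4 KB blocks and SHA-256 digests, and the encode, decode, and wire_bytes helpers and the two store dictionaries are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def encode(data: bytes, store: dict) -> list:
    """Split data into blocks; send a digest instead of any block seen before."""
    messages = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).digest()
        if digest in store:
            messages.append(("ref", digest))
        else:
            store[digest] = block
            messages.append(("raw", digest, block))
    return messages


def decode(messages: list, store: dict) -> bytes:
    """Rebuild the original data on the receiving side."""
    parts = []
    for message in messages:
        if message[0] == "raw":
            store[message[1]] = message[2]  # remember new blocks
        parts.append(store[message[1]])
    return b"".join(parts)


def wire_bytes(messages: list) -> int:
    """Approximate bytes sent: 32-byte digest per message, plus raw payloads."""
    return sum(32 if m[0] == "ref" else 32 + len(m[2]) for m in messages)


sender_store, receiver_store = {}, {}

# Two versions of a "presentation" that differ only in the last block.
version_1 = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))
version_2 = version_1[:-BLOCK_SIZE] + b"\xff" * BLOCK_SIZE

first = encode(version_1, sender_store)
rebuilt_1 = decode(first, receiver_store)

second = encode(version_2, sender_store)
rebuilt_2 = decode(second, receiver_store)

print(wire_bytes(first), "bytes for version 1,", wire_bytes(second), "for version 2")
```

Fixed-size blocks are a simplification. Inserting a slide near the start of a file shifts every later block boundary, which is one reason content-defined chunking is a common refinement, at the cost of more computation.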
The two graphs above and below show how much difference caching can make. For transfer of a 10 MB file over a 1 Mbps link, a Riverbed Steelhead system reduced the bandwidth used by 98%.
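As a rough back-of-the-envelope check of what a 98% reduction means, the following sketch computes idealized transfer times. It assumes 1 MB is 1,000,000 bytes and ignores protocol overhead, latency, and loss. The transfer_seconds helper is hypothetical, and the result is an idealization rather than a measured figure.

```python
def transfer_seconds(size_bytes: float, link_bps: float, reduction: float = 0.0) -> float:
    """Idealized time to send a file when `reduction` of its bytes never hit the wire."""
    bits_on_wire = size_bytes * 8 * (1.0 - reduction)
    return bits_on_wire / link_bps


FILE_BYTES = 10 * 1_000_000  # 10 MB file (decimal megabytes assumed)
LINK_BPS = 1_000_000         # 1 Mbps link

uncached = transfer_seconds(FILE_BYTES, LINK_BPS)
cached = transfer_seconds(FILE_BYTES, LINK_BPS, reduction=0.98)

print(f"without caching: {uncached:.1f} s on the wire")
print(f"with 98% reduction: {cached:.1f} s on the wire")
```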
QoS / Prioritization: Caching and compression can only go so far. At some point there just isn’t enough bandwidth to support all the applications. When this happens, the best solution is to prioritize traffic, sending time-sensitive packets such as VoIP before bulk data transfers and reserving minimum bandwidth levels for critical applications.
Again, this creates a trade-off between efficiency and complexity/cost. A system to reserve bandwidth to particular IP addresses or prioritize voice traffic over file transfers is easy to implement and simple to configure. A system that can automatically identify Skype connections or catch music downloads while prioritizing different applications into multi-tiered hierarchies can be better at guaranteeing performance of important applications, but it is obviously more expensive to build and requires more effort to configure.
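The simple end of that spectrum can be illustrated with a strict-priority queue. This is a minimal sketch, not a description of any product: the traffic classes, their priority values, and the PriorityScheduler class are all assumptions, and packets are assumed to arrive already classified.

```python
import heapq
import itertools

# Lower number = sent first. These classes and values are assumptions.
PRIORITY = {"voip": 0, "interactive": 1, "bulk": 2}


class PriorityScheduler:
    """Strict-priority packet queue: always send the most urgent packet next."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within a class

    def enqueue(self, traffic_class: str, packet: str) -> None:
        entry = (PRIORITY[traffic_class], next(self._counter), packet)
        heapq.heappush(self._heap, entry)

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


scheduler = PriorityScheduler()
scheduler.enqueue("bulk", "backup-1")
scheduler.enqueue("voip", "call-1")
scheduler.enqueue("bulk", "backup-2")
scheduler.enqueue("interactive", "crm-1")
scheduler.enqueue("voip", "call-2")

send_order = [scheduler.dequeue() for _ in range(len(scheduler))]
print(send_order)
```

Strict priority on its own can starve bulk traffic whenever higher classes keep the link busy, which is why reserving minimum bandwidth levels, as mentioned above, usually calls for a weighted scheduler instead.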
Inability to use available bandwidth: As we’ve seen, regardless of how much bandwidth is available, many protocols and applications can’t go at full speed when there is significant WAN latency or loss. Proxy techniques can overcome these limitations to allow applications to use all the bandwidth.
In the traditional WAN accelerator architecture, a device located between the server and the WAN pretends to be the end node and intercepts the connection. The accelerator then sends the data over the WAN in a way that’s less sensitive to latency and loss. Finally, a second accelerator on the opposite side pretends to be the original server to transmit the data to the client. The process is transparent to the network and can optimize the performance of almost any protocol or application.
TCP Acceleration: The sensitivity of TCP to latency and loss can be reduced in many ways, from simply increasing the amount of data to send at a time, to completely replacing TCP with a latency-optimized transport protocol.
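Why "the amount of data to send at a time" matters can be shown with the standard window-over-round-trip bound: a sender that must wait for acknowledgment can have at most one window of data in flight per round trip. The sketch below ignores loss and slow start. The 100 ms round-trip time and 100 Mbps link are illustrative assumptions, and the helper names are hypothetical. The 65,535-byte figure is the largest window TCP can advertise without the window scaling option.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: one full window delivered per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Window required to keep a link full (the bandwidth-delay product)."""
    return link_bps * rtt_seconds / 8


RTT = 0.100  # assumed 100 ms round trip across the WAN

classic = max_tcp_throughput_bps(65_535, RTT)
needed = window_needed_bytes(100_000_000, RTT)

print(f"64 KB window at 100 ms: at most {classic / 1e6:.2f} Mbps")
print(f"window to fill 100 Mbps at 100 ms: {needed / 1e6:.2f} MB")
```

Under these assumptions the unscaled window caps a single connection at a small fraction of the link, no matter how much bandwidth has been purchased.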
CIFS Proxy: The accelerator can act as a CIFS client to collect the file and folder information from the server quickly, send it efficiently over the WAN (after accounting for cached objects and compressing the remainder), then act as a CIFS server to deliver the file to the end client. This can make it much faster to browse folders and download or open files.
HTTP Acceleration: Web pages may contain a hundred or more separate images and objects, each of which has to be fetched in a separate transfer. The HTTP protocol is not particularly efficient, and combined with TCP handshaking, can require a number of round trips back and forth to establish a connection before even sending any data, then more round trips at the end to terminate the connection. While some number of objects can be downloaded in parallel, web pages served from far-away countries are often painfully sluggish to load. HTTP acceleration streamlines this process, not only improving how quickly Web pages load, but improving the responsiveness of the many corporate applications that use a browser-based user interface.
Other Application Proxies: Many other critical applications, from accounting packages to databases to virtual desktops, were designed to run locally, and hit serious performance issues when there is too much latency or loss between client and server. Most of these can be improved with a proxy module that bundles up the data locally, sends it efficiently over the WAN, and delivers it locally to the client. However, each application is unique, so availability of acceleration modules for particular applications can vary widely between products.
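A crude model shows why round trips, rather than bandwidth, dominate page load time over long distances. Every number here is an assumption for illustration: 100 objects, 6 parallel connections, 2 round trips per object (one to set up the connection, one for the request and response), and transfer time ignored entirely. The page_load_seconds helper is hypothetical.

```python
import math


def page_load_seconds(objects: int, parallel_connections: int,
                      round_trips_per_object: int, rtt_seconds: float) -> float:
    """Latency-only estimate: objects are fetched in batches, each costing round trips."""
    batches = math.ceil(objects / parallel_connections)
    return batches * round_trips_per_object * rtt_seconds


lan = page_load_seconds(100, 6, 2, 0.001)  # 1 ms round trip on a LAN
wan = page_load_seconds(100, 6, 2, 0.200)  # 200 ms round trip overseas

print(f"LAN: {lan:.3f} s spent waiting on round trips")
print(f"WAN: {wan:.1f} s spent waiting on round trips")
```

In this model the wait scales directly with latency, so cutting the number of round trips helps far more than adding bandwidth.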
Appliances, Clients, and Clouds
The traditional architecture for WAN acceleration is two appliances, a small one at the branch office and a large one at the central site. This is the best architecture for multi-national organizations, but not necessarily for small networks. Fortunately, acceleration products are becoming available in a wider variety of formats.
People working from home or on the road need a simple way to accelerate their connection back to the office. Many vendors now offer client software to install on individual PCs to connect to the central appliance. A small branch office can use the same software, installed on each PC, instead of an appliance. Alternatively, some vendors offer a software version that can be loaded onto a local server as a virtual machine to use as a low-cost appliance.
Lastly, a few vendors have begun offering hosted acceleration. Instead of placing an appliance at the customer sites, the devices are hosted at a POP near each branch office to accelerate the connection over the Internet cloud to a POP near headquarters.
While bandwidths continue to grow, application performance issues never go away. Latencies don’t change and applications chew up extra bandwidth as quickly as it can be provisioned, while users grow ever more demanding of responsive applications. Whether you run a big network or a small one, WAN acceleration is quickly becoming a critical component for taking full advantage of the network.