A user complains about it taking too long to load apps. Another says they like to read the newspaper while they wait for the next web page to appear. Someone else notes that database queries are taking too long.
- Solving I/O Blender Effect With Caching
- Updates And Drivers
- Fragmentation Prevention
- Bonus Tip: Archiving
Such issues are familiar to most data center managers. The traditional approach to boosting performance is to add more hardware: upgrading memory, deploying the newest generation of servers with the latest processors, or adding all-flash arrays. These are smart things to do, and they will almost always make a difference. But they may not stop angry calls from the user community.
“Simply having more processors, ports, sockets, drive bays, or PCIe slots does not mean more effective performance: as more hardware is added, overhead to communicate and maintain data consistency can also increase,” said Greg Schulz, an analyst with StorageIO Group of Stillwater, MN, author of Software-Defined Data Infrastructure Essentials by CRC Press. “That is where well-defined and well-implemented software along with performance-enhancing data services come into play.”
In this article, we take a look at several software approaches to boosting data center and server performance.
Modern CPUs must be supported by the right I/O architecture to ensure fast performance. But with many virtual machines (VMs) and associated applications consolidated on a single host server, I/O activity is aggregated and causes problems.
“This aggregation results in multiple I/Os converging on common resources, all mixed up together in what some call the I/O blender effect,” said Schulz.
That’s why you sometimes see SQL databases unable to keep up with spikes in the volume of inquiries and orders, even when servers have ample CPU, memory, networking, and storage resources. Some try to remedy this with yet more hardware, or by splitting their databases into multiple instances on separate machines. But that entails cost and IT time.
One way around this is server-side caching. It provides a place to store active data temporarily in order to reduce access delays, cut latency, and boost I/O throughput. There are many ways to do this, most of which involve adding hardware. But there are also software approaches such as Infinio Accelerator (Cambridge, MA), Condusiv V-locity (Burbank, CA), and Nutanix PernixData (San Jose, CA).
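To make the idea concrete, here is a minimal sketch of a server-side read cache with least-recently-used eviction. It is an illustration only and does not reflect how any of the products named above work internally; the `ReadCache` class, the `backing_read` callable, and the block-level granularity are all assumptions made for the example.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU read cache sitting in front of a slower backing store."""

    def __init__(self, backing_read, capacity):
        self.backing_read = backing_read   # hypothetical: callable(block_id) -> bytes
        self.capacity = capacity           # max number of blocks held in memory
        self.blocks = OrderedDict()        # block_id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # mark as most recently used
            self.hits += 1
            return self.blocks[block_id]
        self.misses += 1
        data = self.backing_read(block_id)      # slow path: go to storage
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict least recently used block
        return data

    def invalidate(self, block_id):
        """Drop a block after it is overwritten so later reads are not stale."""
        self.blocks.pop(block_id, None)


# Stand-in for slow storage: a dict lookup plays the role of a disk read.
storage = {n: f"block-{n}".encode() for n in range(10)}
cache = ReadCache(storage.__getitem__, capacity=3)
for block_id in [1, 2, 1, 1, 3, 1, 4, 2]:
    cache.read(block_id)
print("hits:", cache.hits, "misses:", cache.misses)
```

Every hit is a read that never reaches the storage array, which is where the reduction in back-end I/O comes from.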
Chuck Keith, director of infrastructure at Supreme Lending, had already spent heavily on hardware to solve his performance challenges when he looked to see if there was another way.
Supreme Lending is a mortgage banker and broker with over 100 branches. The data center at its headquarters in Dallas consisted of 25 Dell PowerEdge hosts running 250 VMs and some aging Dell Compellent disk storage arrays. The company’s loan origination system sits on top of a Microsoft SQL Server database. However, loan officers said some queries could take as long as 20 minutes, and moving from screen to screen within the app could take up to one minute.
Initially, the company bought two hybrid flash arrays (roughly 20% solid state drives and 80% hard disk drives) by Nimble Storage (San Jose, CA). Data center personnel moved the SQL database and loan origination system onto the new arrays with other systems remaining on the Compellent arrays. This made a big difference, but Keith was surprised to discover that it didn’t solve the problem.
Query times fell from 20 minutes to between five and 10 minutes. Moving from screen to screen dropped from a minute to less than 10 seconds.
A further hardware fix would entail upgrading to all-flash arrays from Nimble. Taking hard disk drives out of the equation completely would reduce latency considerably. But budget constraints only allowed the company to add another hybrid array as a means of retiring the Compellent units. The company then looked to software caching.
“We examined server-side caching players such as PernixData and Infinio, but opted for V-locity as it came out ahead on our proof of concept on performance and price,” said Keith. “It does more than server-side caching without requiring any additional hardware.”
He explained that V-locity dynamically caches hot reads in idle DRAM. In other words, it puts otherwise unused memory to work serving frequently requested reads, without causing memory contention or resource starvation. In addition, the software automatically offloads I/O from the underlying storage and streamlines the remaining I/O traffic. This is a direct attack on the I/O blender effect: it eliminates many of the small writes and reads and replaces them with large contiguous writes. This makes it possible for more payload to be carried by every I/O operation.
“I/O reduction has taken strain off storage on the back end,” said Keith. “V-locity brought our query times down from up to 10 minutes down to 30 seconds, application load times from 30 seconds to less than 10 seconds, and virtually eliminated any wait time moving between screens.”
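The "more payload per I/O" idea can be sketched as write coalescing: small writes that land next to each other are merged into a single larger request before they reach storage. This is an assumed, simplified model rather than V-locity's actual mechanism; the `coalesce_writes` function is hypothetical and assumes the pending writes do not overlap.

```python
def coalesce_writes(writes):
    """Merge byte-adjacent writes into larger contiguous requests.

    writes: iterable of (offset, data) pairs, assumed not to overlap.
    Returns a list of (offset, data) pairs sorted by offset.
    """
    merged = []
    for offset, data in sorted(writes, key=lambda w: w[0]):
        if merged and merged[-1][0] + len(merged[-1][1]) == offset:
            merged[-1][1].extend(data)                 # contiguous: grow the previous request
        else:
            merged.append([offset, bytearray(data)])   # gap: start a new request
    return [(offset, bytes(buf)) for offset, buf in merged]


# Four small writes arrive out of order; three of them are adjacent on disk.
pending = [(8, b"cc"), (0, b"aaaa"), (4, b"bbbb"), (20, b"dd")]
requests = coalesce_writes(pending)
print(len(pending), "writes issued as", len(requests), "requests")
```

Fewer, larger requests mean less per-operation overhead for the hypervisor and the array to process.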
Storage tiering has been with us for decades. These days, many data center managers opt to use flash as a top tier, with faster SAS disks, slower SATA disks, and perhaps tape making up lower tiers of storage. But micro-tiering is beginning to appear. It shifts the focus of tiering from space savings onto getting more work done, lowering latency, and upping performance.
Schulz explained that micro-tiering enables all or part of a volume, file, or object to be promoted or demoted between a slow and a fast tier. It moves data at a finer level of granularity and at much shorter intervals than regular tiering. The movement is driven by policies.
How is caching different from micro-tiering? Unlike cache, which can sometimes result in stale reads, a micro-tier is consistent. With cache, a copy of data is kept close to where it is being used to improve subsequent I/Os, such as in DRAM. Traditional tiering, on the other hand, migrates data from one tier to another. Micro-tiering, then, is a hybrid between cache and tiering.
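The distinction can be sketched in a few lines. In the toy model below, each extent lives in exactly one tier (so there is no second copy to go stale), and a periodic rebalance promotes the most frequently read extents and demotes the rest. The `MicroTier` class, its access-count policy, and the extent granularity are assumptions for illustration and are not taken from any vendor's implementation.

```python
from collections import Counter


class MicroTier:
    """Two-tier store in which data is moved between tiers, never copied."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity   # number of extents the fast tier can hold
        self.fast = {}                       # extent_id -> data
        self.slow = {}                       # extent_id -> data
        self.heat = Counter()                # reads per extent since the last rebalance

    def write(self, extent_id, data):
        # Update the single authoritative copy wherever it currently lives;
        # new extents start out in the slow tier.
        tier = self.fast if extent_id in self.fast else self.slow
        tier[extent_id] = data

    def read(self, extent_id):
        self.heat[extent_id] += 1
        if extent_id in self.fast:
            return self.fast[extent_id]
        return self.slow[extent_id]

    def rebalance(self):
        """Promote the hottest extents to the fast tier and demote the others."""
        hottest = {e for e, _ in self.heat.most_common(self.fast_capacity)}
        for extent_id in list(self.fast):
            if extent_id not in hottest:
                self.slow[extent_id] = self.fast.pop(extent_id)   # demote
        for extent_id in hottest:
            if extent_id in self.slow:
                self.fast[extent_id] = self.slow.pop(extent_id)   # promote
        self.heat.clear()                    # start a fresh measurement interval


tier = MicroTier(fast_capacity=2)
for extent_id in ("a", "b", "c"):
    tier.write(extent_id, f"data-{extent_id}")
for extent_id in ["a", "a", "a", "c", "c", "b"]:
    tier.read(extent_id)
tier.rebalance()
print("fast:", sorted(tier.fast), "slow:", sorted(tier.slow))
```

Running `rebalance` at short intervals and on small extents is what separates this from traditional tiering, which typically migrates whole volumes or files on a much slower schedule.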
“You can make your assets more productive via micro-tiering,” said Schulz.
Available options for micro-tiering include Enmotus (Aliso Viejo, CA) and Microsoft Storage Spaces Direct (S2D) (Redmond, WA) for Windows Server. Enmotus Storage Automation and Analytics software, for example, profiles volume usage. It identifies how much data is active (in real time and historically), and discovers what portion of a volume is active.
“Knowing this, users can plan their purchases to match requirements, eliminating guesswork,” said Andy Mills, CEO of Enmotus.
This software fix is a simple one: install the latest updates and ensure that the right settings are in play. Yet it is often neglected amid the hustle and bustle of data center life.
“Verify that your servers, hypervisors, or operating systems have their power settings set to high-performance mode,” said Schulz.
Windows servers, for example, may experience degraded performance because they are often found to be running the “Balanced” power plan, which happens to be the default setting. The “Balanced” plan conserves energy by scaling processor performance based on current CPU utilization. However, this can increase average response time for some tasks and slow down CPU-intensive applications. This setting can impact both physical and virtual environments.
The solution is to switch to the “High Performance” power plan. This is a smart thing to do on heavily utilized servers. However, Microsoft cautions that it may not be wise if a server has a period of high utilization during the day followed by long periods of low use. Read Jose Barreto’s blog at http://bit.ly/2vL5vHo on the Microsoft site for more information on Windows Server power settings and how to change them.
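A quick way to audit this setting is to ask Windows which plan is active. The sketch below wraps the built-in `powercfg /getactivescheme` command and compares the result against the well-known GUID of the High Performance plan. The helper names are hypothetical, and the sample output line in the comment is what English-language Windows typically prints; treat the exact format as an assumption and verify it on your own servers.

```python
import re
import subprocess

# Well-known GUIDs of the built-in Windows power plans.
BALANCED_GUID = "381b4222-f694-41f0-9685-ff5bb260df2e"
HIGH_PERFORMANCE_GUID = "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c"

# Assumed output format of `powercfg /getactivescheme`:
#   Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
SCHEME_LINE = re.compile(r"GUID:\s*([0-9a-fA-F-]{36})\s*\((.+)\)")


def parse_active_scheme(powercfg_output):
    """Extract (guid, plan_name) from powercfg output."""
    match = SCHEME_LINE.search(powercfg_output)
    if match is None:
        raise ValueError("unrecognized powercfg output")
    return match.group(1).lower(), match.group(2).strip()


def active_power_plan():
    """Run powercfg on a Windows host and return (guid, plan_name)."""
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    return parse_active_scheme(result.stdout)


def is_high_performance(guid):
    # Compare GUIDs rather than names, since names are localized.
    return guid.lower() == HIGH_PERFORMANCE_GUID
```

On a server where the check fails, `powercfg /setactive SCHEME_MIN` (run from an elevated prompt) switches to the High Performance plan; `SCHEME_MIN` is the built-in alias for that plan.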
Such actions should be supplemented by checking all servers to verify that you have the newest (and typically fastest) device drivers and the latest software updates installed. For example, the VMware SAS driver is a good general-purpose disk driver for guest VMs. However, the VMware Paravirtual (PVSCSI) disk driver can show an improvement in performance while using less CPU.
“Different device drivers can make a difference in performance for virtual as well as cloud environments,” said Schulz.
Himanshu Singh, group manager, product marketing, Cloud Platform Business Unit, VMware (Palo Alto, CA), added a couple of ways to increase virtual server performance without new or additional hardware. Those running older versions of vSphere (such as 5.5 or 6.0), for example, can generate a performance boost by upgrading to vSphere 6.5. Singh believes this delivers 6X performance improvement over vSphere 5.5 and 2X performance improvement over vSphere 6.0.
“In our testbed, we measured performance in operations per second, where operations include powering-on VMs, clones, vMotions, etc.,” said Singh. “In version 5.5, vCenter was capable of approximately 10 operations per second, that increased to 30 in 6.0, and vCenter is capable of more than 60 operations per second in 6.5.”
Additionally, VMware offers a service called vSphere Optimization Assessment (VOA). This free assessment is available to all current vSphere users. The resulting reports include recommendations on how to improve performance and increase capacity utilization.
Schulz noted a phenomenon known as split I/O, in which a program requests data in a size that is too large to fit into a single request. But the most common cause of split I/O is a fragmented disk.
“When looking at server I/O and application performance, split I/O can be an indicator of the need to optimize a storage system, device, file system, or database,” said Schulz. “Other signs include long response times and queries.”
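Both causes of split I/O can be seen in a simple model: a read has to be split whenever the file's blocks are not physically contiguous, or whenever a contiguous run exceeds the largest request the storage stack will accept. The function below is a hypothetical sketch; the extent representation and the `max_request_blocks` parameter are assumptions made for illustration.

```python
def count_io_requests(extents, max_request_blocks):
    """Return how many requests are needed to read a whole file.

    extents: list of (start_block, length_in_blocks) in file order.
    max_request_blocks: largest single request the storage stack accepts.
    """
    requests = 0
    run_length = 0       # length of the current physically contiguous run
    prev_end = None
    for start, length in extents:
        if prev_end is not None and start == prev_end:
            run_length += length                               # extends the current run
        else:
            requests += -(-run_length // max_request_blocks)   # flush previous run (ceiling division)
            run_length = length
        prev_end = start + length
    requests += -(-run_length // max_request_blocks)           # flush the final run
    return requests


contiguous = [(0, 64)]                                # one 64-block extent
fragmented = [(i * 100, 8) for i in range(8)]         # same 64 blocks in 8 scattered pieces
print(count_io_requests(contiguous, 64), count_io_requests(fragmented, 64))
```

The same amount of data costs eight times as many operations once it is scattered, which is why fragmentation shows up as longer response times.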
The time-honored way to address this is via defragmentation. Windows servers and virtual servers can suffer badly from having their files splintered into tiny pieces and spread all over the disk. As a result, it can take ages to open some files, and Windows becomes progressively slower the longer it is used. The more software and storage you add, the worse the machine runs. The traditional remedy has been to defragment the drive with tools such as Diskeeper by Condusiv, SQL Defrag Manager by Idera (Houston, TX), and SQL Index Manager by Redgate (Pasadena, CA).
Unfortunately, defragmentation becomes unwieldy and unworkable in many data centers. Many production servers operating within mission-critical storage environments can’t be taken offline at night or on the weekend to address fragmentation. So what to do? The latest version of Diskeeper has switched its approach from defragmentation to fragmentation prevention. Instead of picking up and consolidating the pieces once a volume has become fragmented, the latest methodology is to prevent fragmentation before data is written to the server. This works particularly well on physical and virtual MS-SQL servers. The software intervenes between the operating system, the hypervisor, and the storage to ensure contiguous writes as a means of improving I/O and throughput.
There is one exception to the prevention-only approach: a file that is already badly fragmented can still be defragmented without stopping the server. According to Iometer tests, this fragmentation prevention approach can boost MS-SQL workloads on physical servers by anywhere from 2X to 6X.
Archiving is mainly associated with regulatory or compliance use cases. In this sense, it forms a way to create and store an original and unalterable copy of data. It is also looked upon as a means of reducing the overall data footprint. These are all valid reasons to archive, but there is a performance side, too.
For example, once physical disk capacity reaches a certain level, performance can become sluggish. The hard drive thrashes around looking for data among material that is never or very rarely accessed. This introduces inefficiency and delays. It is best to move such data to a lower tier such as a slower disk, the cloud, or even tape rather than have it clog up production servers and storage arrays.
“If a server has relatively little storage capacity left, this greatly hampers performance,” said Schulz. “Archiving can be a simple technique to address old data issues that cause performance problems by offloading little-used data.”
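As a starting point for finding little-used data, a short script can list files that have not changed in a long time, largest first. This is a rough sketch under stated assumptions: the `find_archive_candidates` helper is hypothetical, and it uses modification time as a stand-in for "rarely accessed" because last-access timestamps are often disabled or unreliable. It only reports candidates; it does not move or delete anything.

```python
import os
import time


def find_archive_candidates(root, max_age_days, now=None):
    """List (path, size_in_bytes) for files not modified in max_age_days.

    Results are sorted largest first, so the biggest space savings come first.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400          # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            if info.st_mtime < cutoff:           # untouched since before the cutoff
                candidates.append((path, info.st_size))
    return sorted(candidates, key=lambda item: item[1], reverse=True)
```

A call such as `find_archive_candidates("D:\\data", max_age_days=365)` (the path is only an example) would produce a list to review before anything is moved to a lower tier.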