By Louis Imershein, VP Products and Wayne Salpietro, Director of Marketing Permabit Technology Corporation

Louis Imershein, VP Products, Permabit Technology Corp.

The cloud continues to dominate IT as businesses make their infrastructure decisions based on cost and agility. Public cloud, where shared infrastructure is paid for and utilized only when needed, is the most popular model today. However, more and more organizations are addressing security concerns by creating their own private clouds. As businesses deploy private cloud infrastructure, they are adopting techniques used in the public cloud to control costs. Gone are the traditional arrays and network switches of the past, replaced with software-defined data centers running on industry standard servers.

Wayne Salpietro, Director of Marketing, Permabit Technology Corp.

Features that improve efficiency make the cloud model more effective by reducing costs and increasing data transfer speeds. One such feature, particularly effective in cloud environments, is inline data reduction, a technology that lowers the cost of data both in transit and at rest. In fact, data reduction delivers unique benefits to each model of cloud deployment.

Public Clouds

A public cloud’s raison d’être is its ability to deliver IT business agility, deployment flexibility and elasticity. As a result, new workloads are increasingly deployed in public clouds. Worldwide public IT cloud service revenue in 2018 is predicted to be $127B.

Data-reduction technology minimizes public cloud costs. For example, deduplication and compression can cut capacity requirements of block storage in enterprise public cloud deployments by as much as five-sixths. These savings are realized in reduced storage consumption and operating costs in public cloud deployments.

Consider what data reduction does to AWS costs. If you provision 300 TB of EBS General Purpose SSD (gp2) storage for 12 hours per day over a 30-day month in a region that charges $0.10 per GB per month, you will be charged $15,000 for the storage: 300,000 GB at $0.10 comes to $30,000 for a full month, and the volumes exist for half of it.

With data reduction, that monthly cost of $15,000 would be cut to $2,500. Over a 12-month period you will save $150,000. Capacity planning is a simpler problem when the volume of data is one-sixth of its former size. The bottom line is that data reduction increases agility and decreases the cost of public clouds.
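The arithmetic in this example can be checked with a short Python sketch. The function name and parameters are illustrative, not part of any AWS tool, and the sketch assumes decimal terabytes (1 TB = 1,000 GB) and simple pro-rating by hours, which is what the figures above imply.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes, matching the article's figures


def monthly_ebs_cost(capacity_tb, price_per_gb_month, hours_per_day,
                     reduction_ratio=1.0):
    """Monthly charge for storage that is provisioned only part of each day.

    reduction_ratio is logical:physical capacity, e.g. 6.0 for six to one.
    """
    physical_gb = capacity_tb * GB_PER_TB / reduction_ratio
    fraction_of_month = hours_per_day / 24
    return physical_gb * price_per_gb_month * fraction_of_month


baseline = monthly_ebs_cost(300, 0.10, 12)
reduced = monthly_ebs_cost(300, 0.10, 12, reduction_ratio=6.0)
annual_savings = (baseline - reduced) * 12

print(f"baseline: ${baseline:,.0f}/month")
print(f"with 6:1 reduction: ${reduced:,.0f}/month")
print(f"annual savings: ${annual_savings:,.0f}")
```

Changing `reduction_ratio` shows how sensitive the bill is to the reduction rate actually achieved on a given data set.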

One data-reduction application that can readily be applied in the public cloud is Permabit’s Virtual Data Optimizer (VDO), a pre-packaged software solution that installs and deploys in minutes on Red Hat Enterprise Linux and Ubuntu LTS distributions. To deploy VDO in Amazon AWS, the administrator provisions Elastic Block Store (EBS) volumes, installs the VDO package in their VMs, and applies VDO to the block devices that represent their EBS volumes. Because VDO is implemented as a Linux device-mapper target, it is transparent to the applications installed above it.

As data is written out to block storage volumes, VDO applies three reduction techniques:

  1. Zero-block elimination uses pattern matching techniques to eliminate 4 KB zero-blocks.
  2. Inline Deduplication eliminates 4 KB duplicate blocks.
  3. HIOPS Compression™ compresses the remaining blocks.

VDO data reduction processing

This approach results in remarkable data reduction rates of six to one, or approximately 83%, across a wide range of data sets.
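The three stages can be illustrated with a toy Python pipeline. This is a sketch of the general technique only, not VDO's implementation: SHA-256 fingerprints and zlib are stand-ins chosen for illustration, and the function and variable names are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096               # 4 KB blocks, as in the list above
ZERO_BLOCK = bytes(BLOCK_SIZE)  # a block consisting entirely of zero bytes


def reduce_blocks(data: bytes):
    """Return (bytes_stored, stats) for a toy three-stage reduction pipeline."""
    seen = set()  # fingerprints of blocks that have already been stored
    stored = 0
    stats = {"zero": 0, "duplicate": 0, "compressed": 0}
    for offset in range(0, len(data), BLOCK_SIZE):
        # Pad a short final block with zeros so every block is 4 KB.
        block = data[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        if block == ZERO_BLOCK:                   # 1. zero-block elimination
            stats["zero"] += 1
            continue
        digest = hashlib.sha256(block).digest()   # 2. deduplication
        if digest in seen:
            stats["duplicate"] += 1
            continue
        seen.add(digest)
        packed = zlib.compress(block)             # 3. compression (stand-in)
        stored += min(len(packed), BLOCK_SIZE)    # store raw if no gain
        stats["compressed"] += 1
    return stored, stats


# Two zero blocks followed by three identical blocks of log-like text.
text_block = (b"log line: service started\n" * 200)[:BLOCK_SIZE]
sample = bytes(BLOCK_SIZE) * 2 + text_block * 3
stored, stats = reduce_blocks(sample)
print(f"{len(sample)} bytes written, {stored} bytes stored, {stats}")
```

In this sample only one of the five blocks reaches the compression stage, which is why the stages are ordered from cheapest check to most expensive.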

Private Cloud

Organizations see similar benefits when they deploy data reduction in their private cloud environments.  Private cloud deployments are selected over public because they offer the increased flexibility of the public cloud model but keep privacy and security under the organization’s control. IDC predicts that in 2017 there will be $17.2B in infrastructure spending on private cloud, including on-premises and hosted private clouds.

One problem that data reduction addresses for the private cloud is that when implementing a private cloud, organizations can get hit with the double whammy of hardware infrastructure costs plus annual software licensing costs. For example, Software Defined Storage (SDS) solutions are typically licensed by capacity, and their costs are directly proportional to hardware infrastructure storage expenses. Data reduction decreases storage costs because it reduces storage capacity consumption.

Consider a private cloud configuration with a 1 PB deployment of storage infrastructure and SDS. Assuming a current hardware cost of $500 per TB for commodity server-based storage infrastructure with datacenter-class SSDs, and an annual cost of $56,000 per 512 TB for the SDS component, you would pay $612,000 in the first year: $500,000 for hardware plus $112,000 for software. With annual software subscriptions, you will spend $836,000 over three years for 1 PB of storage, and $1,060,000 over five years.

Over the same five years, the same configuration with six-to-one data reduction will cost $176,667 for hardware and software, a savings of $883,333. That does not include the substantial additional savings in power, cooling and space. As businesses develop private cloud deployments, they should be sure to leverage data-reduction capabilities to take advantage of these compelling cost savings.
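The private cloud figures can be reproduced with the sketch below. It rests on assumptions inferred from the numbers in the text: hardware is priced on 1,000 TB, the SDS license is bought as two 512 TB units renewed annually, and data reduction scales hardware and software costs down by the same ratio. The function and parameter names are illustrative.

```python
def total_cost(years, hardware_tb=1000, hw_cost_per_tb=500,
               sds_units=2, sds_cost_per_unit_year=56_000,
               reduction_ratio=1):
    """Cumulative hardware plus SDS subscription cost over a number of years."""
    hardware = hardware_tb * hw_cost_per_tb                # one-time purchase
    software = sds_units * sds_cost_per_unit_year * years  # annual subscription
    # Assumption: capacity-based pricing shrinks in proportion to the data.
    return (hardware + software) / reduction_ratio


for years in (1, 3, 5):
    print(f"{years} year(s): ${total_cost(years):,.0f}")

five_year = total_cost(5)
five_year_reduced = total_cost(5, reduction_ratio=6)
savings = five_year - five_year_reduced
print(f"5 years with 6:1 reduction: ${five_year_reduced:,.0f}")
print(f"5-year savings: ${savings:,.0f}")
```

In practice licenses are sold in whole units and hardware in whole servers, so real savings would step down in increments rather than scale smoothly.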

When implementing a private cloud on Linux, Virtual Data Optimizer (VDO) is an easy way to include data reduction. VDO operates in the Linux kernel as one of many core data-management services. It is a device-mapper target driver that is transparent to persistent and ephemeral storage services, whether the storage layers above provide object, block, compute, or file-based access.

VDO – Seamless and Transparent Data Reduction

The same transparency applies to the applications running above the storage service level. Customers using VDO today realize significant savings across a wide range of use cases.

Workflows that benefit heavily from data reduction include: logging of messages and events for systems and applications; monitoring, alerting and tracing systems; databases with textual content and NoSQL approaches such as MongoDB and Hadoop; user data such as home directories and development environments; virtualization and containers; and live system backups used for rapid disaster recovery.

The cumulative cost savings that can be achieved across a wide range of use cases with data reduction makes it highly attractive for private cloud deployments.

Reducing Hybrid Cloud’s Highly Redundant Data

Storage is at the foundation of cloud services, and data in the cloud must be replicated for data safety. Hybrid cloud architectures that combine on-premises resources (such as private cloud) with colocation, private and multiple public clouds result in highly redundant data environments. IDC’s 2016 FutureScape report finds that by the end of this year, more than 80 percent of enterprise IT organizations will commit to hybrid cloud architectures encompassing multiple public cloud services, as well as private clouds.

Depending on a single cloud provider for storage services puts SLA targets at risk. Consider the widespread AWS S3 storage errors of February 28, 2017, when data was unavailable to clients for several hours. Because of the loss of data access, businesses may have lost millions of dollars in revenue. As a result, more enterprises are pursuing a “Cloud of Clouds” approach in which data is redundantly distributed across multiple clouds for data safety and accessibility. Unfortunately, this redundancy increases storage capacity consumption and cost.

That’s where data reduction comes in. In hybrid cloud deployments where data is replicated to the participating clouds, data reduction multiplies capacity and cost savings. If three copies of the data are kept in three different clouds, three times as much is saved. Take the private cloud example above, where data reduction drove down the cost of a 1 PB deployment to $176,667, saving $883,333 over five years. If that PB is replicated in three different clouds, and each copy costs about the same to store, the savings would be multiplied by three for a total of approximately $2,650,000.
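As a rough check on the replication arithmetic, the sketch below multiplies the per-copy savings by the number of clouds. It assumes every copy has the same cost structure as the private cloud example, which is a simplification, and the function name is illustrative. The total differs by a dollar depending on whether the per-copy savings are rounded before or after multiplying.

```python
def replicated_savings(cost_without_reduction, reduction_ratio, copies):
    """Savings per copy and across all copies, assuming identical costs."""
    per_copy = cost_without_reduction - cost_without_reduction / reduction_ratio
    return per_copy, per_copy * copies


# Five-year cost of the 1 PB private cloud example, kept in three clouds.
per_copy, total = replicated_savings(1_060_000, 6, 3)
print(f"per copy: ${per_copy:,.0f}, across 3 clouds: ${total:,.0f}")
```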

Data-reduction technologies like Permabit’s Virtual Data Optimizer (VDO) provide the perfect solution to address the multi-site storage capacity and bandwidth challenges faced in hybrid cloud environments. Advanced data reduction capabilities have the same impact on bandwidth consumption as they do on storage, and translate to a reduction in network bandwidth consumption and associated cost of five-sixths. Because VDO operates at the device level, it can sit above block-level replication products to optimize data before it is written out and replicated.


IT professionals are finding that the future of IT infrastructure lies in the cloud. Data reduction technologies enable clouds — public, private and hybrid — to deliver on their promise of safety, agility and elasticity at the lowest possible cost, making cloud the deployment model of choice for IT infrastructure going forward.

About Louis Imershein
Louis Imershein, Vice President of Product, is responsible for product management and strategic planning. He has 25 years of technical leadership experience in product management, software development and technical support. Prior to Permabit, Imershein was a Senior Product Marketing Manager for the Sun Microsystems Data Management Group. Before joining Sun, he was at SCO where his titles included Product Manager, Software Architect and Senior Technical Lead. Imershein has a Bachelor’s degree in Biological Science from the University of California, Santa Cruz and has earned certificates in software design, device driver development, system administration, and software testing.

About Wayne Salpietro
Wayne Salpietro is the director of product and social media marketing at data storage and cloud backup services provider Permabit Technology Corp. He has served in this capacity for the past eight years, prior to which he held product marketing and managerial roles at CA, HP, and IBM.