The term “software-defined” has been somewhat misused over the years, but was initially meant to represent storage systems that didn’t have a tight coupling between hardware and software. In effect, the customer could buy hardware and deploy the software they wanted to run on it.
Over the past 10 years of software-defined storage (SDS), we’ve seen a significant evolution of the technology, moving on from the simplistic definition of hardware and software separation. Today’s systems are more focused on tighter integration with applications, on automation, and on exploiting the latest wave of fast hardware.
Initially, software-defined storage was represented by a number of characteristics:
- Hardware abstracted: A distinct decoupling of the relationship between software and hardware. Effectively, this abstraction allows the use of commodity components. Removing hardware dependence also means abstracting the constructs used to store data, such as LUNs, file shares and repositories.
- Automated: The ability to drive storage functions (such as provisioning or data protection) through code, via a command-line interface (CLI) or application programming interface (API). Storage has traditionally been managed through GUIs, whereas an API allows storage functions to be automated.
- Policy driven: The application of service-based metrics to storage, such as quality of service and data placement. Policies determine throughput as well as features such as availability and resiliency.
- Scalable: The capability to scale performance and capacity up and down as demand changes.
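The automation and policy-driven characteristics can be sketched in code. The snippet below is a minimal, hypothetical illustration of policy-driven provisioning: a storage policy (an IOPS floor and a resilience requirement) drives automated placement onto a tier, with no reference to specific hardware. The tier table, `StoragePolicy` and `provision_volume` are invented for illustration, not taken from any real product’s API.

```python
from dataclasses import dataclass

# Hypothetical tiers, ordered fastest first: (name, max IOPS, replica count)
TIERS = [
    ("nvme-flash", 100_000, 3),
    ("sata-ssd", 20_000, 2),
    ("capacity-hdd", 2_000, 2),
]

@dataclass
class StoragePolicy:
    min_iops: int   # quality-of-service floor
    replicas: int   # resilience requirement

def provision_volume(name: str, size_gb: int, policy: StoragePolicy) -> dict:
    """Pick the cheapest tier that satisfies the policy and return a
    provisioning request an SDS control plane could act on."""
    for tier, max_iops, tier_replicas in reversed(TIERS):  # cheapest first
        if max_iops >= policy.min_iops and tier_replicas >= policy.replicas:
            return {"volume": name, "size_gb": size_gb,
                    "tier": tier, "replicas": policy.replicas}
    raise ValueError("no tier satisfies the policy")

# A high-performance, triple-replicated volume lands on the flash tier
req = provision_volume("db-data", 500, StoragePolicy(min_iops=50_000, replicas=3))
print(req["tier"])  # nvme-flash
```

The point of the sketch is that the caller expresses service levels, not hardware: swapping the tier table for different commodity kit changes nothing in the request.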
The move towards private and public cloud has likely made the latter three characteristics more important. The cloud model is more focused on the service-based delivery of storage resources to applications, although the ability to use commodity hardware is still an important factor.
Software-defined storage evolution
Over the past 10 years, we’ve seen some discrete developments in software-defined storage, which can be categorised as a series of generational changes.
Although there’s no official definition, we can look at these steps as a way of understanding how SDS has changed to meet the market needs.
SDS 1.0 – Software appliances
The first stage of software-defined storage was to take the software already being used in an appliance and sell it separately from the hardware.
Most of these offerings exist as dual-controller and scale-up architectures. The first-generation offerings can typically also be installed in a hypervisor as a virtual storage appliance (VSA) that runs as the original appliance would.
The main value here is one of cost saving in being able to use commodity hardware. Deployment inside a VSA allows storage services to be rolled out in smaller branch offices, obviating the need for dedicated storage hardware. Supplier examples here include Nexenta, Open-E and StorMagic.
SDS 2.0 – Custom software
Generation two of software-defined storage is addressed by systems specifically designed to be software products. Many of these are scale-out, allowing capacity and performance to be increased by adding extra nodes into a cluster of devices.
These products natively cover block and object storage, with some support for file protocols. Block examples include HPE StoreVirtual and Dell EMC ScaleIO; object examples include Scality, Caringo and IBM Cleversafe.
We could also define here a subset of the 2.0 products that are software-defined but generally sold as hardware appliances. This seems to work against the logic of software-defined storage, but products such as SolidFire and Nasuni do offer SDS capabilities even when sold mainly as a hardware offering.
SDS 3.0 – Greater abstraction
In the third generation of products we see a greater abstraction away from the underlying hardware. Data is more widely distributed across available nodes, with the ability to mix and match hardware across a storage cluster. Typical examples of 3.0 implementations include StorPool and Primary Data.
In the version 3 category we can start to include hyper-converged products, such as VMware Virtual SAN and Nutanix’s Platform Services, which deliver block and file storage for VMs and containers.
There are also suppliers adapting their software for hyper-converged, such as Cisco HyperFlex (previously Springpath), Hedvig, Maxta and Hive-IO USX (previously Atlantis Computing).
Hyper-converged has become a big part of software-defined storage, allowing IT organisations to eliminate dedicated storage hardware and associated costs.
The impact of containers
The adoption of container technology has provided startups an opportunity to deliver new products. We have seen the emergence of storage for containers, including storage built with containers.
Containers were originally intended as short-lived application runtime environments. However, we’ve increasingly seen the need to run containers for extended periods and to provide them with persistent storage.
Startups such as Portworx and StorageOS are delivering storage for containers in a container, allowing the data to move with the container across multiple physical hosts.
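To make this concrete, in Kubernetes an application requests persistent storage declaratively through a PersistentVolumeClaim, and the storage layer behind it binds a volume that follows the pod across hosts. The sketch below builds such a manifest in Python (Kubernetes accepts JSON as well as YAML); the storage class name `sds-block` is an assumed class backed by a software-defined storage driver, not a default that ships with Kubernetes.

```python
import json

# A Kubernetes PersistentVolumeClaim: the application asks for 10Gi of
# block storage from a hypothetical SDS-backed class called "sds-block".
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "sds-block",
        "resources": {"requests": {"storage": "10Gi"}},
    },
}

manifest = json.dumps(pvc, indent=2)
print(manifest)
```

A claim like this could be submitted with `kubectl apply -f`; the container never names a disk or a host, only a class and a size, which is what lets the data move with the container.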
Innovation in tech
New technology is providing an opportunity for startups to create innovative software-defined storage systems.
NVMe flash storage offers high-speed, low-latency connections that can be extended across fabrics such as Fibre Channel, InfiniBand and Ethernet.
Excelero and E8 Storage have built block-based systems that are effectively software-based and can use the customer’s own hardware or the supplier’s offering.
Both create what is termed a “disaggregated” storage architecture, in that the services normally integrated into an appliance are distributed to each client connecting into the storage system. In the case of E8, this is a shared hardware appliance, whereas Excelero can also operate in a hyper-converged model.
Weka.IO, Qumulo and Elastifile use flash hardware to build scale-out file systems. All of these supplier systems can be implemented on commodity hardware or on public cloud.
Datrium has built a software-defined storage product that separates data performance and capacity requirements. Datrium’s DVX implements input/output (I/O) performance using flash local to the application server, while delivering capacity and resilience through storing data on a shared appliance.
Public cloud storage
Finally, we should highlight the way in which the public cloud is affecting storage.
Public cloud suppliers already have their block and object offerings, such as S3 and Blob storage. Microsoft has chosen to partner with NetApp and implement Azure Enterprise NFS using ONTAP technology, which completely abstracts the implementation from the user, while integrating storage as a “first-class citizen” in Azure.
Many suppliers now offer storage products through public cloud markets, including NetApp ONTAP Cloud, SoftNAS and Actifio.
Secondary storage suppliers, such as Rubrik and Cohesity, have provided cloud-based implementations of their products, both as a backup target and to run secondary workloads in the public cloud. CTERA and Nasuni use the public cloud to provide a distributed storage file system that can be accessed globally.
Public cloud changes the consumption model from capital acquisition to an operational expenditure (opex) charge, which can mean significant cost savings for IT departments. With automation, storage can be expanded dynamically in the public cloud, providing on-demand resources without over-provisioning.
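That on-demand expansion can be as simple as a loop that grows capacity only when utilisation crosses a threshold, so nothing is paid for up front. The sketch below is illustrative only: the 80% threshold and 100GB growth step are arbitrary, and the growth itself stands in for whatever API call the chosen cloud provider exposes.

```python
def expand_if_needed(used_gb: float, provisioned_gb: float,
                     threshold: float = 0.8, step_gb: float = 100.0) -> float:
    """Return the new provisioned capacity: grow in fixed steps while
    utilisation exceeds the threshold, otherwise leave capacity alone."""
    while provisioned_gb > 0 and used_gb / provisioned_gb > threshold:
        provisioned_gb += step_gb  # in the cloud, this would be an API call
    return provisioned_gb

print(expand_if_needed(850, 1000))  # 85% used -> grows to 1100.0
```

Run periodically against monitoring data, a policy like this replaces the traditional practice of buying several years of headroom in advance.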
The range of software-defined storage offerings is now as wide as it is deep. As we move into a hybrid cloud world, software-defined storage systems will play an increasingly important role, enabling data to be moved inside and between clouds, with minimal impact.
The term “software-defined” is starting to become irrelevant as storage is increasingly delivered in software by default and dedicated appliances play a shrinking role.