Software defined, hyperscale storage

By Chris Kranz | Hyperconvergence

Hyperscale storage is a platform that scales to a fairly large limit (we’ve not spoken to many customers who needed more than our currently tested 1000-node, 52PB limit), with all data distributed across the cluster. Software-defined storage (SDS) is a much-muddled term in the industry, so I’ll give you both generally accepted usages:

  1. SDS is a programmable infrastructure which abstracts logical storage constructs from physical hardware
  2. SDS is a software based implementation of storage services which can run on any commodity hardware

(we prefer the second description, but either works)

Tim Stammers (Senior Analyst at 451 Research) has this to say on SDS: “Software defined storage and hyper-converged infrastructure are coming of age. Midmarket companies face mounting challenges storing and managing ever-growing volumes of data. Clearly there is a demand for IT infrastructure that is simpler to manage, and can scale out as data grows. Hyper-converged systems and software-defined storage are strong candidates to meet these needs.”

From various research papers, analyst reports, customer feedback and general industry trends, we see quite a few drivers that are promoting SDS over a traditional SAN.

  • Increased Performance
  • High-Availability
  • Lower Costs
  • Data Protection
  • Scalability
  • Increased use of virtualisation
  • Improved IT operational efficiency
  • Enterprise storage features
  • Flexibility

This brings me on to the reason why I joined Hedvig. Let’s look at each of these points and see how we meet them…

Increased performance

Due to the scale-out nature of both the storage nodes and the storage proxy, we can scale storage capacity and storage performance independently. We also have no logical controller ceiling of the kind a traditional storage vendor may have, so if you start with a 3-node cluster you can seamlessly expand it to a 1000-node cluster with no implications for the initial nodes. Just keep growing. We intelligently use accelerated storage in the form of SSD, NVMe and RAM to accelerate the workloads that should be accelerated, so there is no need for expensive all-flash arrays when only 10% of your data needs to be accelerated.

High availability

It almost seems daft pointing this out, but I talk about it repeatedly: we are active/active/(*n)/active across multiple sites. Each of our storage nodes is an access layer into our storage fabric, and our minimum recommended replication factor for production data is 3, meaning that even after one failure you are still not at risk of data loss or corruption from a second failure. This is core to our DNA. Would the inventor of Cassandra create a storage system that wasn’t highly available and distributed? Definitely not!
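
To make the replication-factor arithmetic concrete, here is a minimal Python sketch. It assumes a deliberately simplified model in which each piece of data has exactly `replication_factor` full copies on distinct nodes, every failure destroys one of those copies (the worst case), and no rebuild has completed in between. The function names are illustrative only and are not part of any Hedvig API.

```python
def copies_remaining(replication_factor: int, failed_replicas: int) -> int:
    """Worst-case number of intact copies after `failed_replicas` failures.

    Simplified model (an assumption, not Hedvig's implementation): each
    failure destroys a distinct copy and nothing has been rebuilt yet.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if failed_replicas < 0:
        raise ValueError("failed_replicas cannot be negative")
    return max(replication_factor - failed_replicas, 0)


def survives(replication_factor: int, failed_replicas: int) -> bool:
    """True if at least one copy of the data is still intact."""
    return copies_remaining(replication_factor, failed_replicas) > 0


if __name__ == "__main__":
    # Walk through 0..3 failures at the recommended replication factor of 3.
    for failures in range(4):
        print(failures, copies_remaining(3, failures), survives(3, failures))
```

Under this model a replication factor of 3 leaves two intact copies after the first failure, which is why a second failure still does not lose data.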

Lower Costs

We aren’t in the business of selling you hardware. This means that around 60-70% of the acquisition cost of a traditional SAN or a modern appliance can be reduced with our platform, because you can choose the most cost-effective and strategically appropriate hardware for you. We don’t have a hardware mark-up (or bezel tax). Our solution is very compelling: even if you forget all the cool features we have, we are still a lower-cost solution than either a traditional SAN or a modern appliance.

Data protection

For me this actually goes hand-in-hand with the High Availability point. We don’t do HA that compromises or limits your data protection. Our High Availability and Data Protection are effectively the same technology implementation, and the more you grow our platform, the more data protection and high availability you get. This is also configurable: if you want no data protection, you can have that. If you want too much data protection, you can also have that. If you want data split across multiple sites, absolutely do that.
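
The trade-off behind "configurable" protection can be sketched with some simple arithmetic. The Python below assumes plain full-copy replication, where usable capacity is raw capacity divided by the replication factor, and ignores deduplication, compression and metadata overhead. The 90 TB figure and the function name are hypothetical, chosen only for illustration.

```python
def usable_capacity_tb(raw_tb: float, replication_factor: int) -> float:
    """Usable capacity under full-copy replication.

    Assumption: every byte is stored `replication_factor` times and nothing
    else (dedup, compression, metadata) changes the footprint.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if raw_tb < 0:
        raise ValueError("raw_tb cannot be negative")
    return raw_tb / replication_factor


if __name__ == "__main__":
    raw_tb = 90.0  # hypothetical raw capacity across the cluster
    for rf in (1, 2, 3):
        # rf=1 means a single copy, i.e. no protection against a failure
        print(f"replication factor {rf}: {usable_capacity_tb(raw_tb, rf)} TB usable")
```

In this model a replication factor of 1 gives the most usable space and no protection, while each additional copy buys tolerance of one more failure at the cost of capacity.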


Scalability

Again this is a fundamental part of our DNA. If you want to grow beyond 1000 nodes and 52PB of storage, then we’ll have a meeting with engineering. If you want to start at 3 nodes with 20TB but want the ability to grow to hundreds of nodes and petabytes of storage, then great, that’s exactly what we’re designed to do. Need to scale storage performance separately from storage capacity? Again, that’s awesome because that’s exactly what we can provide you with.

Increased use of virtualisation

(or Virtualization). Again this is part of our DNA. Whether you want to support and power a VMware cluster, Hyper-V, OpenStack or some other virtualisation platform, or you want to embrace the world of Docker and containers, we’re with you: we built our platform both using and leveraging all these technologies. We have a virtualisation-aware platform that treats virtual machines as first-class citizens, not just as another block of data on a LUN.

Improved IT operational efficiency

Our platform is really simple to manage. Beyond the technical fact that our GUI is not Flash or Java based (yay!), we also have a fully featured API to make the entire platform programmable. We’ve worked closely with VMware, OpenStack and Docker to provide seamless integration into their platforms, to the point that it’s barely noticeable that you actually run Hedvig storage on the backend when doing standard provisioning and management tasks (as it should be!).

Enterprise storage features

This is the one area that impressed me most when first learning about Hedvig. Most of the innovative, modern storage vendors doing something different today are doing great things with a single feature, maybe fantastically flexible object storage or a great integration with VMware. Many, though, lack the combination of this innovative technology with the traditional enterprise storage features that we have come to know and love in our storage platforms. Multi-protocol, deduplication and compression, snapshots, clones, replication, auto-tiering, flash acceleration: all these are key things that any modern storage system should have, and we have implemented all of them in the most efficient ways possible.


Flexibility

Everyone talks about vendor lock-in, and to be frank I don’t think anyone does a good job of avoiding it without introducing a vendor! We approach this from the hardware layer: you will still use the Hedvig software (and we hope love using it, so that’s not really a compromise), but we give you the flexibility of choice. The first choice is your preferred server manufacturer and configuration: maybe match it to your hypervisor and application hosts to simplify your hardware support and spares kit, or maybe leverage commodity hardware vendors and get the most bang for your buck. The second is the architectural design. Hedvig is not just about hyperscale (despite the context of this post), and if you want hyperconverged we’ll happily support that. If you start with hyperconverged and need to grow your storage completely separately from your compute, we’ll support you adding hyperscale nodes into exactly the same storage cluster. As new hardware and storage technology is released, we’ll have no issues supporting it, as we are just a software layer above the hardware, which gives you some level of future-proofing.

If you’d like a deeper dive on how Hedvig’s software-defined storage platform works, then check out my short whiteboard video for a platform overview:

Software defined, hyperscale storage is the future of modern storage, and it’s something we have fully available today. For more details and a discussion about how we can help you modernise your data centre, click the link below.

Get Started

(This post originally appeared on LinkedIn.)