52: GreyBeards talk software defined storage with Kiran Sreenivasamurthy, VP Product Management, Maxta

This month we talk with an old friend from Storage Field Day 7 (videos), Kiran Sreenivasamurthy, VP of Product Management for Maxta. Maxta has a software defined storage solution which currently works on VMware vSphere, Red Hat Virtualization and KVM to supply shared, scale out storage and HCI solutions for enterprises across the world.

Maxta is similar to VMware’s vSAN software defined storage whose licenses can be transferred from one server to another, as you upgrade your data center over time. As software defined storage, Maxta runs on any standard Intel X86 hardware. Indeed, Maxta has one customer running two Super Micro servers and one Cisco server in the same cluster.

Maxta advantages

One item that makes Maxta unique is that all of its storage properties are assignable at a VM granularity. That is, replication, deduplication, compression and even block size can all be enabled/set at the VMDK-VM level. This could be useful for environments supporting diverse applications, such as having a 64K block size for Microsoft Exchange and a 4K block size for web servers.
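To make the per-VM idea concrete, here is a minimal sketch of what a policy record per VM might look like. The class, field names and VM names are all hypothetical illustrations, not Maxta's actual API or configuration format; only the 64K/4K block sizes come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmStoragePolicy:
    """Hypothetical per-VM policy record; field names are illustrative only."""
    replicas: int          # number of copies kept of the VM's data
    deduplication: bool
    compression: bool
    block_size_kib: int    # block size in KiB


# One policy per VM, mirroring the Exchange / web server example above.
# The replica, dedupe and compression settings are made-up values.
policies = {
    "exchange-01": VmStoragePolicy(replicas=2, deduplication=False,
                                   compression=True, block_size_kib=64),
    "web-01": VmStoragePolicy(replicas=2, deduplication=True,
                              compression=True, block_size_kib=4),
}

for vm, policy in policies.items():
    print(f"{vm}: {policy.block_size_kib}K blocks, {policy.replicas} copies")
```

The point is simply that the policy hangs off the VM rather than off a LUN or datastore, so two VMs on the same cluster can carry different settings.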

Another advantage is their multi-hypervisor support. Maxta’s support for RH Virtualization, VMware and KVM offers the unique ability to migrate storage and even powered off VMs, from one hypervisor to another. Maxta’s file system is the same for both VMware and KVM clusters.

Maxta clusters

Their software must be licensed on all servers in a vSphere or KVM cluster with access to Maxta storage. The minimum Maxta cluster size is 3 nodes for 2-way replication and 5 nodes for 3-way replication.  Most Maxta systems run on 8 to 12 server node clusters. But Maxta has installations with 20 to 24 nodes in customer deployments.

Maxta supports SSD only as well as SSD-disk hybrid storage. And SSDs can be NVMe as well as SATA SSD storage. In hybrid configurations, Maxta SSDs are used as read and write back caches for disk storage.

Maxta supports compute only nodes, compute-storage nodes and witness only nodes (node with 1 storage device). Besides heterogeneous server support, Maxta clusters can have nodes with different storage capacities. Maxta will optimize VM data placement to balance IO activity across heterogeneous nodes.

Maxta provides a vCenter plugin so VMware admins can manage and monitor their storage inside the vSphere environment. Maxta also offers Cloud Connect MX, a cloud based system for managing all of your Maxta clusters throughout an enterprise, wherever they reside.

Even HCI, through partners

For customers wanting an HCI solution, Maxta partners can supply pre-tested, HCI appliances or can configure Maxta software with servers at customer data centers. Maxta has done well OEMing their solution, and one significant success has been their OEM deal with Lenovo in China and East Asia, where they sell HCI appliances with Maxta software.

Maxta has also found success with managed service providers (that want to deploy the software on their own hardware), and SME & ROBO environments. Also Maxta seems to be doing very well in Latin America as well as previously mentioned China.

The podcast runs ~42 minutes. Kiran is a knowledgeable individual and has worked with some of the leading storage companies of the last two decades. Listen to the podcast to learn more.

Kiran Sreenivasamurthy, VP Product Management, Maxta

Kiran Sreenivasamurthy is the Vice President of Product Management for Maxta Inc. He has developed and managed storage hardware and software products for more than 20 years with leading storage companies and startups including HP 3PAR, NetApp and Mendocino Software.

Kiran manages all aspects of Maxta’s hyperconvergence product portfolio from inception through revenue.

51: GreyBeards talk hyper convergence with Lee Caswell, VP Product, Storage & Availability BU, VMware

Sponsored by:

VMware

In this episode we talk with Lee Caswell (@LeeCaswell), Vice President of Product, Storage and Availability Business Unit, VMware.  This is the second time Lee’s been on our show, the previous one back in April of last year when he was with his prior employer. Lee’s been at VMware for a little over a year now and has helped lead some significant changes in their HCI offering, vSAN.

VMware vSAN/HCI business

Many customers struggle to modernize their data centers with funding being the primary issue. This is very similar to what happened in the early 2000s as customers started virtualizing servers and consolidating storage. But today, there’s a new option, server based/software defined storage like VMware’s vSAN, which can be deployed for little expense and grown incrementally as needed. VMware’s vSAN customer base is currently growing by 150% CAGR, and VMware is adding over 100 new vSAN customers a week.

Many companies say they offer HCI, but few have adopted the software-only business model this entails. The transition from a hardware-software, appliance-based business model to a software-only business model is difficult and means a move from a high revenue-lower margin business to a lower revenue-higher margin business. VMware, from its very beginnings, has built a sustainable software-only business model that extends to vSAN today.

The software business model means that VMware can partner easily with a wide variety of server OEM partners to supply vSAN ReadyNodes that are pre-certified and jointly supported in the field. There are currently 14 server partners for vSAN ReadyNodes. In addition, VMware has co-designed the VxRail HCI Appliance with Dell EMC, which adds integrated life-cycle management as well as Dell EMC data protection software licenses.

As a result, customers can adopt vSAN as a build or a buy option for on-prem use and can also leverage vSAN in the cloud from a variety of cloud providers, including AWS very soon. It’s the software-only business model that sets the stage for this common data management across the hybrid cloud.

VMware vSAN software defined storage (SDS)

The advent of Intel Xeon processors and plentiful, relatively cheap SSD storage has made vSAN an easy storage solution for most virtualized data centers today. SSDs removed any performance concerns that customers had with hybrid HCI configurations. And with Intel’s latest Xeon Scalable processors, there’s more than enough power to handle both application compute and storage compute workloads.

From Lee’s perspective, there’s still a place for traditional SAN storage, but he sees it more for cold storage that is scaled independently from servers or for bare metal/non-virtualized storage environments. But for everyone else using virtualized data centers, they really need to give vSAN a look.

Storage vendors shifting sales

It used to be that major storage vendor sales teams would lead with hardware appliance storage solutions and then move to HCI when pushed. The problem was that a typical SAN storage sale takes 9 months to complete and then 3 years of limited additional sales.

To address this, some vendors have taken the approach where they lead with HCI and only move to legacy storage when it’s a better fit. With VMware vSAN, it’s a quicker sales cycle than legacy storage because HCI costs less up front and there’s no need to buy the final storage configuration with the first purchase. VMware vSAN HCI can grow as the customer applications needs dictate, generating additional incremental sales over time.

VMware vSAN in AWS

Recently, VMware announced VMware Cloud in AWS. What this means is that you can have vSAN storage operating in an AWS cloud just like you would on-prem. In this case, workloads could migrate from cloud to on-prem and back again with almost no changes. How the data gets from on-prem to cloud is another question.

Also the pricing model for VMware Cloud in AWS moves to a consumption based model, where you pay for just what you use on a monthly basis. This way VMware Cloud in AWS and vSAN is billed monthly, consistent with other AWS offerings.

VMware vs. Microsoft on cloud

There’s a subtle difference in how Microsoft and VMware are adopting cloud. VMware came from an infrastructure platform and is now implementing their infrastructure on cloud. Microsoft started as a development platform and is taking their cloud development platform/stack and bringing it to on-prem.

It’s really two different philosophies in action. We now see VMware doing more for the development community with vSphere Integrated Containers (VIC), Docker Containers, Kubernetes, and Pivotal Cloud Foundry. Meanwhile Microsoft is looking to implement the Azure stack for on-prem environments, and they are focusing more on infrastructure. In the end, enterprises will have terrific choices as the software defined data center frees up customer dollars and management time.

The podcast runs ~25 minutes. Lee is a very knowledgeable individual and although he doesn’t qualify as a Greybeard (just yet), he has been in and around the data center and flash storage environments throughout most of his career. From his diverse history, Lee has developed a very business like perspective on data center and storage technologies and it’s always a pleasure talking with him.  Listen to the podcast to learn more.

Lee Caswell, V.P. of Product, Storage & Availability Business Unit, VMware

Lee Caswell leads the VMware storage marketing team driving vSAN products, partnerships, and integrations. Lee joined VMware in 2016 and has extensive experience in executive leadership within the storage, flash and virtualization markets.

Prior to VMware, Lee was vice president of Marketing at NetApp and vice president of Solution Marketing at Fusion-IO (now SanDisk). Lee was a founding member of Pivot3, a company widely considered to be the founder of hyper-converged systems, where he served as the CEO and CMO. Earlier in his career, Lee held marketing leadership positions at Adaptec, and SEEQ Technology, a pioneer in non-volatile memory. He started his career at General Electric in Corporate Consulting.

Lee holds a bachelor of arts degree in economics from Carleton College and a master of business administration degree from Dartmouth College. Lee is a New York native and has lived in northern California for many years. He and his wife live in Palo Alto and have two children. In his spare time Lee enjoys cycling, playing guitar, and hiking the local hills.

50: Greybeards wrap up Flash Memory Summit with Jim Handy, Director at Objective Analysis

In this episode we talk with Jim Handy (@thessdguy), Director at Objective Analysis,  a semiconductor market research organization. Jim is an old friend and was on last year to discuss Flash Memory Summit (FMS) 2016. Jim, Howard and I all attended FMS 2017 last week  in Santa Clara and Jim and Howard were presenters at the show.

NVMe & NVMeF to the front

Although the show floor was unfortunately closed due to a fire, there were plenty of sessions and talks about NVMe and NVMeF (NVMe over fabric). Howard believes NVMe and NVMeF are being adopted much more quickly than anyone had expected. It’s already evident inside storage systems like Pure’s new FlashArray//X, Kaminario and E8 Storage, which are already shipping block storage with NVMe and NVMeF.

Last year PCIe expanders and switches seemed like the wave of the future, but since then NVMe and NVMeF have taken off. Historically, there’s been a reluctance to add capacity shelves to storage systems because of the complexity of (FC and SAS) cable connections. But with NVMeF, RoCE and RDMA, it’s now just a (40GbE or 100GbE) Ethernet connection away, considerably easier and less error prone.

3D NAND take off

Both Samsung and Micron are talking up their 64 layer 3D NAND and the rest of the industry is following. The NAND shortage has led to fewer price reductions, but eventually when process yields turn up, the shortage will collapse and pricing reductions should return en masse.

The reason that vertical, 3D NAND is taking over from planar (2D) NAND is that planar NAND can’t be shrunk much more, and 15nm is going to be where it stays for a long time to come. So the only way to increase capacity/chip and reduce $/Gb is up.

As with any new process technology, 3D NAND is having yield problems. But once the last yield issue is solved, which seems close, we should see pricing drop precipitously and (3D) NAND storage become much more plentiful.

One thing that has made increasing 3D NAND capacity that much easier is string stacking. Jim describes string stacking as creating a unit, of say 32 layers, which you can fabricate as one piece and then layer on top of this an insulating layer. Now you can start again, stacking another 32 layer block on top and just add another insulating layer.

The problem with more than 32-48 layers is that you have to create (dig) holes connecting all the layers, and these holes have to be (atomically) very straight and coated with special materials. Anyone who has dug a hole knows that the deeper you go, the harder it is to make the hole walls straight. With current technology, 32 layers seems just about as far as they can go.

3DX and similar technologies

There’s been quite a lot of talk the last couple of years about 3D XPoint (3DX) and what it means for the storage and server industry. Intel has released Optane client SSDs but there are no enterprise class 3DX SSDs as of yet.

The problem is similar to 3D NAND above: current yields suck. There’s a chicken and egg problem with any new chip technology. You need volumes to get the yields up and you need yields up to generate the volumes you need. And volumes with good yields generate profits to re-invest in the cycle for the next technology.

Intel can afford to subsidize (lose money) 3DX technology until they get the yields up, knowing full well that when they do, it will become highly profitable.

The key is to price the new technology somewhere between levels in the storage hierarchy; for 3DX that means between NAND and DRAM. This does mean that 3DX will be more of a tier between memory and SSD than a replacement for either DRAM or SSDs.

The recent emergence of NVDIMMs has provided the industry a platform (based on NAND and DRAM) where they can create the software and other OS changes needed to support this mid tier as a memory level. So when 3DX comes along as a new memory tier, they will be ready.

NAND shortages, industry globalization & game theory

Jim has an interesting take on how and when the NAND shortage will collapse.

It’s a cyclical problem seen before in DRAM and it’s a question of investment. When there’s an oversupply of a chip technology (like NAND), suppliers cut investments or rather don’t grow investments as fast as they were. Ultimately this leads to a shortage, which then leads to over investment to catch up with demand. When this new investment starts to produce chips, the capacity bottleneck will collapse and prices will come down hard.

Jim believes that as 3D NAND suppliers start driving yields up and $/Gb down, 2D NAND fabs will turn to DRAM or other electronic circuitry, which will lead to a price drop there as well.

Jim mentioned that game theory explains the way the fab industry has globalized over time. As emerging countries build fabs, they must seek partners to provide the technology to produce product. They offer these companies guaranteed supplies of low priced product for years to help get the fabs online. Once this period is over, the fabs never return to home base.

This approach has led to Japan taking over DRAM & other chip production, then Korea, then Taiwan and now China. It will move again. I suppose this is one reason IBM got out of the chip fab business.

The podcast runs ~49 minutes. Jim is a very knowledgeable chip industry expert and a great friend from multiple events. Howard and I had fun talking with him again. Listen to the podcast to learn more.

Jim Handy, Director at Objective Analysis

Jim Handy of Objective Analysis has over 35 years in the electronics industry including 20 years as a leading semiconductor and SSD industry analyst. Early in his career he held marketing and design positions at leading semiconductor suppliers including Intel, National Semiconductor, and Infineon.

A frequent presenter at trade shows, Mr. Handy is known for his technical depth, accurate forecasts, widespread industry presence and volume of publication. He has written hundreds of market reports, articles for trade journals, and white papers, and is frequently interviewed and quoted in the electronics trade press and other media. He posts blogs at www.TheMemoryGuy.com and www.TheSSDguy.com.

49: Greybeards talk open convergence with Brian Biles, CEO and Co-founder of Datrium

In this episode we talk with Brian Biles, CEO and Co-founder of Datrium. We last talked with Brian and Datrium in May of 2016 and at that time we called it deconstructed storage. These days, Datrium offers a converged infrastructure (C/I) solution, which they call “open convergence”.

Datrium C/I

Datrium’s C/I  solution stores persistent data off server onto data nodes and uses onboard flash for a local, host read-write IO cache. They also use host CPU resources to perform some other services such as compression, local deduplication and data services.

In contrast to hyper converged infrastructure solutions available on the market today, customer data is never split across host nodes. That is, data residing on a host has only been created and accessed by that host.

Datrium uses on host SSD storage/flash as a fast access layer for data accessed by the host. As data is (re-)written, it’s compressed and locally deduplicated before being persisted (written) down to a data node.
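The write path described above can be sketched roughly as follows. This is an illustration under assumptions, not Datrium's implementation: the class names are hypothetical, SHA-256 fingerprints stand in for whatever dedupe scheme they actually use, zlib stands in for their compression, and a Python dict stands in for the data node.

```python
import hashlib
import zlib


class DataNode:
    """Stand-in for the shared data node: a content-addressed block store."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> compressed payload

    def persist(self, fingerprint, payload):
        self.blocks[fingerprint] = payload


class HostWritePath:
    """Hypothetical host-side write path: dedupe locally, compress, persist."""

    def __init__(self, data_node):
        self.data_node = data_node
        self.seen = set()   # fingerprints this host has already written

    def write(self, block):
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in self.seen:          # local deduplication
            self.seen.add(fingerprint)
            # Host CPU does the compression before data leaves the server.
            self.data_node.persist(fingerprint, zlib.compress(block))
        return fingerprint

    def read(self, fingerprint):
        return zlib.decompress(self.data_node.blocks[fingerprint])


node = DataNode()
host = HostWritePath(node)
first = host.write(b"A" * 4096)
second = host.write(b"A" * 4096)   # duplicate: nothing new is persisted
third = host.write(b"B" * 4096)
print(len(node.blocks), "unique blocks stored")
```

The design point is that the expensive work (fingerprinting, compression) happens on host CPU, so the data node only ever sees reduced data.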

A data node is a relatively light weight dual controller/HA storage solution with 12 high capacity disk drives. Data node storage is global to all hosts running Datrium storage services in the cluster. Besides acting as a permanent repository for data written by the cluster of hosts, it also performs global deduplication of data across all hosts.

The nice thing about their approach to CI is that it’s easily scalable: if you need more IO performance, just add more hosts or more SSDs/flash to servers already connected in the cluster. And if a host fails, it doesn’t impact cluster IO or data access for any other host.

Datrium originally came out supporting VMware virtualization and acts as an NFS datastore for VMDKs.

Recent enhancements

In July, Datrium released new support for Red Hat and KVM virtualization alongside VMware vSphere. They also added Docker persistent volume support to Datrium. Now you can have mixed hypervisor (KVM and VMware) and Docker container environments, all accessing the same persistent storage.

KVM offered an opportunity to grow the user base and support Red Hat enterprise accounts. Red Hat is a popular software development environment in non-traditional data centers. Also, much of the public cloud is KVM based, which provides a great way to someday support Datrium storage services in public cloud environments.

One challenge with Docker support is that there are just a whole lot more Docker volumes than VMDKs in vSphere. So Datrium added sophisticated volume directory search capabilities and naming convention options for storage policy management. Customers can define a naming convention for application/container volumes and use these to define group storage policies, which will then apply to any volume that matches the naming convention. This is a lot easier than having to do policy management at a volume level with 100s, 1000s or even 10,000s of distinct volume IDs.
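As a rough sketch of how naming-convention policies could work, the snippet below matches volume names against glob patterns using Python's standard fnmatch module. The patterns, policy fields and first-match-wins rule are all assumptions for illustration; Datrium's actual matching syntax and policy settings weren't covered on the podcast.

```python
from fnmatch import fnmatchcase

# Hypothetical group policies keyed by a volume-name pattern.
# First matching pattern wins; names and settings are illustrative only.
GROUP_POLICIES = [
    ("prod-db-*", {"snapshot_every_min": 15, "replicate": True}),
    ("dev-*", {"snapshot_every_min": 240, "replicate": False}),
]
DEFAULT_POLICY = {"snapshot_every_min": 1440, "replicate": False}


def policy_for(volume_name):
    """Return the policy of the first pattern the volume name matches."""
    for pattern, policy in GROUP_POLICIES:
        if fnmatchcase(volume_name, pattern):
            return policy
    return DEFAULT_POLICY


for name in ("prod-db-orders", "dev-build-cache-0042", "scratch"):
    print(name, "->", policy_for(name))
```

With something like this, two policy rules cover any number of volumes, and a newly created volume picks up its policy just by being named correctly.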

Docker is being used today to develop most cloud based applications. And many development organizations have adopted Docker containers for their development and application deployment environments. Many shops do development under Docker and production on vSphere. So now these shops can use Datrium to access development as well as production data.

More recently, Datrium also scaled the number of data nodes available in a cluster. Previously you could only have one data node with 12 drives, or about 29TB of protected capacity, which when deduped and compressed gave you an effective capacity of ~100TB. But with this latest release, Datrium now supports up to 10 data nodes in a cluster for a total of 1PB of effective capacity for your storage needs.
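The capacity figures above imply a data reduction ratio, which is easy to work out. The sketch below uses only the numbers quoted in this post and assumes effective capacity scales linearly with the number of data nodes; actual reduction will depend on the data.

```python
PROTECTED_TB_PER_NODE = 29    # per data node, figure quoted above
EFFECTIVE_TB_PER_NODE = 100   # approximate, after dedupe and compression
MAX_DATA_NODES = 10

# Implied data reduction ratio from dedupe plus compression.
reduction_ratio = EFFECTIVE_TB_PER_NODE / PROTECTED_TB_PER_NODE

# Assumes effective capacity scales linearly with data node count.
cluster_effective_tb = EFFECTIVE_TB_PER_NODE * MAX_DATA_NODES

print(f"implied reduction ratio: ~{reduction_ratio:.1f}:1")
print(f"cluster effective capacity: ~{cluster_effective_tb} TB "
      f"(~{cluster_effective_tb / 1000:.0f} PB)")
```

So the ~100TB effective figure assumes roughly 3.4:1 reduction, and ten data nodes at that ratio get you to the 1PB number.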

The podcast runs ~25 minutes. Brian is very knowledgeable about the storage industry, has been successful at many other data storage companies and is always a great guest to have on our show. Listen to the podcast to learn more.

Brian Biles, Datrium CEO & Co-founder

Prior to Datrium, Brian was Founder and VP of Product Mgmt. at EMC Backup Recovery Systems Division. Prior to that he was Founder, VP of Product Mgmt. and Business Development for Data Domain (acquired by EMC in 2009).

48: Greybeards talk object storage with Enrico Signoretti, Head of Product Strategy, OpenIO

In this episode we talk with Enrico Signoretti, Head of Product Strategy for OpenIO, a software defined, object storage startup out of Europe. Enrico is an old friend, having been a member of many Storage Field Day events (SFD) in the past which both Howard and I attended and we wanted to hear what he was up to nowadays.

OpenIO open source SDS

It turns out that OpenIO is an open source object storage project that’s been around since 2008 and has recently (2015) been re-launched as a new storage startup. The open source, community version is still available and OpenIO has links to downloads to try it out. There’s even one for a Raspberry Pi (Raspbian 8, I believe) on their website.

As everyone should recall, object storage is meant for multi-PB data storage environments. Objects are assigned an ID and are stored in containers or buckets. Object storage has a flat hierarchy, unlike file systems that have a multi-tiered hierarchy.

Currently, OpenIO is in a number of customer sites running 15-20PB storage environments. OpenIO supports AWS S3 compatible protocol and OpenStack Swift object storage API.

OpenIO is based on open source, but customer service and usability are built into the product they license to end customers on a usable capacity basis. Minimum license is for 100TB and can go into the multi-PB range. There doesn’t appear to be any charge for enhancements, additional features or additional cluster nodes.

The original code was developed for a big email service provider and supported a massive user community. So it was originally developed for small objects, with fast access and many cluster nodes. Nowadays, it can support very large objects as well.

OpenIO functionality

Each disk device in the OpenIO cluster is a dedicated service. By setting it up this way, load balancing across the cluster can be at the disk level. Load balancing in OpenIO is also a dynamic operation. That is, every time an object is created, every node’s current capacity is checked to determine the node with the least used capacity, which is then allocated to hold that object. This way there’s no static allocation of object IDs to nodes.
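A minimal sketch of this kind of dynamic placement follows. It is not OpenIO's code: the Node class, node names and capacities are hypothetical, and I'm reading "least used capacity" as the node with the fewest GB currently in use (among nodes that still have room).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity_gb: float
    used_gb: float = 0.0


def place_object(nodes, size_gb):
    """Pick the node with the least used capacity at the time of the write."""
    candidates = [n for n in nodes if n.capacity_gb - n.used_gb >= size_gb]
    if not candidates:
        raise RuntimeError("no node has room for this object")
    target = min(candidates, key=lambda n: n.used_gb)
    target.used_gb += size_gb   # the next placement sees the updated usage
    return target


nodes = [
    Node("node-a", 1000, 400),
    Node("node-b", 1000, 250),
    Node("node-c", 1000, 300),
]
placements = [place_object(nodes, 100).name for _ in range(3)]
print(placements)   # ['node-b', 'node-c', 'node-b']
```

Because the decision is made per object from current usage, a newly added empty node starts attracting writes immediately, with no hash ring or ID-to-node map to rebalance.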

Data protection in OpenIO supports erasure coding as well as mirroring (replication). This can be set by policy and can vary depending on object size. For example, if an object is, say, under 100MB it can be replicated 3 times, but if it’s over 100MB it uses erasure coding.
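The size-based example above boils down to a simple rule. The sketch below is illustrative only: the function name is hypothetical, the 6+3 erasure coding layout is a placeholder I picked rather than an OpenIO default, and since the example doesn't say what happens at exactly 100MB, I've assumed erasure coding.

```python
MB = 1024 * 1024
SIZE_THRESHOLD = 100 * MB   # threshold from the example above


def protection_for(size_bytes):
    """Pick a protection scheme from object size.

    Objects under the threshold get 3 replicas; objects at or above it get
    erasure coding. The 6+3 layout is an assumed placeholder.
    """
    if size_bytes < SIZE_THRESHOLD:
        return "replication x3"
    return "erasure coding 6+3"


for size in (4 * MB, 500 * MB):
    print(f"{size // MB} MB -> {protection_for(size)}")
```

The rationale is that replication keeps small objects fast to read and write, while erasure coding keeps the capacity overhead down for the large objects that dominate total storage.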

OpenIO supports hybrid tiering today. This means that an object can move from OpenIO residency to public cloud (AWS S3 or Backblaze B2) residency over time if the customer wishes. In a future release they will support replication to public cloud as well as tiering. Many larger customers don’t use tiering because of the expense. Enrico says S3 is cheap as long as you don’t access the data.

OpenIO provides compression of objects, although many object storage customers already compress and encrypt their data and so may not use it. For those customers who don’t, compression can often double the amount of effective storage.

Metadata is just another service in the OpenIO cluster. This means it can be assigned to a number of nodes or all nodes on a configuration basis. OpenIO keeps their metadata on SSDs, which are replicated for data protection, rather than in memory. This allows OpenIO to have a light weight footprint. They call their solution “serverless” but what I take from that is that it doesn’t use a lot of server resources to run.

OpenIO offers a number of adjunct services besides pure object storage such as video transcoding or streaming that can be invoked automatically on objects.

They also offer stretched clusters where an OpenIO cluster exists across multiple locations. Objects can have dispersal-like erasure coding for multi-site environments so that if one site goes down you still have access to the data. But Enrico said you have to have a minimum of 3 sites for this.

Enrico mentioned one media & entertainment customer stored only one version of a video in the object storage but when requested in another format automatically transcoded it in realtime. They kept this newly transcoded version in a CDN for future availability, until it aged out.

There seems to be a lot of policy and procedural flexibility available with OpenIO but that may just be an artifact of running in Linux.

They currently support Red Hat, Ubuntu and CentOS. They also have a Docker container in beta test for persistent objects, which is expected to ship later this year.

OpenIO hardware requirements

OpenIO has minimal hardware requirements for cluster nodes. The only thing I saw on their website was the need for at least 2GB of RAM on each node.  And metadata services seem to require SSDs on multiple nodes.

As discussed above, OpenIO has a uniquely light weight footprint (which is why it can run on a Raspberry Pi) and only seems to need about 500MB of DRAM and 1 core to run effectively.

OpenIO supports heterogeneous nodes. That is, nodes can have different numbers and types of disks/SSDs on them, different processor and memory configurations, and different OSs. We talked about the possibility of having a node or disks go down and operating without them for a month, at the end of which admins could go through and fix or replace them as needed. Enrico also mentioned it was very easy to add and decommission nodes.

OpenIO supports a nano-node, which is just an (ARM) CPU, RAM and a disk drive, sort of like Seagate Kinetic and other vendors’ Open Ethernet drive solutions. These drives have a lightweight processor with a small memory running Linux and accessing an attached disk drive.

Also, OpenIO nodes can offer different services. Some cluster nodes can offer metadata and object storage services and others only object storage services. This seems configurable on a server basis. There’s probably some minimum number of metadata and object services required in a cluster. Enrico mentioned three nodes as a minimum cluster.

The podcast runs ~42 minutes. Enrico is a very knowledgeable industry expert and a great friend from multiple SFD/TFD events. Howard and I had fun talking with him again. Listen to the podcast to learn more.

Enrico Signoretti, Head of Product Strategy at OpenIO.

In his role as head of product strategy, Enrico is responsible for the planning, design and execution of OpenIO product strategy. With the support of his team, he develops product roadmaps from the planning stages to development to ensure their market fit.

Enrico promotes OpenIO products and represents the company and its products at several industry events, conferences and association meetings across different geographies. He actively participates in the company’s sales effort with key accounts as well as by exploring opportunities for developing new partnerships and innovative channel activities.

Prior to joining OpenIO, Enrico worked as an independent IT analyst, blogger and advisor for six years, serving clients among primary storage vendors, startups and end users in Europe and the US.

Enrico is constantly keeping an eye on how the market evolves and continuously looking for new ideas and innovative solutions.

Enrico is also a great sailor and an unsuccessful fisherman.