Rook Changes the Kubernetes Storage Landscape

Apr 15, 2019 | Hybrid Cloud

It’s no secret that if you want to run containerized applications in a distributed way, Kubernetes is the platform for you. As a container orchestration platform, Kubernetes has taken center stage for automating the deployment, scaling, and management of containerized applications. Red Hat’s own OpenShift Container Platform is a Kubernetes distribution optimized for enterprises.

Storage, however, has remained one of the areas with room for optimization. Containers, by their very nature, are usually small enough to be easily distributed and managed. They hold applications, but the data those applications use needs to be held somewhere else, for a number of reasons. Of particular interest in this post: we want to avoid the containers themselves becoming too large and unwieldy to be effectively managed.

Traditionally, storage has been set aside outside the container platform, with containerized apps saving and retrieving data there via plugins on the platform’s servers. This is true whether you are using a cloud, on-premises, or hybrid solution.

This approach has its drawbacks, though: external storage adds its own set of dependencies and management overhead to a deployment, not to mention potential lock-in to a specific storage vendor or solution.

These are the problems Rook is designed to help solve. The idea of this storage orchestration tool, according to Rook maintainer Travis Nielsen, is to have storage hosted within the container platform itself, providing the “framework and support for a diverse set of storage solutions to natively integrate with Kubernetes and OpenShift. No longer does storage need to be external to the platform. Now storage is part of the platform.”

The effect of such a design shift would be immediate: Applications, which would no longer be tied to an external storage solution, would become more portable than ever. Also, dedicated Kubernetes storage clusters would be possible.

Rook is named after the castle-like chess piece that, like a castle, protects its occupants. Rook, along with being protective, uses Kubernetes patterns such as custom types and controllers to make it look and feel like any other Kubernetes application. Specifically, Rook automates a host of storage-layer tasks, such as deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management. Automation of so many aspects of storage may greatly reduce administrative needs, too.

Rook’s primary storage platform is currently Ceph, a software-defined storage solution that provides file, block, and object storage services. Ceph has been run by enterprise customers for years and is the upstream project for the Red Hat Ceph Storage offering. It is not the only storage solution with which Rook can work, but it was the first, so a good deal of progress has been made with Ceph-based storage. Rook itself is currently at version 0.9.3, according to Nielsen, with the 1.0 release tentatively planned for the near future. Ideally, Nielsen added, some features around OperatorHub.io integration will be present in 1.0 as well.

Nielsen’s description of the project paints a picture of an elegant solution: Rook’s operators, connected to the Kubernetes API, directly manage the daemons of storage solutions such as Ceph, deploying and managing them as “pods.” Applications within the Kubernetes platform consume the storage in those pods via Rook agents that, from the application’s point of view, make the connection as transparent as it would be if the storage were hosted externally, off the platform.
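
From the application’s side, that transparency usually means asking for storage the same way it would on any Kubernetes cluster: with a persistent volume claim. The Python sketch below builds such a claim as a plain dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The helper `build_pvc`, the claim name `app-data`, the size, and the storage class name `rook-ceph-block` are assumptions made for illustration; the storage class in a real cluster is whatever the admin has defined.

```python
import json


def build_pvc(name: str, size: str, storage_class: str) -> dict:
    """Return a PersistentVolumeClaim manifest as a plain dict.

    build_pvc is a hypothetical helper for this post, not part of Rook.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # The storage class decides which provisioner serves the claim.
            "storageClassName": storage_class,
            # ReadWriteOnce: the volume is mounted read-write by one node.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "rook-ceph-block" is an assumed storage class name, used only as
    # an example; substitute the class defined in your own cluster.
    pvc = build_pvc("app-data", "10Gi", "rook-ceph-block")
    print(json.dumps(pvc, indent=2))
```

Nothing in the claim itself refers to Ceph or to where the data physically lives; only the storage class name ties it to a particular backend, which is what keeps the application portable.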

The operators within Rook are the key to what makes Rook successful, Nielsen explained. By taking on much of the workload through automation, operators reduce the complexity one would expect around orchestrating storage to manageable levels.

“If there is something that can be automated,” Nielsen said, “then an Operator is the way to do that. The Operator takes the storage settings specified by the admin and automates their configuration so the admin doesn’t have to worry about them.”
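
To make the idea concrete, here is a minimal Python sketch of the reconcile pattern that operators generally follow: compare the state the admin declared with the state that is actually running, then take one corrective step at a time until the two match. This is an illustration only, not Rook’s code; the names `StorageSpec`, `ClusterState`, `reconcile`, and `run_until_converged`, along with the fields they carry, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StorageSpec:
    """Desired state, as an admin might declare it (hypothetical fields)."""
    monitors: int
    storage_daemons: int


@dataclass
class ClusterState:
    """Observed state of the running daemons (hypothetical fields)."""
    monitors: int = 0
    storage_daemons: int = 0
    log: list = field(default_factory=list)


def reconcile(spec: StorageSpec, state: ClusterState) -> bool:
    """Move the observed state one step toward the spec.

    Returns True if a change was made, False if already converged.
    """
    for name in ("monitors", "storage_daemons"):
        desired = getattr(spec, name)
        actual = getattr(state, name)
        if actual < desired:
            # Too few daemons of this kind: start one more.
            setattr(state, name, actual + 1)
            state.log.append(f"start {name}")
            return True
        if actual > desired:
            # Too many daemons of this kind: stop one.
            setattr(state, name, actual - 1)
            state.log.append(f"stop {name}")
            return True
    return False


def run_until_converged(spec: StorageSpec, state: ClusterState,
                        max_steps: int = 100) -> int:
    """Call reconcile repeatedly; return the number of changes made."""
    steps = 0
    while steps < max_steps and reconcile(spec, state):
        steps += 1
    return steps
```

Starting from an empty cluster with a spec of three monitors and two storage daemons, `run_until_converged` makes five changes; lowering the spec afterwards makes the same loop stop the surplus daemon. The admin only ever edits the spec, and the loop works out the steps.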

Rook’s Ceph Operator, via Rook’s Custom Resource Definitions, can leverage the full power of Kubernetes and be used to manage storage systems at scale, enabling stateful upgrades, cluster rebalancing, and monitoring of storage cluster health. The Operator is also not on the data path between applications and storage, which means it can be offline for minutes at a time without bringing systems to a crashing halt.

Rook is an innovative tool that enables storage for Kubernetes apps through persistent volumes. If you’re interested in this technology either for Kubernetes or OpenShift, check it out on the project’s website, or download the Rook Operator Kit on GitHub.