Managing application and data portability at scale with Rook-Ceph

Feb 17, 2020 | Hybrid Cloud

One of the key requirements for Kubernetes in multi-cluster environments is the ability to migrate an application, with all of its dependencies and resources, from one cluster to another. Application portability gives application owners and administrators the ability to better manage applications for common needs such as scaling out, high availability, or simply backing up applications for disaster recovery. This post presents one solution for enabling storage and data mobility in multi-cluster and hybrid cloud environments using Ceph and Rook.

Containerization and container-native storage have made it easier for developers to run applications and get the storage they need, but as this space evolves and matures, it is becoming increasingly important to be able to move applications and their data from cluster to cluster and cloud to cloud.

With Ceph’s ability to run containerized within a Kubernetes cluster or as a stand-alone cluster external to Kubernetes, it is well positioned to facilitate data movement between any Ceph clusters, including existing brownfield clusters. Ceph mirroring is asynchronous, making it suitable for geographically distributed locations or high-latency networks. Combined with Ceph’s ability to run in the cloud on top of Amazon Elastic Block Store (EBS) or Google Persistent Disk (GPD), this creates uniformity across the different clouds for storage capabilities and features.

As mentioned in Scaling workload storage requirements across clusters, the core component and starting point of multi-cluster management is establishing a control plane that can orchestrate resources between clusters. Currently, application and data mobility in Kubernetes is addressed with KubeFed, together with the addition of Volume Snapshot and Volume Clone support.

KubeFed provides a centralized control plane to manage and push Kubernetes primitive resources to other clusters. Snapshotting captures a point-in-time view of the data that can be restored as a persistent volume, while cloning duplicates your data by making a copy of the volume. Although this space is rapidly evolving, the core concepts remain the same regardless of the technology chosen (KubeFed, push, pull, or hybrid push-pull).
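To make those two building blocks concrete, here is a minimal sketch of a CSI VolumeSnapshot taken from an existing PVC and a new PVC restored from it. The resource names, StorageClass, and VolumeSnapshotClass are placeholders, and the snapshot API version depends on your Kubernetes release; a clone works the same way, except the dataSource points at a PersistentVolumeClaim instead of a VolumeSnapshot.

```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: mysql-data-snap
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass   # placeholder snapshot class
  source:
    persistentVolumeClaimName: mysql-data             # existing PVC to snapshot
---
# Restore the snapshot into a new PVC in the same namespace.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-restore
spec:
  storageClassName: rook-ceph-block                   # placeholder StorageClass
  dataSource:
    name: mysql-data-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```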

There are many other full-service solutions to the application and data portability problem in the multi-cluster space. For this blog, however, we are going to take a deeper look at Rook-Ceph and how it addresses the needs for application and data portability. Ceph is a Software Defined Storage (SDS) technology that is native to containerized environments and offers a full portfolio of storage types, including filesystem, block, and object storage. Ceph is deployed and managed using the Rook operator framework (rook-ceph).

What is Rook?

Rook is a multi-service storage Operator designed to handle the orchestration and complexity of providing non-cloud-based storage in a Kubernetes environment. It supports several storage providers, such as MinIO, EdgeFS, and Ceph (CephFS, Ceph RBD, and Ceph object storage). In addition, it offers many other data service capabilities, such as operators for stateful workloads like Cassandra and CockroachDB, as well as object service libraries and operators like the ObjectBucket operator. For a deeper dive into everything Rook offers, take a look at the Rook project.
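To give a sense of what driving Rook looks like, below is a minimal sketch of the CephCluster custom resource that the Rook operator consumes to deploy Ceph; the image tag, host path, and sizing here are purely illustrative.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2         # illustrative Ceph Nautilus image
  dataDirHostPath: /var/lib/rook   # where Ceph keeps its configuration on each host
  mon:
    count: 3                       # three monitors for quorum
  storage:
    useAllNodes: true              # let Rook consume storage on every node
    useAllDevices: true            # and every unused device on those nodes
```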

Let’s focus on one particular use case where a traditional application is running on Cluster A using a MySQL database backed by Ceph Rados Block Device (RBD) storage. We need to move our application and its data from Cluster A to Cluster B.
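As a simplified sketch of that starting point on Cluster A, the manifests below request an RBD-backed volume through a Rook-provisioned StorageClass and mount it into MySQL. The StorageClass name, Secret, and sizing are assumptions for illustration, not fixed Rook names.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  storageClassName: rook-ceph-block   # assumed name of the RBD-backed StorageClass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret   # assumed pre-created Secret
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: mysql-data
```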

Figure: App A running on Cluster A with an independent Ceph RBD instance. Cluster A contains App A, Ceph CSI, Ceph RBD, the Rook Operator Framework, and a PV; Cluster B is prepared with only the Rook Operator Framework, Ceph CSI, and Ceph RBD.

Many of the full-service solutions mentioned above use at least some of the built-in Kubernetes components, such as snapshotting and cloning, to help achieve this data mobility. However, they are limited in scope to restoring the volume into a single namespace. Ceph offers another possible solution to this problem, called RBD mirroring, where we take the relevant RBD capabilities of Ceph and apply them in a containerized Ceph deployment.

Mirroring allows independent Ceph RBD instances to form a common storage pool, which can then mirror data in one or both directions between the RBD instances. Rook and its operator framework, using the features of Ceph and the Ceph Container Storage Interface (CSI), can help facilitate this functionality in a more automated fashion.

Although Ceph mirroring has been a stable feature for quite some time, it has traditionally been a manual process. With the addition of the Ceph CSI drivers to Rook, and with recent and near-future enhancements coming to Rook-Ceph, this process is expected to become completely automated. Later in this blog we will touch on how this might happen and what it could look like, but for now, let’s continue working through some of the other key features of Rook-Ceph that help achieve this data mobility.

Rook calling Ceph

As Rook deploys Ceph, it has the ability to bootstrap other Ceph RBD clusters to form a trusted storage pool. This storage pool is the common link that manages the Ceph storage images. Once this pool is established, mirroring can be enabled and any data created in Cluster A should be automatically journaled and duplicated in Cluster B.
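To make this concrete, a mirroring-enabled RBD pool might be expressed on Rook’s CephBlockPool resource roughly like the sketch below; treat the exact fields as illustrative, and note that exchanging the peer bootstrap token between the two clusters is a separate step handled by the operator or an administrator.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3         # three replicas within the local cluster
  mirroring:
    enabled: true
    mode: image     # journal and mirror individual RBD images to the peer cluster
```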

Note that you can have more than two clusters set up for mirroring, but there is typically a primary/replica-type relationship.

We now have a fully mirrored storage pool with RBD images, and in Cluster A these RBD images are exposed to our running application as Kubernetes Persistent Volumes (PVs) and Persistent Volume Claims (PVCs).

Figure: Bootstrapping across the two clusters forms a common storage pool, creating a bridge between the Ceph RBD instances of the two clusters to enable mirroring of data.

With our pool established and mirroring enabled, anything stored in our mirrored images on Cluster A is automatically duplicated to Cluster B. Now it’s time to move our application, but as we do this, we also need to connect the new application instance to the storage resources (the RBD images we are mirroring) so that when the application runs, it can use the mirrored data.

Another planned feature targeted for Rook-Ceph would allow our application specifications to point to a Kubernetes StorageClass and a Ceph CSI provisioner that supports static provisioning. What does this mean? In normal dynamic provisioning in Kubernetes, a developer requests storage for their application via a PVC and a new PV is created. This feature would allow the CSI driver to recognize that the RBD image we want to create a PV from already exists, and to use it rather than creating a completely new storage volume.
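A statically provisioned PV pointing at an already-mirrored RBD image might look roughly like the sketch below. The driver name, secret, pool, and volume attributes follow the Ceph CSI static-provisioning pattern, but treat them as assumptions that vary with the Rook and Ceph CSI versions in use.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-data-mirrored
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: rook-ceph.rbd.csi.ceph.com   # assumed <operator-namespace>.rbd.csi.ceph.com
    volumeHandle: mysql-data-image       # name of the existing, mirrored RBD image
    volumeAttributes:
      clusterID: rook-ceph
      pool: replicapool
      staticVolume: "true"               # tell the driver not to create a new image
      imageFeatures: layering
    nodeStageSecretRef:
      name: rook-csi-rbd-node            # assumed Rook-generated CSI node secret
      namespace: rook-ceph
```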

Our application can now run connected to the PV mirrored from Cluster A, giving us a fully redundant copy of our application running on Cluster B.
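A minimal sketch of how the application on Cluster B claims that pre-created PV, again with assumed names; because the PVC names the PV directly and sets an empty storageClassName, no new volume is dynamically provisioned.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ""              # skip dynamic provisioning entirely
  volumeName: mysql-data-mirrored   # bind to the statically created PV above
```

The MySQL Deployment from Cluster A can then be applied unchanged on Cluster B, since it references the PVC by name.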

Figure: App A is now fully mirrored to Cluster B, attached through a StorageClass and static provisioning to the mirrored PV and RBD image.

Improving automation of Rook-Ceph

What would this look like in a fully automated process, and how would this promote scalability?

As with other Kubernetes custom resources, there is a controller behind the scenes managing the automation and life cycle of those and related resources. Rook, coupled with the CSI operators and provisioners, will enable this full cycle of automation, similar to any other Kubernetes request process.

Imagine you could submit a PVC/StorageClass-like request in Kubernetes that is specific to Ceph and RBD mirroring. Within that request are the details the Rook-Ceph operators need to perform the necessary steps and automation to complete the end-to-end process of setting up a mirrored storage pool, images, and PVs. You could then scale up or down, easily moving data from A to B, all controlled by the normal Kubernetes mechanisms.
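Purely as a thought experiment, such a request could look something like the sketch below. This resource does not exist in Rook-Ceph today; the API group, kind, and fields are invented here only to illustrate the shape this automation might take.

```yaml
# Hypothetical resource: no such CRD ships with Rook-Ceph today.
apiVersion: example.rook.io/v1alpha1
kind: MirroredVolumeRequest
metadata:
  name: mysql-data-mirror
spec:
  sourcePVC: mysql-data       # PVC on the primary cluster to mirror
  pool: replicapool           # mirrored pool backing the RBD image
  targetCluster: cluster-b    # peer cluster that receives the journaled writes
  direction: one-way          # or two-way for active-active style setups
```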

Although there are many different solutions and models for solving application and data portability use cases, Rook plus Ceph is positioned to be a single-solution choice: Ceph can run containerized within a Kubernetes cluster or as a stand-alone cluster external to Kubernetes, its asynchronous RBD mirroring suits geographically distributed locations and high-latency networks, and its ability to run in the cloud on top of EBS or GPD creates uniformity across the different clouds for storage capabilities and features.

Figure: Ceph facilitates data movement between Ceph clusters through asynchronous RBD mirroring for geographically distributed and high-latency environments, and its ability to run in the public cloud provides much-needed uniformity of storage capabilities and features across clouds.

As these and future capabilities are developed, users will gain an increasingly refined experience for selectively managing their data across clouds and clusters in an efficient and optimized manner. The multi-cluster space is changing rapidly, and as technologies and models come and go, storage will remain the hardest and one of the most important aspects of multi-cluster scalability, and Rook-Ceph is here to help.