Developments in Kubernetes object storage support

Object storage is fast becoming a solution of choice for storing massive amounts of unstructured data.

The popularity of object storage is due in part to how efficiently it scales. This sets it apart from file and block storage, as users can quickly expand their storage footprint with much less overhead. Testing has shown that Ceph Object can ingest up to one billion objects, spread across ten thousand buckets, “with zero operational or data consistency challenges.” The stability, scalability, and sheer capacity of object storage have made it the ideal solution for technologies that can generate massive amounts of data at a time.

So what does this mean for Kubernetes storage?

To answer that, let’s first define what we mean by objects. An “object” is composed of data, metadata (usually including policy and user information), and a globally unique identifier. Objects are stored in a single, flattened layer of containers called buckets. Data within a bucket is accessed directly through a network entrypoint, with the object identifier being part of the address. Generally, object stores can contain tens of thousands of buckets, each holding tens or hundreds of thousands of objects. Instead of interacting with a filesystem or block device, users send and receive data through the object store’s RESTful interface, typically accessible from a single network entrypoint.
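To make the addressing model concrete, here is a minimal Go sketch of how a bucket name and object identifier can become part of an address under a single endpoint. The function name ObjectURL, the example hostname, and the path-style layout are assumptions made for illustration; individual object stores may lay out their addresses differently.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// ObjectURL builds a path-style address: one endpoint for the whole store,
// with the bucket name and object key appended to the URL path.
// (Hypothetical helper for illustration; not part of any library.)
func ObjectURL(endpoint, bucket, key string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", fmt.Errorf("parsing endpoint: %w", err)
	}
	u.Path = path.Join("/", u.Path, bucket, key)
	return u.String(), nil
}

func main() {
	// "2020/cat.jpg" is a single key; the slash is part of the identifier,
	// not a directory inside the bucket.
	addr, err := ObjectURL("https://objects.example.com", "photos", "2020/cat.jpg")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(addr)
}
```

Note that the bucket layer stays flat: anything that looks like a folder in the key is only a naming convention.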

Historically, object storage has not had core support in Kubernetes, leaving users to either seek out third-party solutions or roll their own. Either route can have unintended consequences. Vendors don’t typically provide generic object store support, instead developing Kubernetes APIs tailored to their platforms. As a result, users are likely to find themselves locked in to the first storage platform they adopt. Those who choose to build their own solutions take on a significant amount of technical debt.

Things are starting to change

The Emerging Technology group within Red Hat’s Office of the CTO has spent over a year working to fill this need. Initially, our efforts led to the development of the lib-bucket-provisioner, a Golang library encapsulating a Kubernetes controller and a set of Custom Resource Definitions (CRDs). Our aim was a generic object storage API, similar to PersistentVolumeClaims and PersistentVolumes, that would allow cluster users to dynamically provision object store buckets in a way that’s agnostic of the underlying provider.

The lib-bucket-provisioner library was conceived in part to give developers a low barrier to entry when working in Kubernetes. All Kubernetes controller logic is provided behind the API, such that developers are only required to code provisioning operations against their object store’s interface.  Lib-bucket-provisioner is currently in use in the Rook project and Red Hat OpenShift Container Storage.

The goal of the lib-bucket-provisioner project was initially to support rook-ceph.  At the time, contributors had already added automation for managing Ceph clusters, object stores, and users, leaving bucket provisioning as the next logical addition.  Ceph, as well as other object store platforms in Rook, implements an S3 interface.  Thus, a solution that supported S3 bucket management would benefit Rook in general. The project took inspiration from the Kubernetes external storage library, sig-storage-lib-external-provisioner. This library has since been deprecated in favor of Container Storage Interface (CSI).

The lib-bucket-provisioner defines two CRDs: an Object Bucket Claim (OBC) and an Object Bucket (OB). The OBC is a namespaced, user-facing API and represents a request by a user for a bucket. The OB is an automation- and admin-facing API that tracks the lifecycle of a provisioned bucket. Once a bucket is created by a backend store, the OB is “bound” to the OBC. At this point, a ConfigMap and a Secret are generated in the OBC’s namespace with the connection information and credentials necessary to access the storage endpoint. Pods can then reference these objects to inject the access information into their workloads.
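From the workload’s side, consuming that injected information might look like the Go sketch below. It assumes the Pod maps the ConfigMap and Secret into environment variables; the variable names (BUCKET_HOST, BUCKET_PORT, BUCKET_NAME, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) and the BucketConn type are assumptions for this example, so check the keys your provisioner actually writes.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// BucketConn holds what a workload needs to reach its bucket.
type BucketConn struct {
	Host, Port, Bucket   string // assumed to come from the ConfigMap
	AccessKey, SecretKey string // assumed to come from the Secret
}

// requiredVars lists the assumed environment variable names, in a fixed
// order so the first missing one is reported deterministically.
var requiredVars = []string{
	"BUCKET_HOST", "BUCKET_PORT", "BUCKET_NAME",
	"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY",
}

// LoadBucketConn reads the connection details through getenv
// (os.Getenv inside a Pod) and fails if any value is missing.
func LoadBucketConn(getenv func(string) string) (BucketConn, error) {
	for _, name := range requiredVars {
		if getenv(name) == "" {
			return BucketConn{}, fmt.Errorf("missing environment variable %s", name)
		}
	}
	return BucketConn{
		Host:      getenv("BUCKET_HOST"),
		Port:      getenv("BUCKET_PORT"),
		Bucket:    getenv("BUCKET_NAME"),
		AccessKey: getenv("AWS_ACCESS_KEY_ID"),
		SecretKey: getenv("AWS_SECRET_ACCESS_KEY"),
	}, nil
}

// Endpoint returns the host:port pair of the storage endpoint.
func (c BucketConn) Endpoint() string {
	return net.JoinHostPort(c.Host, c.Port)
}

func main() {
	conn, err := LoadBucketConn(os.Getenv)
	if err != nil {
		// Expected when run outside a Pod that injects these values.
		fmt.Println("bucket not configured:", err)
		return
	}
	fmt.Printf("bucket %q at %s\n", conn.Bucket, conn.Endpoint())
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the loader easy to exercise without a cluster.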

In order to provide the framework with the logic necessary for bucket lifecycle operations, vendors must implement a Go interface, which has only four methods: Provision(), Delete(), Grant(), and Revoke().  The library’s controller uses this interface in conjunction with data from the OBC and OB to create new buckets or grant access to existing ones, and to clean up buckets and credentials that it provisioned dynamically.
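As a rough picture of what a vendor writes, the Go sketch below defines an interface with the four method names and satisfies it with a toy in-memory store. Only the method names come from the library; the signatures, the Connection type, and memStore are simplified assumptions for this example and do not match the library’s real types.

```go
package main

import (
	"errors"
	"fmt"
)

// Connection is a hypothetical stand-in for the endpoint and credential
// data that a provisioner hands back to the controller.
type Connection struct {
	Endpoint, Bucket, AccessKey, SecretKey string
}

// Provisioner uses the four method names from the library; the
// signatures are simplified assumptions for this sketch.
type Provisioner interface {
	Provision(bucket string) (Connection, error)
	Grant(bucket string) (Connection, error)
	Delete(bucket string) error
	Revoke(bucket, accessKey string) error
}

var errNotFound = errors.New("bucket not found")

// memStore is a toy backend: each bucket maps to its set of access keys.
type memStore struct {
	endpoint string
	buckets  map[string]map[string]bool
	nextID   int
}

func newMemStore(endpoint string) *memStore {
	return &memStore{endpoint: endpoint, buckets: map[string]map[string]bool{}}
}

// newCreds mints a fresh key pair with access to an existing bucket.
func (m *memStore) newCreds(bucket string) Connection {
	m.nextID++
	key := fmt.Sprintf("key-%d", m.nextID)
	m.buckets[bucket][key] = true
	return Connection{Endpoint: m.endpoint, Bucket: bucket, AccessKey: key, SecretKey: "secret-for-" + key}
}

// Provision creates a new bucket and returns credentials for it.
func (m *memStore) Provision(bucket string) (Connection, error) {
	if _, ok := m.buckets[bucket]; ok {
		return Connection{}, errors.New("bucket already exists")
	}
	m.buckets[bucket] = map[string]bool{}
	return m.newCreds(bucket), nil
}

// Grant returns new credentials for a bucket that already exists.
func (m *memStore) Grant(bucket string) (Connection, error) {
	if _, ok := m.buckets[bucket]; !ok {
		return Connection{}, errNotFound
	}
	return m.newCreds(bucket), nil
}

// Delete removes the bucket together with every key that could reach it.
func (m *memStore) Delete(bucket string) error {
	if _, ok := m.buckets[bucket]; !ok {
		return errNotFound
	}
	delete(m.buckets, bucket)
	return nil
}

// Revoke removes one access key and leaves the bucket in place.
func (m *memStore) Revoke(bucket, accessKey string) error {
	keys, ok := m.buckets[bucket]
	if !ok {
		return errNotFound
	}
	delete(keys, accessKey)
	return nil
}

// Compile-time check that memStore implements all four methods.
var _ Provisioner = (*memStore)(nil)

func main() {
	var p Provisioner = newMemStore("http://store.example:80")
	conn, err := p.Provision("photos")
	fmt.Println(conn.Bucket, conn.AccessKey, err)
}
```

The split between Delete() and Revoke() reflects the cleanup rule described above: a bucket the library created can be removed outright, while for a bucket it merely granted access to, only the credentials go away.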

A new (a.k.a. greenfield) bucket is generated when a user creates an OBC. The library controller detects this object, creates an OB, and calls the Provision() method. This method is expected to return the connection endpoint as well as credential information, which is written to the OB. Once the OB is updated, the controller then generates a ConfigMap and Secret in the namespace of the OBC.

For existing (a.k.a. brownfield) buckets, an OB must be created by hand. This gives cluster admins a way to expose pre-existing buckets to cluster users. A user would then create their OBC with the OB name specified. The controller can infer from this pre-bound OBC that the user only requires access to the storage endpoint, so it calls the Grant() method. The Grant() logic should generate an identity with access to the bucket and return the credentials. This data is then written back to the OBC’s namespace, just as is done for greenfield buckets.
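The choice between the greenfield and brownfield flows can be sketched in a few lines of Go. This is a simplified illustration of the decision as described here, not the library’s controller code: Claim, Access, backend, Reconcile, and the ObjectBucketName field are all hypothetical names for this example.

```go
package main

import "fmt"

// Claim is a hypothetical, pared-down OBC. ObjectBucketName is set only
// when the user points the claim at an OB an admin created by hand.
type Claim struct {
	Namespace, Name  string
	ObjectBucketName string
}

// Access is the endpoint and credential data that ends up in the
// ConfigMap and Secret in the claim's namespace.
type Access struct {
	Endpoint, AccessKey, SecretKey string
}

// backend is a reduced, assumed view of the vendor interface: only the
// two methods involved in handling a new claim.
type backend interface {
	Provision(c Claim) (Access, error)
	Grant(obName string) (Access, error)
}

// Reconcile picks the flow for a claim: a pre-bound claim only needs
// access to an existing bucket, anything else gets a new bucket.
func Reconcile(b backend, c Claim) (flow string, a Access, err error) {
	if c.ObjectBucketName != "" {
		a, err = b.Grant(c.ObjectBucketName)
		return "brownfield", a, err
	}
	a, err = b.Provision(c)
	return "greenfield", a, err
}

// recorder is a fake backend that records which method was called.
type recorder struct{ calls []string }

func (r *recorder) Provision(c Claim) (Access, error) {
	r.calls = append(r.calls, "Provision("+c.Namespace+"/"+c.Name+")")
	return Access{Endpoint: "http://store.example:80", AccessKey: "new-key", SecretKey: "new-secret"}, nil
}

func (r *recorder) Grant(obName string) (Access, error) {
	r.calls = append(r.calls, "Grant("+obName+")")
	return Access{Endpoint: "http://store.example:80", AccessKey: "granted-key", SecretKey: "granted-secret"}, nil
}

func main() {
	r := &recorder{}
	claims := []Claim{
		{Namespace: "dev", Name: "new-bucket"},
		{Namespace: "dev", Name: "shared", ObjectBucketName: "ob-legacy"},
	}
	for _, c := range claims {
		flow, _, err := Reconcile(r, c)
		fmt.Println(c.Name, flow, err)
	}
	fmt.Println(r.calls)
}
```

In both flows the caller receives the same kind of Access value, which is why the step that writes the ConfigMap and Secret does not need to know which path produced it.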

Adoption and moving forward

The object storage library has been useful within Rook and has been adopted by a handful of other projects. However, some aspects of its design have hampered its popularity.

The library’s rapid pace of development made it difficult to stay up to date with the latest releases, and API changes would require dependents to increment their versions. Because the library directly depended on Kubernetes components, importers that also depended on Kubernetes repeatedly found themselves in dependency hell (prior to the release of Go modules) with mismatched Kubernetes versions.

The use of ConfigMaps and Secrets to deliver connection data to Pods was a necessary but cumbersome step in the design that required extra work from users. That said, it has ultimately served its purpose in accelerating Rook-Ceph object storage adoption, and more importantly, demonstrated that there was an appetite for such an API in Kubernetes.

The Kubernetes storage community has taken an interest in the need for a standardized object storage API. Engineers in the Emerging Technology team are collaborating with community members to design a fresh interface, taking into account the complexities of individual cloud providers and standalone object store platforms. Our in-flight Kubernetes Enhancement Proposal is a great starting point for catching up on the latest developments in this new model and the main thread for the ongoing dialogue.

With support from Kubernetes’ Storage Special Interest Group (SIG), we hope to introduce the first official Kubernetes object storage API. In the near term, we expect this API to integrate with container workloads by utilizing CSI ephemeral volumes, managed by a custom CSI driver. As the project matures, we can look forward to object storage becoming a first-class Kubernetes citizen, with complete integration into workload manifests.
