Crimson: evolving Ceph for high performance NVMe

When Ceph was originally designed, the storage landscape was quite different from what we see now. Ceph was initially deployed mostly on conventional spinning disks capable of a few hundred IOPS of random IO. Since then, storage technology has progressed rapidly, through solid-state drives (SSDs) capable of tens of thousands of IOPS to modern NVMe devices capable of hundreds of thousands of IOPS to more than a million. To enable Ceph to better exploit these new technologies, the Ceph community has begun work on a new implementation of the core ceph-osd component: Crimson. The goal is a replacement for ceph-osd that minimizes latency and CPU overhead, using high-performance asynchronous IO and a new threading architecture designed to minimize context switches and inter-thread communication when handling an operation.

Crimson’s focus is on minimizing CPU overhead and latency: while storage throughput has scaled rapidly, single-threaded CPU throughput hasn’t kept pace. With a CPU at around 3 GHz, you have a budget of about 20M cycles per IO for a conventional HDD and 300K cycles per IO for an older SSD, but only about 6K cycles per IO for a modern NVMe device. Crimson enables us to rethink elements of Ceph’s core implementation to properly exploit these high-performance devices.
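
To make the arithmetic concrete, here is a quick back-of-the-envelope sketch; the IOPS figures are the rough ones quoted above, not measurements:

```python
# Back-of-the-envelope budget of CPU cycles available per IO, assuming a
# ~3 GHz core and the rough device IOPS figures quoted above.
CPU_HZ = 3_000_000_000

device_iops = {
    "conventional HDD (~150 IOPS)": 150,
    "older SSD (~10K IOPS)": 10_000,
    "modern NVMe (~500K IOPS)": 500_000,
}

for device, iops in device_iops.items():
    print(f"{device}: ~{CPU_HZ // iops:,} cycles per IO")
```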

Continue reading “Crimson: evolving Ceph for high performance NVMe”

CI/CD without borders: recent improvements to multi-vendor solution testing

Because many Red Hat products are platforms, our engineering teams collaborate continuously with many of our ecosystem partners. Recent collaborations in solution testing have resulted in noticeable improvements to our core partner continuous integration/continuous delivery (CI/CD) tooling, known as Distributed CI (DCI).

As software systems grow more complex, more distributed in development, and more dependent upon services and virtualization components from different vendors, CI/CD has become critical to maintaining software stability. While this applies to all customers, the particular requirements of telco customers make working together on multi-vendor solution testing and delivery especially critical. More and more, customers expect that marketed partnerships translate into multi-vendor solutions that operate as if they were delivered by one company.

Continue reading “CI/CD without borders: recent improvements to multi-vendor solution testing”

Examining mailing list traffic to evaluate community health

Open source software communities have many modes of communication to choose from. Among them, mailing lists have long been a common way to connect with other members of the community. The sentiment and communication style on a mailing list can give good insight into the health of the community, and those interactions can become a deciding factor for new and diverse members considering becoming active in the community.

As the focus on diversity and inclusion increases in OSS communities, I have taken on the task of using ML/AI strategies to detect hate speech and offensive language within community mailing lists. The project’s scope starts with the Fedora devel and user mailing lists and will grow into a service applicable to any OSS mailing list. In this three-part blog series, we will go through the process step by step: first, data cleaning; second, model creation; and finally, building a service that notifies community managers of concerning behavior on their mailing lists. It is time to start using data science to support D&I efforts.
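
As a taste of what the model-creation step might look like, here is a minimal sketch of a text classifier built with scikit-learn; it is not the project’s actual pipeline, and the labeled messages are placeholders:

```python
# Minimal sketch of an offensive-language classifier; NOT the project's
# actual pipeline. Assumes scikit-learn and a labeled corpus of mailing
# list messages (the two examples below are placeholders).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

messages = ["thanks for the patch, merged", "this is garbage, get lost"]
labels = [0, 1]  # 0 = acceptable, 1 = flag for a community manager

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(messages, labels)

print(model.predict(["please rebase and resend"]))  # classify a new message
```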

Continue reading “Examining mailing list traffic to evaluate community health”

Using the Crossplane Operator to manage and provision Cloud Native Services

This post from the Red Hat Office of the CTO expands on previous work and further explores Crossplane as a Kubernetes Operator for provisioning, managing, configuring, and consuming cloud services. These services can in turn be used to create and deploy cloud-native applications.

In this post, we discuss what an enterprise implementation of Crossplane could look like for infrastructure teams and developers. We then create multiple collections of cloud infrastructure that abstract away the provisioning and configuration of managed resources. Finally, we create an instance of Quay that consumes the collection of abstracted AWS services and infrastructure.
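
To give a flavor of the developer experience, here is a hedged sketch of how a team might create a Crossplane claim through the Kubernetes API using the official Python client; the API group, kind, and spec fields are illustrative assumptions, since real claims depend on the Compositions and XRDs installed in the cluster:

```python
# Hypothetical sketch: creating a Crossplane claim via the Kubernetes API.
# The group/version/kind and spec fields below are illustrative assumptions,
# not taken from the post; real claims depend on the installed Compositions.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

claim = {
    "apiVersion": "database.example.org/v1alpha1",  # assumed example group
    "kind": "PostgreSQLInstance",
    "metadata": {"name": "quay-db", "namespace": "default"},
    "spec": {
        "parameters": {"storageGB": 20},
        "compositionSelector": {"matchLabels": {"provider": "aws"}},
    },
}

api.create_namespaced_custom_object(
    group="database.example.org",
    version="v1alpha1",
    namespace="default",
    plural="postgresqlinstances",
    body=claim,
)
```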

Note that this implementation is not currently supported, but we know people want to read about ideas at earlier stages.

Continue reading “Using the Crossplane Operator to manage and provision Cloud Native Services”

Developments in Kubernetes object storage support

Object storage is fast becoming a solution of choice for storing massive amounts of unstructured data.

The popularity of object storage is due in part to how efficiently it can scale. This in particular sets it apart from file and block storage, as users can quickly expand their storage footprint with much less overhead. Testing has shown that Ceph Object can ingest up to one billion objects, spread across ten thousand buckets, “with zero operational or data consistency challenges.” The stability, scalability, and sheer capacity of object storage have made it the ideal solution for technologies that generate massive amounts of data at a time.
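
For readers new to the object interface: any S3-compatible endpoint, such as a Ceph RADOS Gateway, can be exercised with a few lines of boto3. The endpoint and credentials below are placeholders:

```python
# Storing an object on an S3-compatible endpoint (e.g. Ceph RADOS Gateway).
# The endpoint URL and credentials are placeholders, not real values.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8000",  # assumed RGW endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

s3.create_bucket(Bucket="demo-bucket")
s3.put_object(Bucket="demo-bucket", Key="hello.txt", Body=b"hello, object storage")
```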

Continue reading “Developments in Kubernetes object storage support”

Quantum on OpenShift – part one, an introduction to quantum computing

Many people are talking about the use and purpose of quantum computing of late, so we wanted to take the opportunity to describe what Red Hat is doing in this space. This first post gives an overview of a few of Red Hat’s activities with quantum computing, beginning with some background.

The Emerging Technology team in Red Hat’s Office of the CTO has formulated a general goal: define how the classical and quantum spaces can be connected. Broadly speaking, we aim to use the OpenShift Container Platform to run and manage both classical and quantum applications, in essence running hybrid workloads in an open hybrid cloud.
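
As a concrete (if tiny) example of the kind of quantum workload in question, here is a minimal Bell-state circuit run on a local simulator; Qiskit and its Aer simulator are assumptions for the sketch, not something the post prescribes:

```python
# A minimal quantum "application": prepare and measure a Bell state.
# Qiskit and its Aer simulator are assumed; the post does not prescribe them.
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

qc = QuantumCircuit(2, 2)
qc.h(0)           # put qubit 0 into superposition
qc.cx(0, 1)       # entangle qubit 1 with qubit 0
qc.measure([0, 1], [0, 1])

counts = AerSimulator().run(qc, shots=1000).result().get_counts()
print(counts)     # roughly a 50/50 split between '00' and '11'
```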

Continue reading “Quantum on OpenShift – part one, an introduction to quantum computing”

Managing application consistency and state during Disaster Recovery for Ceph RBD mirroring

This is the third post in our series investigating how Rook-Ceph and RBD mirroring can best be used to handle Disaster Recovery scenarios. The first post in the series, “Managing application and data portability at scale with Rook-Ceph,” laid some foundational groundwork for how Rook-Ceph and RBD mirroring can enable application portability. In our second post, “Managing Disaster Recovery with GitOps and Ceph RBD Mirroring,” we covered some key features of Rook-Ceph RBD mirroring and presented a solution to help manage and automate failover using a GitOps model.

In this post we explore additional tools and concepts that help synchronize application consistency and state across multiple clusters, reducing manual steps and providing an automated approach to application recoverability and maintainability on failover.

Continue reading “Managing application consistency and state during Disaster Recovery for Ceph RBD mirroring”

Cloud-native software development with Virtual Application Networks

Communication between distributed software components is an important and challenging aspect of cloud-native development. This post introduces a solution to that problem using Virtual Application Networks (VANs). A VAN can be set up by a developer and used to connect the components of an application that are deployed in different public, private, and edge cloud environments.

Cloud-native development is about writing software so that it can be deployed easily, flexibly, and automatically into the hybrid-cloud ecosystem to take advantage of cloud scale. A big part of that is the ability to deploy components of a distributed system in different locations.

Continue reading “Cloud-native software development with Virtual Application Networks”

Enarx – project maturity update

It’s been a busy time since we announced Enarx to the world in August 2019, along with our vision for running workloads more securely. At the time, we had produced a proof-of-concept demo, creating and attesting a Trusted Execution Environment (TEE) instance using AMD’s Secure Encrypted Virtualization (SEV) capability, encrypting a tiny workload (literally a few instructions of handcrafted assembly language), and sending it to be executed. Beyond that, we had lots of ideas, some thoughts about design, and an ambition to extend the work to other platforms. Since then, a lot has happened: from kicking off the Confidential Computing Consortium to demos with AMD’s SEV and Intel’s Software Guard Extensions (SGX), from contributor improvements to the recent efforts to provide a Wasm module for multiple silicon vendor architectures.

Continue reading “Enarx – project maturity update”

Data integration in the hybrid cloud with Apache Spark and Open Data Hub

In this post we introduce the basics of reading Apache Spark DataFrames from, and writing them to, a SQL database using Apache Spark’s JDBC API.
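
As a minimal sketch of that API: the connection details and table names below are placeholders, and a real job would also need the JDBC driver jar on the Spark classpath (e.g. via --jars or spark.jars):

```python
# Reading and writing a DataFrame over JDBC. The URL, credentials, and
# table names are placeholders; the PostgreSQL driver jar must be on the
# Spark classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-example").getOrCreate()

jdbc_url = "jdbc:postgresql://db.example.com:5432/analytics"  # placeholder
props = {"user": "spark", "password": "secret", "driver": "org.postgresql.Driver"}

# Load a table into a DataFrame.
df = spark.read.jdbc(url=jdbc_url, table="events", properties=props)

# Write the (possibly transformed) DataFrame back out.
df.write.jdbc(url=jdbc_url, table="events_copy", mode="append", properties=props)
```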

Apache Spark’s Structured Streaming data model is a framework for federating data from heterogeneous sources. Structured Streaming unifies columnar data from differing underlying formats, and even completely different modalities (for example, streaming data and data at rest), under Spark’s DataFrame API.
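
To illustrate that unification, here is a short sketch in which a streaming source and a static file share the same DataFrame operations; the paths and schema are assumptions for the example:

```python
# Streaming data and data at rest under the same DataFrame API.
# The paths and schema below are assumptions for illustration.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("unified-example").getOrCreate()

schema = StructType([StructField("user_id", StringType()),
                     StructField("ts", LongType())])

stream_df = spark.readStream.schema(schema).json("/data/incoming/")  # streaming
static_df = spark.read.schema(schema).json("/data/at_rest/")         # at rest

# The same columnar operations apply to both modalities.
counts = stream_df.groupBy("user_id").count()
query = counts.writeStream.outputMode("complete").format("console").start()
query.awaitTermination()  # block while the stream runs
```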

Continue reading “Data integration in the hybrid cloud with Apache Spark and Open Data Hub”