Managing application and data portability at scale with Rook-Ceph

One of the key requirements for Kubernetes in multi-cluster environments is the ability to migrate an application, with all of its dependencies and resources, from one cluster to another. Application portability lets application owners and administrators better manage applications for common needs such as scaling out, high availability, or simply backing up applications for disaster recovery. This post presents one solution for enabling storage and data mobility in multicluster/hybrid cloud environments using Ceph and Rook.

Containerization and Container Native Storage have made it easier for developers to run applications and get the storage they need, but as this space evolves and matures, it is becoming increasingly important to be able to move your application and data around, from cluster to cluster and cloud to cloud.


Kiali: An observability platform for Istio

Istio exists to make life easier for application developers working with Kubernetes. But what about making Istio easier? Well, that’s Kiali’s job. Read on to learn more about making Istio even more pleasant to use.

Deploying and managing microservice applications is hard. When you break down an application into components, you add complexity in how those components communicate with each other. Getting an alert when something goes wrong, and figuring out how to fix it, is a challenge involving networking, storage, and potentially dozens of different compute nodes.


Current Trusted Execution Environment landscape

If you run software on someone’s servers, you have a problem. You can’t be sure your data and code aren’t being observed, or worse, tampered with; trust is your only assurance. But there is hope, in the form of Trusted Execution Environments (TEEs) and a new open source project, Enarx, that will make use of TEEs to minimize the trust you need to confidently run on other people’s hardware. This article delves into this problem, serves as a TEE primer of sorts by covering how TEEs work and what their limitations are, and explains how Enarx aims to work around these limitations. It is the next in a series that started with Trust No One, Run Everywhere–Introducing Enarx.


Scaling workload storage requirements across clusters

A number of multi-cloud orchestrators have promised to simplify deploying hundreds or thousands of high-availability services. But this comes with massive infrastructure requirements. How could we possibly manage the storage needs of a thousand stateful processes? In this post, we’ll examine how we can leverage these orchestrators to address our dynamic storage requirements.

Currently in Kubernetes, there are two approaches to how a control plane can scale resources across multiple clusters. These are commonly called the Push and Pull models, after the way in which configurations are ingested by a managed cluster. Despite their opposing names, these models are not mutually exclusive and may be deployed together to target separate problem spaces in a managed multi-cluster environment.
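To make the distinction concrete, here is a minimal Python sketch of the two models. It is illustrative only: `ControlPlane`, `ManagedCluster`, `push`, and `pull` are hypothetical names invented for this example and do not correspond to any particular orchestrator's API. It assumes that in the Push model the control plane writes configuration into each cluster, while in the Pull model something running in the managed cluster fetches the desired configuration itself.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedCluster:
    """Hypothetical managed cluster that records the configs applied to it."""
    name: str
    applied: dict = field(default_factory=dict)

    def apply(self, key: str, config: dict) -> None:
        self.applied[key] = config


@dataclass
class ControlPlane:
    """Hypothetical control plane holding the desired configuration."""
    desired: dict = field(default_factory=dict)

    def push(self, clusters: list) -> None:
        # Push model: the control plane writes each desired config
        # directly into every cluster it manages.
        for cluster in clusters:
            for key, config in self.desired.items():
                cluster.apply(key, config)


def pull(cluster: ManagedCluster, control_plane: ControlPlane) -> None:
    # Pull model: logic running on the managed cluster's side reads the
    # desired state and applies only what differs from its current state.
    for key, config in control_plane.desired.items():
        if cluster.applied.get(key) != config:
            cluster.apply(key, config)


# Both models can coexist: one cluster is pushed to, the other pulls.
control_plane = ControlPlane(desired={"storage-class": {"size_gi": 10}})
east = ManagedCluster("east")
west = ManagedCluster("west")
control_plane.push([east])
pull(west, control_plane)
```

Either path leaves both clusters holding the same configuration; what differs is which side initiates the transfer, which is why the two models can be combined in one environment.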


Prometheus anomaly detection

With an increase in the number of applications being deployed on Red Hat OpenShift, there is a strong need for application monitoring. Many of these applications are monitored via Prometheus metrics, so a large number of time-series metrics accumulate in a TSDB (time series database). Some of these metrics can have anomalous values, which may indicate issues in the application, but it is difficult to identify them manually. To address this issue, we came up with an AI-based approach: training a machine-learning model on these metrics to detect anomalies.
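As a rough illustration of what flagging an anomalous value in a metric series means, the sketch below uses a trailing-window z-score. This is a simple statistical stand-in, not the machine-learning model the post describes; the function name `find_anomalies`, the window size, the threshold, and the sample values are all assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices of values that deviate from the mean of the
    preceding `window` samples by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history: treat any change as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up samples standing in for one scraped metric; index 6 is a spike.
samples = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 42.0, 10.0, 9.8]
spikes = find_anomalies(samples)
```

A fixed threshold like this breaks down on metrics with trends or seasonality, which is one reason a trained model can be a better fit for real workloads.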


Sentiment analysis with machine learning

When developing a new technology, it really helps if you are also a user of that new tech. This has been Red Hat’s approach to artificial intelligence and machine learning: on the one hand, develop openly and exchange knowledge across the organization; on the other, use the same tools to work on interesting business problems. All while keeping a two-way exchange to and from the open source commons.

This is the sort of left-hand/right-hand move that data scientist Oindrilla Chatterjee began using in a project she started during an internship and later continued in a full-time role at Red Hat. Chatterjee and her team are looking at how to do sentiment analysis using machine learning on a dataset consisting of customer and partner surveys regarding a service offering.


Red Hat and NVIDIA bring scalable, efficient edge computing to smart cities

Teams from Red Hat and NVIDIA have collaborated on creating a scalable hybrid cloud application that could revolutionize smart city initiatives such as traffic-flow monitoring and transportation management around the world. Together, the two companies are creating solutions that make cities smarter and more efficient by processing sensor data in real time to provide insights into traffic congestion, pedestrian flow, and infrastructure maintenance.

The application runs on top of the NVIDIA EGX platform with the NVIDIA GPU Operator and is built with NVIDIA’s Metropolis application framework for IoT, which brings together capabilities for real-time image processing. The NVIDIA DeepStream SDK extracts metadata from live video streams at the edge, and the application then forwards the relevant metadata to the cloud for deeper analytical processing and display in the information dashboard depicted below.


Passing Go: polyglot Kubernetes Operators

Operators within Kubernetes are useful tools, designed to extend the container orchestration platform with additional resources. More directly, an Operator, sometimes referred to as a custom controller, is a method of packaging, deploying, and managing a Kubernetes application.

As useful as Operators are, they have had one limitation: originally they all had to be written in the Go programming language. Thanks to the Operator SDK, you do not need to develop your Operators in Go. The Operator SDK has options for Ansible and Helm that may be better suited to the way you or your team work. But it can still be limiting for dev teams trying to build an Operator if they don’t happen to be skilled in Helm or Ansible.


Diagnosing apps with AI

A well-known tactic for identifying the root cause of a production outage is to go back and see what the environment has been doing. By analyzing logs, developers and operators alike can find usage information that ideally reveals what’s wrong with a given application or how it can be improved.

In the early days of logging, there wasn’t a great deal of activity going on, so it was possible for a human being (or two) to examine such logs and figure out what was up. It didn’t hurt that the logs were not only sparse in content, but also not terribly complicated in terms of what they reported. Alerts such as “Help, my processor is melting” really didn’t take much effort to diagnose and fix. But over time, logs got far more voluminous and more detailed in what they were reporting. Applications are now also more distributed, which further complicates the situation.


Managing chaos in a containerized environment

Quick, name some weird stuff that’s happened to your production machines.

Accidentally dropping a production database table? Rolling out a patch that enabled any user to log in with any password? Disabling a load balancer? Using a dictionary to physically keep keyboard keys depressed so “terminals [could] repeatedly [hit] the enter key in order for the logins and print jobs of about 40,000 people to work”?

It’s happened to Alex Corvin, a senior engineer at Red Hat. Well, not that last one. But Corvin has been around long enough in his career to have met Mr. Murphy and his Law: if it can go wrong, it will.
