If you run software on someone’s servers, you have a problem. You can’t be sure your data and code aren’t being observed, or worse, tampered with — trust is your only assurance. But there is hope, in the form of Trusted Execution Environments (TEEs) and a new open source project, Enarx, that will make use of TEEs to minimize the trust you need to confidently run on other people’s hardware. This article delves into this problem, how TEE’s work and their limitations, providing a TEE primer of sorts, and explaining how Enarx aims to work around these limitations. It is the next in a series that started with Trust No One, Run Everywhere–Introducing Enarx.
A number of multi-cloud orchestrators have promised to simplify deploying hundreds or thousands of high-availability services. But this comes with massive infrastructure requirements. How could we possibly manage the storage needs of a thousand stateful processes? In this blog, we’ll examine how we can leverage these orchestrators to address our dynamic storage requirements.
Currently in Kubernetes, there are two approaches in how a control plane can scale resources across multiple clusters. These are commonly referred to as the Push and Pull models, referring to the way in which configurations are ingested by a managed cluster. Despite being antonyms in name, these models are not mutually exclusive and may be deployed together to target separate problem spaces in a managed multi-cluster environment.
With an increase in the number of applications being deployed on Red Hat OpenShift, there is a strong need for application monitoring. A number of these applications are monitored via Prometheus metrics, resulting in an accumulation of a large number of time-series metrics stored in a TSDB (time series database). Some of these metrics can have anomalous values, which may indicate issues in the application, but it is difficult to identify them manually. To address this issue, we came up with an AI-based approach of training a machine-learning model on these metrics for detecting anomalies.
When developing a new technology, it really helps if you are also a user of that new tech. This has been an approach of Red Hat around artificial intelligence and machine learning — develop openly on one hand, exchanging knowledge across the organization to use the same tools in the other hand to work on interesting business problems. All while keeping a two-way exchange to and from the open source commons.
This is the sort of left-hand/right-hand move that data scientist Oindrilla Chatterjee began using as part of a project she originally started during an internship, then later in a full-time role at Red Hat. Chatterjee and her team are looking at how to do sentiment analysis using machine learning on a dataset consisting of customer and partner surveys regarding a service offering.
Teams from Red Hat and NVIDIA have collaborated on creating a scalable hybrid cloud application that could revolutionize smart city initiatives such as traffic-flow monitoring and transportation management around the world. By working together, the two companies are creating solutions that make cities smarter and more efficient by taking sensor data and processing it in real-time to provide insights for traffic congestion, pedestrian flow, and infrastructure maintenance.
Running on top of the NVIDIA EGX platform with the NVIDIA GPU Operator, the application is built with NVIDIA’s Metropolis application framework for IoT that brings together innovative capabilities for real-time image processing where NVIDIA DeepStream SDK is used to extract metadata from live video streams at the edge. It then forwards the right metadata to the cloud for deeper analytical processing and further representation in an information dashboard depicted below.
Operators within Kubernetes are useful tools, designed to extend the container orchestration platform with additional resources. More directly, an Operator, sometimes referred to as custom controllers, is a method of packaging, deploying, and managing a Kubernetes application.
As useful as Operators are, they have had one limitation: originally they all had to be written in the Go programming language. Thanks to the Operator SDK, you do not need to develop your Operators in Go. The Operator SDK has options for Ansible and Helm that may be better suited for the way you or your team work. But, it can still be limiting for dev teams trying to build an operator if they don’t happen to be skilled in Helm or Ansible.
A well-known tactic for figuring out how to identify the root cause of a problem that has caused an outage in a production environment is to go back and see what the environment has been doing so far. Through the analysis of logs, developers and operators alike can determine usage information that ideally reveal what’s wrong with a given application or how it can be improved to work better.
In the early days of logging, there wasn’t a great deal of activity going on, so it was possible for a human being (or two) to examine such logs and figure out what was up. It didn’t hurt that the logs were not only sparse in content, but also not terribly complicated in terms of what they reported. Alerts such as “Help, my processor is melting” really didn’t take a lot to figure out how to fix. Applications now are more distributed and that further complicates the situation. But over time, logs got far more voluminous and more detailed in what they were reporting.
Quick, name some weird stuff that’s happened to your production machines.
Accidentally dropping a production database table? Rolling out a patch that enabled any user to log in with any password? Disabling a load balancer? Using a dictionary to physically keep keyboard keys depressed so “terminals [could] repeatedly [hit] the enter key in order for the logins and print jobs of about 40,000 people to work”?
It’s happened to Alex Corvin, a senior engineer at Red Hat. Well, not that last one. But Corvin has been around long enough in his career to have met Mr. Murphy and his Law: if it can go wrong, it will.
The prospect of true machine learning is a tangible goal for data scientists and researchers. It has been long known that the platform on which such ML apps can run have to be fast and hyper efficient so that learning can be that much faster. This is the motivation for Red Hat engineers in the Office of the CTO who are working to optimize such an open source platform: Open Data Hub.
Open Data Hub is built on Red Hat OpenShift Container Platform, Ceph Object Storage, and Apache Kafka/Strimzi integrated into a collection of open source projects to enable a machine-learning-as-a-service platform. That’s a lot of components to be integrated, and to ensure that their contributions to Open Data Hub perform well, Red Hat engineers have taken the step of creating an Internal Data Hub within Red Hat as a proving ground and learning environment.
When you run a workload as a VM, container or in a serverless environment, that workload is vulnerable to interference by any person or software with hypervisor, root or kernel access. Enarx, a new open source project, aims to make it simple to deploy workloads to a variety of trusted execution environments (TEEs) in the public cloud, on your premises or elsewhere, and to ensure that your application workload is as secure as possible.
When you run your workloads in the cloud, there are no technical barriers to prevent the cloud providers–or their employees–from looking into your workloads, peeking into the data, or even changing the running process. That’s because when you run a workload as a VM, container or serverless, the way that these are implemented means that a person or software entity with sufficient access can interfere with any process running on that machine.