Communication between distributed software components in a cloud-native application is an important and challenging aspect of cloud-native development. This post introduces a solution to that problem using Virtual Application Networks (VANs). A VAN can be set up by a developer and used to connect the components of an application that are deployed in different public, private, and edge cloud environments.
Cloud-native development is about writing software in such a way that it can be deployed easily, flexibly, and automatically into the hybrid-cloud ecosystem to take advantage of the scale of the cloud. A big part of taking advantage of cloud scale is the ability to deploy components of a distributed system in different locations.
To date, the architecture of distributed software has been dictated to a large degree by the structure of the underlying network. We see this most clearly in the predominant client-server pattern where applications are partitioned into clients that run on devices in private IP networks and servers that run in large data centers in the public network. If the application calls for a component in one private network to communicate directly with components in other private networks, the application will be designed with an intermediary server deployed somewhere in the public Internet. The code will be written to explicitly communicate via the intermediary.
In modern cloud-native development, we believe software should be partitioned into separate services based on how the developer expects to scale and deploy. For example, front-end web servers should be separate from back-end processors, which should be separate from services that contain valuable or sensitive data. This allows the front ends to be deployed near the users, the back ends to be scaled to handle the current workload, and the data to be kept safe in a private enterprise network.
The promise of cloud-native is flexibility in the deployment of an application:
- Do I have customers in Asia? I’ll deploy a front-end service in Asia so those customers can experience lower latency.
- Is there a burst of activity in my business? I’ll scale up the capacity of my application temporarily so my customers don’t experience a delay.
- Did my public cloud provider in central Europe go offline? I’ll handle that workload at a different provider or in a different availability zone.
- This database holds valuable and sensitive information which I don’t want to store in the public cloud. I’ll deploy it in my private network.
- I am testing a new feature for my application. I’d like to use my IDE on my laptop to run that service with live data from my production application.
The legacy TCP/IP internet is fast and reliable at moving data from one host to another, but its host-based addressing is not well suited to the needs of cloud-native software systems, where services scale, migrate, and often run in private networks or behind firewalls.
The Virtual Application Network
A Virtual Application Network (VAN) is an application-layer network that is overlaid on the TCP/IP-based internet. A VAN can be as small as a single laptop or as big as a global network spanning time zones and continents. A VAN can be set up quickly by a non-privileged developer and its topology can be added to and changed just as easily. Applications written to use existing protocols like HTTP, gRPC, AMQP, or anything that runs over TCP or UDP can run on a VAN without modification.
A VAN is established by deploying a VAN router in each site that will be part of the distributed application and by establishing secure connections between pairs of sites. A site can be a LAN in a data center, a single host, or a namespace in a Kubernetes cluster. As an example, the Skupper Project is a free and open-source implementation of a VAN built for Kubernetes.
Addressing and service discovery
The true power of virtual application networking comes from the way a VAN does addressing and service discovery. There are a number of important differences between VAN addressing and the internet’s TCP/IP addressing:
A VAN address is a DNS-style name
In a VAN, the name is the address, whereas in a TCP/IP network, the name is mapped to an IP address using a Domain Name System (DNS) lookup. The VAN routes application traffic based on the VAN address without regard for the underlying IP addresses that are used by the systems hosting the application.
A VAN address references a process, not a host
A VAN address references a running process whereas an IP address references the host on which the process runs. To reference a process in TCP/IP, a port number (a 16-bit numeric field) must be added to the address to denote the process on the addressed host.
The developer of a cloud-native application does not want to have to think about the hosts or the ports involved in running the application services. Modern cloud computing abstracts this detail away. Services may migrate from host to host. In multi-site situations, the hosts themselves may not be addressable from everywhere in the network because they are in private IP networks or behind firewalls.
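The difference between the two addressing models can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any VAN router is implemented: the `orders.backend` address, the process name, and the IP addresses and ports are all hypothetical.

```python
# Toy model (not a real VAN router): all names and addresses are hypothetical.

# TCP/IP view: a client must know the host and the port of the process.
tcp_endpoints = {("10.0.0.5", 8080): "orders-process"}

# VAN view: the network maps a DNS-style name to whatever process is attached.
van_attachments = {"orders.backend": "orders-process"}


def migrate(old_endpoint, new_endpoint):
    """Move the process to another host; only the TCP/IP key changes."""
    tcp_endpoints[new_endpoint] = tcp_endpoints.pop(old_endpoint)


migrate(("10.0.0.5", 8080), ("192.168.7.20", 9090))

# A client that cached the old host and port no longer finds the process...
stale_lookup = tcp_endpoints.get(("10.0.0.5", 8080))
# ...while a client that uses the VAN address is unaffected by the move.
van_lookup = van_attachments.get("orders.backend")
```

After the migration, the lookup by host and port comes back empty, while the lookup by VAN address still reaches the same process.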
VAN addresses are multi-access
IP addresses are unicast, which means an IP address refers to a single host. (Both IPv4 and IPv6 support multicast, and IPv6 defines anycast, but in practical terms these features are not generally available to applications across wide area networks.) VAN addresses are assumed to be multi-access, with multiple destinations using the same address. Forwarding semantics can be either multicast, where each destination receives a copy of all traffic, or anycast, where the network balances the traffic load across the set of destinations.
In a VAN, multicast traffic is efficiently distributed to all destinations across the whole (possibly global) network. Anycast traffic is balanced across all of the available destinations based on the actual real-time backlog of work in those destinations. This provides adaptive balancing to the destination or server that will provide the best latency for each particular request.
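The two forwarding semantics can be illustrated with a small Python sketch. It assumes a simplified model in which each destination reports a single integer backlog and the network picks the smallest one; the `Destination` class and both `forward_*` functions are hypothetical names for illustration, and a real VAN router may weigh other factors as well.

```python
from dataclasses import dataclass, field


@dataclass
class Destination:
    """A process attached to a shared VAN address (hypothetical model)."""
    name: str
    backlog: int = 0  # Requests delivered but not yet completed.
    received: list = field(default_factory=list)

    def deliver(self, message: str) -> None:
        self.received.append(message)
        self.backlog += 1


def forward_multicast(destinations, message):
    """Multicast: every destination receives its own copy."""
    for dest in destinations:
        dest.deliver(message)


def forward_anycast(destinations, message):
    """Anycast: exactly one destination, the one with the least backlog."""
    target = min(destinations, key=lambda d: d.backlog)
    target.deliver(message)
    return target


workers = [
    Destination("a", backlog=5),
    Destination("b", backlog=1),
    Destination("c", backlog=3),
]

chosen = forward_anycast(workers, "req-1")   # goes to the least-loaded worker
forward_multicast(workers, "config-update")  # goes to all three workers
```

In this sketch the sender addresses both messages to the same shared address; only the forwarding semantics decide whether one destination or all of them receive the traffic.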
Multi-access routing frees the developer from having to build scaling into the application code itself. Every service developed for an application can be scaled and distributed at run-time as needed.
Because the service name is the address used for routing in the VAN, service discovery is simplified. If three services are sharing a particular workload and a fourth is brought online, the VAN starts using the fourth service when it’s available. There is no need for clients of the service to be involved in that activity. Similarly, if the services in AWS ‘us-west’ fail, services in Azure ‘West US’ can seamlessly pick up the workload because the VAN knows where all the instances of the service are attached and can route to the ones closest to the demand.
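A minimal Python sketch of this behavior follows. It assumes a toy router that tracks a per-instance path cost from the client's site and always picks the cheapest live instance; the `ServiceRoutes` class, the `payments` address, the site labels, and the cost values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One running copy of a service (hypothetical model)."""
    name: str
    site: str
    cost: int  # Assumed path cost from the client's site; lower is closer.
    online: bool = True


class ServiceRoutes:
    """Toy routing table keyed by VAN address."""

    def __init__(self):
        self._instances = {}  # VAN address -> list of Instance

    def attach(self, address, instance):
        self._instances.setdefault(address, []).append(instance)

    def fail_site(self, site):
        """Mark every instance at a site as unreachable."""
        for instances in self._instances.values():
            for inst in instances:
                if inst.site == site:
                    inst.online = False

    def pick(self, address):
        """Return the closest instance that is still online."""
        live = [i for i in self._instances.get(address, []) if i.online]
        if not live:
            raise LookupError(f"no live destination for {address!r}")
        return min(live, key=lambda i: i.cost)


routes = ServiceRoutes()
routes.attach("payments", Instance("pay-1", "aws-us-west", cost=1))
routes.attach("payments", Instance("pay-2", "azure-west-us", cost=4))

first = routes.pick("payments")          # Closest live instance.
routes.fail_site("aws-us-west")
after_failure = routes.pick("payments")  # Same address, different site.

# A newly attached instance is used as soon as it is available.
routes.attach("payments", Instance("pay-3", "azure-west-us", cost=2))
after_scale_up = routes.pick("payments")
```

The client calls `pick("payments")` every time; which instance answers changes as sites fail and new instances attach, without any change on the client side.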
One of the key benefits of VAN addressing is that the topology of the application (i.e., which components interact with which other components) is independent from the topology of the underlying network. The developer does not need to consider the network topology when creating the application. This allows the network topology to be changed dynamically underneath the application. The application can be developed and tested on a single laptop, then portions of the application can be deployed in private or public cloud locations. As the application evolves, the scale and geographic reach of the underlying network can be increased to accommodate its needs.
Security
The cloud-native environment introduces problems for security that developers are not well equipped to deal with, and not all developers have the benefit of full security and operations teams. Simply deploying applications in public clouds, on virtual hosts, opens attack vectors that a developer cannot easily anticipate. Deploying applications to multiple cloud locations compounds the complexity of the situation.
An application running on a VAN benefits from the security mechanisms inherent to the network. All of the inter-site connections in a VAN are locked down using mutual TLS (Transport Layer Security) with a private, dedicated certificate authority. Access from the outside world only occurs where the developer wishes to provide public access to a portal or web front-end. There is no incidental exposure of services to the internet.
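To make the idea of mutual TLS with a private certificate authority concrete, here is a minimal sketch using Python's standard `ssl` module. It is an illustration of the general technique, not the configuration a VAN router actually uses. The function name and its parameters are hypothetical, and the file paths are optional only so that the sketch runs without certificate files at hand.

```python
import ssl


def make_site_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Build a server-side TLS context that requires client certificates.

    ca_file:   the private CA that signed every site's certificate (assumed).
    cert_file: this site's own certificate (assumed).
    key_file:  this site's private key (assumed).
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2

    # Mutual TLS: the peer must present a certificate, and it must verify.
    context.verify_mode = ssl.CERT_REQUIRED

    if ca_file is not None:
        # Trust only the dedicated private CA, not the system trust store.
        context.load_verify_locations(cafile=ca_file)
    if cert_file is not None:
        # Present this site's own certificate to the connecting peer.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


context = make_site_tls_context()
```

Because the context is created without loading the system's default certificates, a peer is accepted only if its certificate chains to the CA file that was explicitly loaded.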
Ease of use
One of the best attributes of a VAN is that it is a user-space technology. A developer can stand up a VAN quickly and easily without elevated privileges, cluster-wide Kubernetes access, or administrator access. As long as the developer has access to a set of Kubernetes namespaces, those namespaces can participate in a VAN.
As cloud-native development becomes more prevalent and distributed applications are increasingly written to take advantage of the promise of hybrid-cloud infrastructure, new abstractions for communication will be needed to back up that promise. The client-server addressing of TCP/IP will not be sufficient to meet that need. The application-level addressing and forwarding provided by a Virtual Application Network offer the advanced communication capabilities needed to make the most of modern cloud computing systems.
If you would like to learn more and try this for yourself, feel free to take one of the getting-started examples from the Skupper project for a spin on your own Kubernetes namespaces.