3 cloud-native challenges: How to get your app behavior in check

Cloud systems have become the infrastructure standard for everyone, from small startups to large-scale enterprises. You can shift any application to the cloud, but if it's not cloud-native, it won't fully leverage the cloud's benefits, such as load balancing and automatic scaling.

Software engineering has traditionally emphasized writing static and non-distributed applications, since most developers have not had the tooling, infrastructure, or need to build distributed systems. Now the necessary tooling and infrastructure are readily available, and high-capacity systems are in demand.

Developers face many challenges when making an app cloud-native, and the process goes against many basic assumptions and go-to habits that our field still teaches about writing software. Specifically, you need to do things differently with regard to handling persistent data, coupling and decoupling services, and reproducing an environment for local development.

However, you first need to understand the requirements for developing cloud-native apps. And to do that, start thinking about how cloud platforms behave, and the implications their behavior has for downstream apps.

What is cloud-native?

The Cloud Native Computing Foundation defines "cloud-native" as follows:

Containerized. Each part (applications, processes, etc.) is packaged in its own container. This facilitates reproducibility, transparency, and resource isolation.
Dynamically orchestrated. Containers are actively scheduled and managed to optimize resource utilization.
Microservices-oriented. Applications are segmented into microservices, which significantly increases the overall agility and maintainability of applications.

The “dynamically orchestrated” part of this definition should be of particular interest to you as a developer, because this is why containers (or at least bundled app images) are ubiquitous in cloud platforms. Beyond just requiring microservices in general, cloud platforms require specific microservice patterns and designs to operate effectively.


The cloud platform handles the basic runtime concerns of the services that run on it, as well as concerns about replication. Dynamic replication is what makes platforms so desirable, but it also gives developers who are new to cloud-native apps headaches. Using this replication creates the following requirements for applications:

Persistent data must be replicated logically within a service. Otherwise, the data is inconsistent, and not persistent at all.
No important state can depend on session pinning. Aside from the weaknesses of session pinning itself, individual containers will be spun down and replaced at some point. Session pinning can still be used, but only for caches or weak persistence.
Networking should happen through cloud platform interfaces. The platform handles discovery and routing from a single endpoint to a suitable container. That interface might be “real,” like an IP address, or “virtual,” like an internal service name.
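As a rough illustration of the session pinning requirement, here is a minimal Python sketch in which session state lives in a store outside the containers. SharedSessionStore and AppReplica are hypothetical names invented for this example, and the in-memory dictionary only stands in for whatever replicated cache or database a real deployment would use.

```python
from dataclasses import dataclass, field


class SharedSessionStore:
    """Hypothetical stand-in for an external, replicated session store."""

    def __init__(self) -> None:
        self._data = {}

    def load(self, session_id: str) -> dict:
        # Return a copy so callers cannot mutate the stored state by accident.
        return dict(self._data.get(session_id, {}))

    def save(self, session_id: str, state: dict) -> None:
        self._data[session_id] = dict(state)


@dataclass
class AppReplica:
    """One container instance of the app. It may be replaced at any time."""

    name: str
    store: SharedSessionStore
    # Local cache only: losing it costs speed, never correctness.
    local_cache: dict = field(default_factory=dict)

    def add_to_cart(self, session_id: str, item: str) -> list:
        # Always read the authoritative state from the shared store.
        state = self.store.load(session_id)
        state["cart"] = state.get("cart", []) + [item]
        self.store.save(session_id, state)
        self.local_cache[session_id] = state
        return state["cart"]


if __name__ == "__main__":
    store = SharedSessionStore()
    first = AppReplica("replica-a", store)
    second = AppReplica("replica-b", store)
    first.add_to_cart("session-1", "book")
    # A different replica serves the next request and still sees the cart.
    print(second.add_to_cart("session-1", "pen"))
```

Because no replica holds the only copy of the cart, the platform is free to route each request to any container, or to replace a container between requests.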

The key challenges to building cloud-native apps

Traditionally, application development has been about building static and monolithic systems. Many developers are in the habit of bundling together discrete units (e.g., the app and the underlying database) in a virtual machine, and binding data (typically accounts) to a virtual machine as a way to manually “scale” the app to multiple deployments. If you carry this habit over, you will likely end up stuck in the same pattern on a cloud platform: multiple single-replica deployments of your app.

To avoid that scenario, you need to explicitly design around the replication principles outlined above while ensuring compatibility with the specific behavior of the target platform. Let's look at three areas where this departs from traditional app development habits:

Handling and isolating persistent data
Coupling services according to interactions
Reproducing the environment for local development

Handling persistent data

Persistent state is an obvious, but huge, snag in creating a cloud-native application. Data is ephemeral by default. It can be bound to a persistent volume, but that’s not sufficient in most cases, since it limits the data store to a single replica.

To replicate the data store, you need to replicate it at the logical level. This is not something most cloud platforms can solve for you, since many assumptions need to hold in order to do filesystem or block-level replication. Supporting this is ultimately up to the IT operations team, but choosing a data store and replication pattern to commit to requires back-and-forth communication about functionality and operational tradeoffs.
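To make "replication at the logical level" a little more concrete, here is a toy Python sketch in which a primary ships an ordered log of write operations to its replicas, rather than copying disk blocks. PrimaryStore and StoreReplica are hypothetical names for this illustration. Real data stores implement this internally, with far more care around failures, ordering, and conflicts.

```python
class StoreReplica:
    """Read replica that rebuilds its state by replaying logical operations."""

    def __init__(self) -> None:
        self.data = {}
        self.applied = 0  # number of log entries already applied

    def apply(self, log: list) -> None:
        # Replay only the entries this replica has not seen yet, in order.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)


class PrimaryStore:
    """Accepts writes and records each one as a logical operation."""

    def __init__(self, replicas: list) -> None:
        self.data = {}
        self.log = []  # ordered (key, value) write operations
        self.replicas = replicas

    def write(self, key: str, value) -> None:
        self.data[key] = value
        self.log.append((key, value))

    def sync(self) -> None:
        # Ship the operation log, not the underlying files or blocks.
        for replica in self.replicas:
            replica.apply(self.log)


if __name__ == "__main__":
    replicas = [StoreReplica(), StoreReplica()]
    primary = PrimaryStore(replicas)
    primary.write("account:1", "alice")
    primary.write("account:2", "bob")
    primary.sync()
    print(all(r.data == primary.data for r in replicas))
```

A replica that joins late can catch up by replaying the same log from the beginning, which is part of why this kind of replication belongs to the data store itself rather than to the platform's volume layer.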

Due to synchronization, disk-binding, and boot-up concerns, persistent data stores are usually in a distinct deployment, not bound together with other microservices.

Coupling and decoupling services

Most platforms have a concept of grouping multiple containers tightly together so that they share a 1:1 lifecycle and communicate with low latency. In Kubernetes, this group is called a pod, and a deployment manages replicated copies of that pod.

Pods are extremely valuable for keeping response times snappy between separate microservices. And as a rough rule of thumb, the microservices for a given logical service should be configured in the same deployment. A given microservice can reside in multiple deployments, so something common, such as an authorization service, can be a “sidecar” in any deployment that needs to make authorization checks.

Sometimes microservices need to be less tightly coupled, typically due to the limitations of a specific service (such as a database), or because the deployment is too unwieldy and resource-hogging to schedule in one unit. When this is the case, you need to break them up into multiple deployments and pay more attention to the requests made.

Synchronous requests are the main contributors to unacceptable service-to-service latency. This is bad in any tech stack, but the farther apart services get (from a local Unix socket, to a spot behind some network address translation and routing on the machine, to a machine across the warehouse), the more it hurts. When requests are independent of one another, use threading or asynchronous support to perform them concurrently, then continue once all the required data has arrived.
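Here is a minimal sketch of that idea using Python's asyncio. The fetch function and the three service names are placeholders invented for this example, and asyncio.sleep stands in for a real network call.

```python
import asyncio
import time


async def fetch(service: str, delay: float) -> str:
    """Placeholder for a network call to another service."""
    await asyncio.sleep(delay)  # simulated network latency
    return f"{service}-data"


async def handle_request() -> dict:
    # The three calls do not depend on each other, so issue them together
    # and wait for all of them, instead of awaiting each one in turn.
    profile, orders, prefs = await asyncio.gather(
        fetch("profile", 0.2),
        fetch("orders", 0.2),
        fetch("prefs", 0.2),
    )
    return {"profile": profile, "orders": orders, "prefs": prefs}


if __name__ == "__main__":
    start = time.perf_counter()
    result = asyncio.run(handle_request())
    elapsed = time.perf_counter() - start
    print(result, f"{elapsed:.2f}s")
```

Awaited one after another, these calls would take roughly the sum of their latencies. Gathered, the total is closer to the latency of the slowest single call.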

Sometimes requests are a back-and-forth process, and cannot be made as parallel calls. This is especially common in databases, where poor code or poor data normalization and denormalization result in having to make successive queries on the path to an eventual update or select. Ultimately, you need to resolve a problem like this in the upstream service, be it the design of a database, the available API endpoints, or something else.
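One common form of this fix is to have the upstream service answer in a single round trip. The sketch below uses Python's built-in sqlite3 module with an invented two-table schema (accounts and bookings) to contrast a chatty access pattern with a single joined query. The schema, data, and function names are assumptions for illustration only.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE bookings (id INTEGER PRIMARY KEY, account_id INTEGER, item TEXT);
        INSERT INTO accounts VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO bookings VALUES (1, 1, 'kayak'), (2, 1, 'tent'), (3, 2, 'bike');
        """
    )
    return conn


def items_chatty(conn: sqlite3.Connection):
    """One query for the accounts, then one more query per account."""
    accounts = conn.execute("SELECT id, name FROM accounts ORDER BY id").fetchall()
    round_trips = 1
    result = {}
    for account_id, name in accounts:
        rows = conn.execute(
            "SELECT item FROM bookings WHERE account_id = ? ORDER BY id",
            (account_id,),
        ).fetchall()
        round_trips += 1
        result[name] = [item for (item,) in rows]
    return result, round_trips


def items_joined(conn: sqlite3.Connection):
    """The same data in a single query. Accounts without bookings are omitted."""
    rows = conn.execute(
        """
        SELECT a.name, b.item
        FROM accounts AS a
        JOIN bookings AS b ON b.account_id = a.id
        ORDER BY a.id, b.id
        """
    ).fetchall()
    result = {}
    for name, item in rows:
        result.setdefault(name, []).append(item)
    return result, 1


if __name__ == "__main__":
    conn = make_db()
    print(items_chatty(conn))
    print(items_joined(conn))
```

With an in-memory database the difference is invisible, but once every query crosses a network between deployments, each extra round trip adds its full latency to the request.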

Reproducing for local development

Once you have adopted a platform, you've shifted responsibilities from the host operating system to the container image and the platform itself. To accurately work with a microservice, or with especially high-level services or applications composed of many microservices, you will need to replicate this environment in part.

In extreme cases, developers can adopt a local install of the cloud platform, such as Minikube (a small Kubernetes bundle intended for local testing). A local install is certainly a faithful replica, but it significantly increases the delay involved in running and updating in-development code, as well as the platform-specific knowledge and coupling required to do so. This can be a big point of contention for developers working in scripting languages, where changes can otherwise take effect immediately, with no waiting or deploy steps.

An alternative to this that my team uses at Checkfront is Docker Compose. You can use this simple tool, which works similarly to how a pod operates in Kubernetes, to manage a set of containers. It lets you easily join together disparate services and repositories as a unified entity.

Some container runtimes, such as Docker, support mounting a local directory into a running container. This lets developers treat their workspace as a native install: changes are applied as soon as the code is reloaded, with no need to rebuild the image.

It's about composing your service interactions

Much of cloud-native design boils down to composing services and tailoring service interactions, from high-level service diagrams to low-level APIs and interaction details. These requirements come from the way replication and dynamic instancing work in cloud platforms, and from traditional requirements around being able to decompose the app for development, testing, avoiding failure cascades, etc. (By "decompose" I mean breaking it into isolated components: for example, running only the services related to user account functionality, or only those required to handle a specific API endpoint.) Many of these details are platform-agnostic, but some will require platform-specific lock-in or consultation with your IT Ops team.

Vallery will be at KubeCon/CloudNativeCon on May 2-4 in Copenhagen, giving a talk called “Challenges to Writing Cloud Native Applications.” She will dive deeper into the subjects mentioned here, and cover more pain points and designs. TechBeacon readers can receive a 20% registration discount by entering code KCCNEUTB.
