The essential guide to software containers for application development
Containers are exploding onto the application development scene, especially in cloud computing. This is largely because portability has been a major gap in that area, given the proprietary nature of some public clouds, and container technology abstracts applications into portable units that can be moved from cloud to cloud.
The architecture of containers is the other major benefit. There's now a standard way to divide applications into distributed objects or containers. Breaking applications up this way offers the ability to place them on different physical and virtual machines, in the cloud or not. This flexibility offers more advantages around workload management and provides the ability to easily make fault-tolerant systems.
Also, with the use of clustering, scheduling, and orchestration technology, developers can ensure that applications that exist inside of containers can scale and are resilient. These tools can manage groups of containers using a well-defined container management layer that provides these capabilities. As the container world continues to emerge, it's becoming difficult to build container applications without these management layers.
Finally, the popularity of containers has led many large companies, such as AWS, HP, and IBM, to pledge allegiance to them. This provides support directly from existing enterprise tools and technology. Numerous well-funded startups are appearing as well, with innovative solutions to make container development much more interesting and productive.
What does all of this mean to software engineers? To answer this question, here's a guide for leveraging software containers for those charged with application development, focused on what's important.
Docker, the most popular container standard, is an open-source project that provides a way to automate the deployment of applications inside software containers. Docker really started the container movement. However, it's not the only game in town. Companies such as CoreOS have their own container standard called Rocket, and many standards and products are being built around these technologies.
Don't let containers scare you. This kind of approach is nothing new—containers have been used for years as an approach to componentize whole systems, abstracting them from the physical platform, allowing you to move them around from platform to platform (or cloud to cloud).
Let's focus on Docker for now. Containers share the host's Linux kernel, which provides resource isolation (CPU, memory, I/O, network, and so on) without requiring any virtual machines to be started. Docker extends a common container format called Linux Containers (LXC) with a high-level API that provides a lightweight virtualization solution running processes in isolation. Docker also uses kernel namespaces to completely isolate an application's view of the operating environment, including process trees, network interfaces, user IDs, and file systems.
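To see that isolation in action, a single `docker run` can constrain a container's CPU and memory through cgroups while its namespaces hide the rest of the host. This is an illustrative sketch that assumes a working Docker install; the busybox image and the specific limits are arbitrary examples:

```shell
# Run busybox with a memory cap and a pinned CPU (cgroup limits),
# then list processes from inside the container. Because of PID
# namespace isolation, only the container's own processes appear.
sudo docker run --rm -m 256m --cpuset-cpus="0" busybox ps aux
```

None of this required booting a guest operating system, which is why container startup is measured in seconds rather than minutes.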
The use of this technology is rather exciting, considering it solves an obvious and expansive problem: how to provide true application portability across cloud platforms. While workloads can certainly be placed in virtual machines, containers are a much better approach, and they should have a higher chance of success as cloud computing moves from simple to complex architectures.
The ability to provide lightweight platform abstraction within the Docker container, without using virtualization, is much more efficient for creating workload bundles that are transportable between clouds. In many cases, virtualization is just too cumbersome for workload migration. Thus, containers provide a real foundation for moving workloads around within hybrid or multi-cloud environments without having to alter much or any of the application.
Containers have a few basic features and advantages, including the ability to:
- Reduce complexity through container abstractions. Containers don't require dependencies on the application infrastructure. Thus, you don't need a complex native interface to deal with platform services.
- Leverage automation to maximize portability. Automation has replaced manual scripting, and these days it's much easier to guarantee portability when automation, rather than hand-built scripts, drives deployment.
- Provide better security and governance, external to the containers. Security and governance services are platform-specific, not application-specific. Placing security and governance services outside of the container significantly reduces complexity.
- Provide enhanced distributed computing capabilities. This is due to the fact that an application can be divided into many domains, all residing within containers. The portability aspect of containers means they can execute on a number of different cloud platforms. This allows engineers to pick and choose the platforms that they run on, based upon cost and performance efficiencies.
- Provide automation services that leverage policy-based optimization. There needs to be an automation layer that can locate the best platform to execute on, and auto-migrate to that platform. At the same time, it must automatically deal with needed configuration changes.
How to scale container-based applications
Most who look to make containers scale take one of two basic approaches. The first approach is to create a custom system to manage the containers. This means a one-off system that you build to automatically launch new container instances as needed to handle an increasing processing load. But remember that if you build it, you own it. As with many DIY approaches, the maintenance will become labor- and cost-intensive.
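To make the DIY trade-off concrete, a custom scaler often amounts to something like this naive sketch. It assumes a working Docker install; "image_name" is the placeholder image from later in this article, and "get_load" stands in for whatever metric you monitor (queue depth, requests per second, and so on):

```shell
#!/bin/sh
# Naive DIY scaler sketch: keep the number of running containers
# for a hypothetical image roughly proportional to measured load.
while true; do
  running=$(sudo docker ps -q --filter "ancestor=image_name" | wc -l)
  desired=$(get_load)   # placeholder: compute desired instance count
  if [ "$running" -lt "$desired" ]; then
    sudo docker run -d image_name
  fi
  sleep 30
done
```

Everything beyond launching (health checks, scale-down, rescheduling after host failures) is still on you, which is exactly the maintenance burden described above.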
The second approach is to leverage one of the container orchestration, scheduling, and clustering technologies that will provide the basic mechanisms to enable scalability. This is normally the better of the two options.
There are a few choices out there for the second approach:
First, Google's Kubernetes is an open-source container cluster manager, much like Docker Swarm (discussed below). Kubernetes can schedule any number of container replicas across a group of node instances. This container replication and distribution trick is typically enough to make most large container-based applications scale as needed. This is pretty much the same approach to scaling containers that the other tools take.
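To make the replication idea concrete, here's a minimal, hypothetical Kubernetes manifest asking the cluster manager to keep three replicas of a container running; the names and image are placeholders:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-app
spec:
  replicas: 3           # Kubernetes keeps exactly 3 copies running
  selector:
    app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: image_name
        ports:
        - containerPort: 3000
```

Kubernetes then schedules those replicas across the available nodes and replaces any that fail, which is the heart of its scaling story.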
Second, Cloudify provides a Docker orchestration tool that overlaps with Docker Compose and Docker Swarm. Its YAML-based blueprints let developers describe complex topologies, including the infrastructure, middleware tier, and app layers. It's more orchestration-oriented, which makes it worth considering when you need orchestration and automation but not clustering.
Finally, the newest tool, Docker Swarm, provides clustering, scheduling, and integration capabilities. This tool enables developers to build and ship multi-container/multi-host distributed applications that include the necessary scaling and management for container-based systems. Obviously, Swarm is designed to compete with Kubernetes, which has a larger market share. Consider both tools when there's a need to massively scale containers. I would suggest a proof of concept with each technology, using real-world workloads.
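For comparison, a multi-container application headed for Swarm is typically described in a Compose file. This is a hypothetical two-service example; the service names and images are placeholders:

```yaml
# docker-compose.yml (illustrative)
web:
  image: image_name   # placeholder application image
  ports:
    - "3000:3000"
  links:
    - db              # connect the app container to the database
db:
  image: postgres
```

Swarm can then schedule those containers across a cluster of hosts rather than a single machine, while the file itself stays unchanged.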
Best practices continue to emerge around scaling containers, including:
- Devote time to the architecture of your container-based applications. Most scaling issues are traced back to poor designs, not poor technology.
- Always do a proof of concept to determine the real scaling capabilities of the solutions you're considering. Use automated testing tools to simulate the workloads and massive amounts of data for testing.
- Consider your own requirements. What works for other large companies may not be right for your container-based applications.
- Don't forget about security and governance. They have to scale as well.
I suspect that scaling containers will be a bit tricky until more is understood about how containers behave at scale. However, with a good understanding of the proper use of containers and the right technology, you'll be scalable right out of the gate.
Understand the steps
If you're running Linux already, then installing Docker won't be that complex. However, installing Docker on a Mac or Windows will require a few more steps. Just follow the appropriate OS instructions.
The next step is to attempt to run a Dockerized application. Docker has compiled a public registry of applications available as Docker images, and this community provides many jumping-off points for building and running your own container-based applications.
Once Docker is installed and online, run a Docker application image by entering:
sudo docker run --rm -p 3000:3000 image_name
There are a few more details, but for simplicity's sake, we'll leave them out of this discussion. Note that the "docker run" command above runs an image called image_name. If Docker can't find the image on your local system, it will check the public registry and pull it from there, if found.
The Docker container is simply a running instance of a Docker image, much like applications are instances of executables that exist in memory. So, you can launch multiple isolated instances of the app as containers on a single host. By adding "--rm" to the command, as done above, Docker is instructed to remove the container once it completes its task. This discards any changes the application may have made to the local environment but keeps the cached image.
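For example, the same image can be launched several times on one host, with each container mapped to a different host port. This sketch assumes image_name listens on port 3000, as in the earlier command:

```shell
# Two isolated instances of the same image on one host; -d runs
# each container in the background with its own host port mapping.
sudo docker run -d -p 3000:3000 image_name
sudo docker run -d -p 3001:3000 image_name
sudo docker ps   # lists both running containers
```

Each container gets its own process tree, file system view, and network mapping, even though both came from the same image.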
Building a Docker image for an application requires starting with a base image for the core OS, which runs in Docker. Install and configure the necessary tools, and then use the Docker "commit" command to save the container as an image. Finally, push it to the public Docker image registry or keep it private.
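The commit workflow looks roughly like this; the base image, the container ID placeholder, and the repository name are all hypothetical:

```shell
# Start from a base OS image and configure it interactively
sudo docker run -it ubuntu /bin/bash
#   ... inside the container: install and configure your tools ...
# From the host, save the modified container as a new image
sudo docker commit <container_id> myrepo/myapp
# Publish it to a registry (or push to a private registry instead)
sudo docker push myrepo/myapp
```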
Another way to create an image is to note the steps required to build the image in a well-formed Dockerfile file. This automates the process of installing and configuring the application, creating a repeatable process.
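A Dockerfile for a hypothetical Node.js app matching the earlier run command (which mapped port 3000) might look like this sketch; the base image, file layout, and start command are all assumptions:

```dockerfile
FROM node
# Copy the application source into the image
COPY . /app
WORKDIR /app
# Install dependencies at build time, not at run time
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]
```

Building then becomes a single repeatable step, "docker build -t image_name .", instead of a sequence of manual installs and a commit.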
As with any development process, there are more details that you need to understand to master building and running Docker images and containers. In many respects, the success of containers and Docker has been around the ease of development. As the standard and product progresses, things will likely get even easier.
A container in every shop
The tendency is to think that new ways of building systems will be the way that we build systems for years to come. While that hasn't been the case in the past, it could be the case with containers.
Containers deliver a standard, useful enabling technology and provide a path to application architecture that offers both managed distribution and service orientation. We've been trying to reach this state for years but have yet to succeed.
Perhaps what's most compelling is the portability advantage of containers, and that remains the battle cry of container technology providers these days. However, it'll be years before we really understand the true value of containers, as we move container-based applications from cloud to cloud.
I suspect that if this momentum continues, containers will be a part of most IT shops in the future—whether they're moving to the cloud or not. The viability and versatility of this technology will be something that we continue to explore and exploit over the next several years. Count on the fact that a few mistakes will be made, but the overall impact of containers is a foregone conclusion.