Microservices quality issues? A modern DevOps approach can help
Your team has followed industry trends and shifted from a monolithic system to a widely distributed, scalable, and highly available microservices architecture. But instead of solving all the problems with your fragile legacy architecture, you ended up with a set of federated services that have hidden dependencies maintained by teams that don't talk to each other.
You're stuck, unable to figure out which versions work together, and that drives a need to test your still-monolithic system in pieces and as a whole.
Sound familiar? Successful microservices require a combination of technical architecture, automation, testing, and development methodologies that all relate closely to agile and DevOps. Without the right mix, things rarely go as planned.
You may have a wide set of independently changing applications that have some sort of loosely coupled, not-really-specified, and still-present dependencies. The teams that maintain them hide behind poorly defined service APIs and embrace the freedom to move at their own pace.
You want to have a set of independent services where you can deploy small changes to production rapidly. Instead, you can't figure out which versions of the services work with each other, even in your test environments.
You need to bring the teams closer, continuously integrate the services, deploy those integrated services into various environments, and frequently test your still-monolithic system both in pieces and as a whole. This is a DevOps problem, which means your success is squarely anchored in a DevOps pipeline with a high degree of automated testing.
Here's what you need to know to tackle your microservices quality issues.
DevOps and microservices
One of my favorite quotes came from a smart person at a conference a few years ago. "If you are building microservices and you don't have a highly automated continuous integration/continuous delivery (CI/CD) pipeline in place, you've already lost," he said.
I often follow that up with my humble corollary: "Your efforts in DevOps and microservices will fail without proper automated testing and assessment that is fully integrated into the pipeline." Basically, you'll lose anyway.
Consider the example system in Figure 1. A few years ago, I worked on a 100-person team building a half-dozen independent services. They all had a web UI, an API layer, and a database. They worked together as a full service through a global "page service" central UI, and a few of them interacted with other remote services.
The teams were very good about building their services, sticking to a few standards we put in place, and making their code work—in isolation. We frequently heard "it works on my machine," and teams would then push code up to the Git server, where it would deploy to a test environment—sort of.
In reality, most of the time it would fail to deploy, or crash, or otherwise blow up the central test environment.
The teams had some great unit tests that proved that the code worked the way they wrote it, with 70% or more of code coverage. What they were not good at was proving that it worked the way others wanted it to work. We learned over time that changing a dozen independently deployable application services on a frequent basis required us to change the way we looked at testing across the entire system and find ways to integrate those tests into our process.
Use automation to follow best practices of CI/CD when building and assembling microservices. Consider the pipeline below:
Figure 2: Example of a CI/CD pipeline, with samples of testing tools and testing activities. Source: Coveros
In a DevOps pipeline, it's best to divide things into two primary segments, CI and CD. (Note that, for the scope of this discussion, I'm ignoring continuous monitoring, which is also important to ensure your system continues to run after you've deployed it.)
Continuous integration
This is responsible for producing a well-packaged, high-quality, deployable set of artifacts that are candidates for production deployment. This is where you build the software, run unit tests, scan it for defects, package it up for installation, and potentially even deploy and run it in temporary environments to prove that it is ready to move to the next phase.
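The fail-fast gating idea behind CI can be sketched in a few lines. This is an illustrative toy, not a real pipeline definition; the stage names and the always-passing placeholder commands are invented:

```python
# Illustrative sketch only: a CI stage runner that fails fast, mirroring the
# build -> unit test -> scan -> package flow described above. Stage names and
# the lambda placeholders are invented stand-ins for real pipeline commands.
def run_pipeline(stages):
    """Run CI stages in order; stop at the first failure (fail fast)."""
    for name, stage in stages:
        ok = stage()
        print(f"{name}: {'passed' if ok else 'FAILED'}")
        if not ok:
            return False
    return True

ci_stages = [
    ("build",      lambda: True),   # compile/package the service
    ("unit-tests", lambda: True),   # run unit tests with mocked dependencies
    ("scan",       lambda: True),   # static analysis and dependency checks
    ("package",    lambda: True),   # produce the deployable artifact
]

assert run_pipeline(ci_stages)
```

The point of the sketch is the early exit: a failed scan stops the artifact from ever becoming a deployment candidate.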
Continuous delivery
This process oversees installing, configuring, and testing your deployable assets in a way that proves they are, in fact, good candidates for production. In a continuous deployment world, you would fully automate this to the point where passing all the tests results in actual deployment to the production environment without human intervention.
Quality assessment in DevOps
Throughout the pipeline in Figure 2 above, there are tremendous opportunities to scan, test, poke, prod, and otherwise examine your microservices, both independently and together. During CI, you can often perform this initial set of activities on isolated code:
Perform unit and integration tests during the build phase using mocks and other simulated dependencies.
Scan the code using static analysis to look for common coding defects, standards compliance, security vulnerabilities, and general code hygiene and maintainability issues.
Analyze the open-source and other dependencies you use to make sure you are not pulling in code vulnerabilities, license limitations, or other known defects.
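The first activity above, testing isolated code against simulated dependencies, might look something like this minimal sketch. The service function and the `get_record` client method are invented for illustration:

```python
# Hypothetical example: unit-testing service logic in isolation by mocking
# its downstream dependency. RecordClient-style names are assumptions here,
# not part of any real service described in the article.
from unittest.mock import Mock

def fetch_record_title(client, record_id):
    """Service logic under test: look up a record and normalize its title."""
    record = client.get_record(record_id)
    if record is None:
        raise ValueError(f"record {record_id} not found")
    return record["title"].strip().lower()

# Replace the real downstream client with a mock so the test runs in isolation.
mock_client = Mock()
mock_client.get_record.return_value = {"title": "  Quarterly Report  "}

assert fetch_record_title(mock_client, 42) == "quarterly report"
mock_client.get_record.assert_called_once_with(42)
print("unit test with mocked dependency passed")
```

Because the dependency is simulated, this test can run in the build phase with no environment at all, which is exactly what makes it cheap enough to gate every commit.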
As you move to the CD stage, there are more activities that can validate the software on its way to production:
Provision the virtual machines, containers, or other compute resources you need to run the code, then scan those for known vulnerabilities, configuration problems, or other compliance issues.
Deploy your new services, then perform basic smoke tests or deployment tests to validate that your deployment automation code works and does the right thing to bring the services up to a "live" state.
Perform some manual, resource-intensive, or otherwise "human-driven" tests on the newly changed code. This could be exploratory testing, user acceptance testing, or other tests that are aimed at inspecting new features and changes before they flow further down the pipeline and cause problems for other services or teams.
Execute automated functional tests aimed at validating that the old code still works properly and the new code works as intended. Require test automation to be written at this stage for all new changes so it can become the "new" regression tests for future changes. These are done at various levels of the system: API-level service testing, UI-level user behavior, and even cross-service, end-to-end tests.
Run nonfunctional tests, including performance and security testing. A small batch of performance tests can keep a pulse on changes that might negatively affect performance in unexpected ways. You can run light-duty security scans on the software to make sure developers haven't inserted any egregious errors in the newly changed code.
As you move through various environments or test stages, the scope and depth of these functional and nonfunctional tests can be increased as the software progresses through increased levels of maturity.
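A deployment smoke test of the kind listed above can be as simple as hitting a health endpoint and failing fast. This self-contained sketch stands up a throwaway local server in place of the freshly deployed service; the `/health` path and JSON shape are assumptions, not a convention from the article:

```python
# Minimal, self-contained sketch of a smoke test: after the service comes up,
# hit a health endpoint and fail fast if it is not "live". The local server
# below merely stands in for a freshly deployed service.
import http.server
import threading
import urllib.request

class HealthHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(b'{"status": "UP"}')
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep test output quiet
        pass

# Stand-in for the freshly deployed service; port 0 picks a free port.
server = http.server.HTTPServer(("127.0.0.1", 0), HealthHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The smoke test itself: one request, one clear pass/fail signal.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/health", timeout=5) as resp:
    assert resp.status == 200, "service did not come up healthy"
    print("smoke test passed:", resp.read().decode())

server.shutdown()
```

In a pipeline, a nonzero exit from a script like this is what stops a broken deployment from blowing up the shared test environment.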
Testing in a world of microservices
The goal in both DevOps and microservices is to keep the software in a continuous working state. You want to establish confidence in change, both within and across services. Get your teams to focus on building quality in, and to agree that it's important.
(Important caveat: Anytime I say "quality," I really mean "quality, security, performance, maintainability, and any other -ility you can think of.")
In non-ideal microservices, your microservice API interfaces will be poorly documented, change in unpredictable ways over time, and be poorly tested. Corollary: Your teams will neither admit nor communicate this effectively and may not even be aware of it. The solution: Test the heck out of all of the interfaces.
You need to decide whether to deploy and test your applications into a shared environment or into independent environments. The pipelines for your microservices will each run independently and potentially in parallel as the teams make changes to the code. After the CI process vets those changes, you need to deploy them somewhere to perform functional and nonfunctional testing.
If you deploy these into a shared QA test environment, you run the risk of interrupting other tests and/or muddying the results when multiple changes occur at the same time. Alternatively, you could have dedicated test environments for each service. The problem here is that you run the risk of those environments drifting apart, with disparate versions of different services.
Figure 3: Independent pipelines deployed into shared or independent environments. Source: Coveros
The ideal solution is to do a little of both, which is where dynamic on-demand environments come into play. I cannot stress enough how valuable it is to launch a clean environment by replicating production versions of some (or all) of your services, then injecting your newly changed service into that for testing. This allows developers, QA testers, and ops people to run all kinds of validation whenever they need to do so.
Figure 4: Dynamic on-demand environments enable wide varieties of testing. Source: Coveros
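The "replicate production, inject the changed service" idea can be modeled in a few lines. This is a conceptual sketch only; the service names and version numbers are invented, and a real implementation would drive your provisioning tooling rather than a dictionary:

```python
# Conceptual sketch of a dynamic on-demand environment: start from the
# known-good production versions of every service, then inject the one
# newly changed candidate build for testing. All names/versions invented.
production_versions = {
    "page-service": "3.4.1",
    "record-mgmt": "2.0.7",
    "case-management": "1.9.2",
}

def build_test_environment(prod_versions, changed_service, candidate_version):
    """Replicate production, then swap in the candidate build under test."""
    env = dict(prod_versions)            # clean copy of known-good versions
    env[changed_service] = candidate_version
    return env

env = build_test_environment(production_versions, "record-mgmt", "2.1.0-rc1")
assert env["record-mgmt"] == "2.1.0-rc1"     # candidate injected
assert env["case-management"] == "1.9.2"     # everything else matches prod
print("ephemeral environment:", env)
```

The design point is that only one variable changes per environment, so any failure points straight at the injected service rather than at environment drift.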
Once you get your test environments set up, your next dilemma is which services to test, and when. Trying to run end-to-end tests everywhere all the time is an overwhelming task and will bog down testing. With microservices, you are often concerned with enforcing interface contracts among the various services across the system.
When services change, you need to focus on three key paths:
Test in isolation to verify that your service works according to the interface contract and behavior you think is correct. If you can do this by mocking your dependencies, you can easily debug and isolate any problems you encounter early in your change cycle.
Test with downstream dependencies (e.g., database, other services) to ensure your services work with your dependencies and in line with how you expect other contracts to behave. This is reasonably straightforward to test because you control the execution flow by driving your own service.
Test with upstream dependencies that drive your service (e.g., UI or other services). This verifies how other services think your contract is supposed to behave.
Figure 5: Testing with upstream and downstream dependencies. Source: Coveros
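The first path, testing in isolation against the interface contract, often reduces to checking that a response still carries the fields and types other teams rely on. A minimal sketch, with contract fields invented for illustration:

```python
# Sketch of an isolation-level contract check: fail if a service response
# drops a field or changes its type. The contract fields here are
# assumptions for illustration, not a real service's interface.
CONTRACT = {"id": int, "title": str, "status": str}

def check_contract(response, contract=CONTRACT):
    """Verify a response against the agreed interface contract."""
    for field, expected_type in contract.items():
        assert field in response, f"contract violation: missing field '{field}'"
        assert isinstance(response[field], expected_type), (
            f"contract violation: '{field}' should be {expected_type.__name__}"
        )
    return True

# A response from the (mocked or real) service under test.
sample = {"id": 7, "title": "Case file", "status": "open"}
assert check_contract(sample)
print("contract check passed")
```

Run against mocked dependencies, a check like this catches contract drift early in the change cycle, before the downstream and upstream test paths even come into play.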
Upstream testing is arguably the most important in the context of the overall system, because it ensures that changes to your service's internal behavior still work the way the system expects them to. This is where subtle, unspecified dependencies among services might crop up. It's also the most challenging, because Team A might not necessarily know how to drive Team W's service properly to execute the right integrations.
This is where test containers can solve important problems. Consider Figure 6, where the test candidate is "record-mgmt." There is another system, "case-management," that uses record-mgmt. The goal is to drive the case-management system to execute some integration tests against our record-mgmt service.
This project achieved the goal by publishing a "test-exec" container for each service. Each team produced a Docker container that included a bundled-up set of Postman tests that triggered one or more test suites against the service's API.
This made it easy for the record-mgmt team to execute the test-exec container for case-management to run a set of integration tests against record-mgmt, allowing easy upstream testing whenever they made changes.
Making it work in a multi-tenant environment
As you might imagine, all of the complications of developing microservices can be exacerbated by having a disparate set of teams all working independently and at different paces. In the world of agile software development, you often have cross-functional teams with a broad set of skills focused on developing a specific microservice. When left alone, these teams will forge off in different directions, potentially at the expense of other teams and the overall system.
It's beneficial to build cross-team functions to support standardization and provide guidance to the individual specialties on cross-functional development teams.
Figure 7: Cross-team functions augment cross-functional teams. Source: Coveros
In the figure above, there are specialty horizontal teams for testing, DevOps, and security. In practice, the testing team had a test lead who was responsible for producing standards, libraries, and tools that were used across all the agile development teams. She had a couple of test engineers responsible for maintaining the tools, code libraries, and other shared assets to support this.
Each agile team had one or more dedicated test engineers focused on solving that particular team's problems while consuming the standards, tools, and so on.
Communication is the core
All of this barely scratches the surface of the complications presented by modern software development and the use of microservices, but one thing is clear: Communication among services is at the core of everything.
Your services need to communicate with one another, and your teams need to do the same. With microservice implementations, test the heck out of them to make sure that communication works as expected.
Want to know more? Richard will give a short session on "A DevOps approach to ensure quality in microservices" as part of the STARWEST virtual conference, which runs October 3-8, 2021. Grab a free pass here. He'll also be teaching a more in-depth training about microservices and containerization later this fall. See dates and details on this dedicated training website.