
Are your microservices optimized for speed?

A microservices architecture lets your teams iterate faster by making changes independently of each other. With each iteration, they try to figure out what delivers business value and what does not.

To be able to do this, you need to optimize your systems (teams, tooling, processes) for speed of change. Watch out, however, for these hidden roadblocks. 

2016 State of DevOps Report

Move your goalposts

Optimizing for speed with microservices is quite different from how IT has traditionally operated. It has historically been responsible for improving the efficiency of existing manual or paper processes—automating things to reduce cost. Speed was a secondary concern. This included enterprise resource planning, accounting, customer relationship management, and others. IT was used as a way to support the business, and was seen as a cost center. Therefore, IT was highly optimized to reduce cost.

Today, competitors use technology to actually drive business value, rather than just as a way to support the delivery of projects. They do this by providing digital services optimized for the customer experience. They also use it to discover what customers value in the first place.  

To know what a customer values, you need to listen to them; you need to have a conversation. Customers get mad when they have to wait. They order things that aren't on the menu. And they expect to talk with a person, not an automated phone system. You need to align services with what customers value, not how your organization is structured. You need to have feedback loops, and as a provider of value, you need to respond quickly, and change your service based on feedback.

That's exactly what a learning system does. It has a purpose, or business value. It comes up with a hypothesis, experiments, and learns as a result of those experiments.

Wikipedia defines a system as "a set of interacting or interdependent component parts forming a complex/intricate whole." Likewise, a microservices architecture consists of a set of connected, interdependent things that work together to evolve and deliver value.

As with any system, the more connected a microservices architecture is, and the more dependencies that exist within it, the more pain you experience when you try to make changes.

Changes to one part of the system may benefit your system in one area, but adversely affect other parts of the system.

But the idea behind microservices is to optimize for speed, so you need to minimize dependencies and make explicit the ones you do have. Some people refer to this as "loose coupling," but there's a bit more to it. Let's explore the hidden dependencies that arise in a microservices system and can slow you down.

Choose your data representations carefully

Services in a system cooperate with each other through communication. Services send messages to each other using either asynchronous or synchronous protocols as part of a series of notifications, commands, or both. Additionally, a service may rely on a persistence mechanism to store its data for later retrieval (in a database, for example). Either way, data need to be marshaled to and from a representation that can exist outside of main memory, and that can be understood by collaborating services.
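
As a small sketch (in Python, with hypothetical field names, not taken from the article), here is what marshaling looks like in practice: an in-memory record is serialized into bytes that can cross the network, then reconstructed by a collaborating service.

```python
import json

# A sketch of marshaling: an in-memory record becomes a wire
# representation that can exist outside of main memory, and a
# collaborating service unmarshals it back.
order = {"id": 42, "items": ["book", "pen"]}

wire_bytes = json.dumps(order).encode("utf-8")     # marshal for the network
received = json.loads(wire_bytes.decode("utf-8"))  # unmarshal on the other side

assert received == order  # the round trip preserves the data
```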

Explicit coupling is a bit easier to see and understand: If the service moves to a different location or the service goes down, you'll be unable to call it.

Implicit coupling is harder to see. For example, one service might change its data format, and another might not be prepared for the change, but only fail on some requests.

For microservices communication patterns, you should favor some sort of standard serialization technology, not those built into the language you're using. That inevitably leads to a discussion about what technology would be best to use.

For example, do you use JSON or XML? XML has a generally accepted way of structuring data, along with data types, constraints via schemas, and so on. On the other hand, XML is more verbose and more complex. And even an XML document doesn't differentiate between a number and a string that happens to be made up of digits, unless you look at the schema. JSON, for its part, doesn't distinguish floating-point numbers from integers. Nevertheless, people use both XML and JSON regularly.
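
A quick sketch (in Python, with a made-up document) shows the XML side of this: without consulting a schema, element content is just text, so a zip code made of digits is indistinguishable from a number.

```python
import xml.etree.ElementTree as ET

# Without a schema, XML gives you only text: the "zip" element below
# is a string of digits, indistinguishable from a number.
doc = ET.fromstring("<customer><zip>85004</zip></customer>")
zip_text = doc.find("zip").text

print(type(zip_text))  # a str -- deciding it's a number is up to you
```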

Binary formats such as protocol buffers, Apache Thrift, and Apache Avro are also popular choices. Each differs in how it encodes binary representations, how efficient it is, and how it encodes schemas. But regardless of which representation you use, what happens when you try to change a service that requires a data representation change?

Preserve backward and forward compatibility 

The key here, regardless of format technology, is data compatibility. We need to maintain both backward and forward compatibility. Backward compatibility means that newer code, the result of code changes, should be able to distinguish and deal with data formats associated with older versions of the code. For example, going from a data representation of something like this in version 1:

{
  "first": "christian",
  "last": "posta",
  "location": "phoenix, az",
  "twitter": "@christianposta"
}

to this in version 2:

{
  "full_name_": "christian posta",
  "first": "christian",
  "last": "posta",
  "location": "phoenix, az",
  "twitter": "@christianposta",
  "occupation": "hacker"
}

or even something like this in version 3:

{
  "full_name_": "christian posta",
  "location": "phoenix, az",
  "twitter": "@christianposta",
  "occupation": "hacker"
}

With forward compatibility, you ensure that an older version of your software will still work, even when it receives newer versions of your messages. Forward compatibility is a bit harder to maintain, but with a little discipline, it's doable.

For example, if you are a consumer written against version 1 of the message above, you'd be forward compatible with version 2, where both the first and last fields are still present, so long as you don't over-specify which fields should or should not be there. But you would not be forward compatible with version 3, where first and last have been removed.

A basic principle when implementing forward compatibility is not to over-specify validation rules or assumptions when using a data format.
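
This "tolerant reader" discipline can be sketched as follows (a Python illustration using the message versions above; the reader function is hypothetical): consume only the fields you need and ignore the rest.

```python
# A sketch of a tolerant reader: touch only the fields you need and
# ignore everything else, rather than validating the whole message.
def read_name(message):
    return message["first"], message["last"]

v1 = {"first": "christian", "last": "posta", "location": "phoenix, az"}
v2 = dict(v1, full_name_="christian posta", occupation="hacker")  # new fields added

# The reader is forward compatible with version 2 because it never
# rejects or inspects the unknown fields.
assert read_name(v1) == read_name(v2) == ("christian", "posta")
# A version 3 message, which drops 'first' and 'last', would still
# break this reader -- forward compatibility has limits.
```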

The binary formats from above (protocol buffers, Thrift, Avro) present interesting schema evolution capabilities. For example, Thrift and protocol buffers let you change field names and still maintain compatibility, since field names are not encoded in the message (field positions are encoded via numeric aliases). On the other hand, you cannot add or remove fields that are marked as required without breaking backward compatibility.

With Avro, you pass along both the writer and reader schemas so that it can figure out forward and backward compatibility for you. You can actually specify default values for your optional fields, which also helps with forward/backward compatibility, even when removing fields.
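
The idea behind that resolution step can be sketched in plain Python (this is not Avro's actual API; the schema structure and field names here are invented for illustration): the reader's schema supplies defaults for fields an older writer never produced.

```python
# A hand-rolled sketch of schema resolution with defaults, in the
# spirit of Avro. None marks a required field with no default.
reader_schema = {
    "first": None,             # required
    "last": None,              # required
    "occupation": "unknown",   # optional: default applies if absent
}

def resolve(record, schema):
    out = {}
    for field, default in schema.items():
        if field in record:
            out[field] = record[field]
        elif default is not None:
            out[field] = default           # fill in the reader's default
        else:
            raise ValueError("missing required field: " + field)
    return out

old_record = {"first": "christian", "last": "posta"}  # written before 'occupation' existed
print(resolve(old_record, reader_schema))
# {'first': 'christian', 'last': 'posta', 'occupation': 'unknown'}
```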

Use contracts and validation

Maintaining forward and backward compatibility goes a long way toward reducing some of the pain you see when evolving services, but this pain is still hidden. For example, if the maintainers of services make changes to those services, you just have to hope that they maintain backward and forward compatibility. From the service provider's perspective, you're not sure what the backward/forward implications are for the changes you make. You need another way to make these types of dependencies explicit.

What if there was a way for consumers to tell producers that they're interested in only a subset of the messages or a subset of the body of a particular message? Maybe a consumer is only interested in the first and last fields of a message, and ignores everything else.

If that's possible, could the consumer use that to help drive the direction of the provider's contract? They could make more explicit the dependency they have on subsets of the contract. That would be a better approach than hoping your code is forward compatible and that the provider maintains backward compatibility.

Ian Robinson wrote a brilliant description of this approach, which he called consumer-driven contracts. This is a powerful way for consumers to declaratively publish the parts of the contract that are actually used, rather than just publishing everything.

In many cases, the consumer may find value in just a few fields from the response. For example, if you consume the message from above, and you really only care about the first and last fields, then you can specify that in a declaration of some sort, such as this pseudo declaration:

response {
  body([
    first: 'christian',
    last: 'posta'
  ])
}

If you have a way to provide this declaration to the provider of the service, then they have more insight into which part of the contract provides value to consumers — and therefore where to be more careful when making changes.
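
A minimal sketch of that check (in Python, with hypothetical names; real tooling does much more) treats the consumer's declaration as the set of fields it actually uses, and flags any provider change that drops one of them.

```python
# A sketch of checking a provider response against a consumer-declared
# subset of the contract: only the published fields are verified.
consumer_contract = {"first", "last"}   # fields this consumer actually uses

def satisfies(contract_fields, response_body):
    return contract_fields.issubset(response_body.keys())

v2_response = {"full_name_": "christian posta", "first": "christian",
               "last": "posta", "occupation": "hacker"}
v3_response = {"full_name_": "christian posta", "occupation": "hacker"}

assert satisfies(consumer_contract, v2_response)      # this change is safe
assert not satisfies(consumer_contract, v3_response)  # this one breaks the consumer
```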

As the provider of a service, it can be time-consuming and burdensome to maintain documents about what consumers care about, to associate those documents with versions of your contracts, and then to test manually that your changes don't break those contracts.

You can use tools that help automate this process, so that the formal specifications of a consumer's interests can be converted to unit tests and folded into the provider's code base. From this, the provider can also generate and release stub providers that pass the unit tests as well. In this way, consumers don't have to reinvent mock/stubs when developing locally, but can rely on the provider-generated stubs instead.

Pact (from the Pact Foundation), the Pact Broker, and Spring Cloud Contract all provide good solutions for this type of consumer-driven contract development.

Manage changes that affect promises

Ultimately, as a service provider, you state your promise to provide a service, maintain compatibility, and even allow consumers to help drive your contract. You may also interact, cooperate, and collaborate with other services that also state their promise to provide a service.

Our service and the collaborator services, however, are autonomous from each other, and should be able to change at their own cadence. That's the whole point of moving to a microservices architecture. For example, if my team provides a recommendation service to our website, we may depend on other services, such as a product service or sales service, in order to fulfill our promise. What we've offered to our consumers is the promise of providing recommendations of what products may go along with others that may be in the customer's shopping cart.

The thing about promises, however, is that there are times when they cannot be kept. If the product service or sales service is not available for some reason, they've broken their promise. But, as the owners of the recommendation service, you still have an obligation to provide recommendations, even though your dependencies are not available.

You'd like to be able to continue, even when your dependencies fail you, so you need to find alternative ways to fulfill the request. You only want to break your promise as a last resort. Things like circuit breaker patterns, or alternative-path patterns, help keep those promises.
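
A minimal circuit-breaker sketch (hypothetical Python, not a specific library) shows the shape of this: after repeated failures the breaker opens and an alternative path answers instead, so the recommendation service keeps its promise even while a dependency is down.

```python
# A minimal circuit breaker with a fallback (alternative path).
class CircuitBreaker:
    def __init__(self, threshold=3):
        self.failures = 0
        self.threshold = threshold

    def call(self, primary, fallback):
        if self.failures >= self.threshold:  # breaker open: skip the dependency
            return fallback()
        try:
            result = primary()
            self.failures = 0                # success closes the breaker again
            return result
        except Exception:
            self.failures += 1
            return fallback()                # dependency broke its promise; keep ours

def product_service():
    raise RuntimeError("product service unavailable")

def top_sellers():
    return ["best-selling book", "popular pen"]  # alternative-path recommendations

breaker = CircuitBreaker()
for _ in range(5):
    print(breaker.call(product_service, top_sellers))
```

After the threshold is reached, the breaker stops hammering the failing dependency entirely and serves the fallback directly, which is what lets the failing service recover.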

In an environment where things change fast, and can potentially break or become unavailable, you need to build your services with the premise that you are responsible, even if your dependencies fail. For more on Promise Theory, read Mark Burgess's book, "In Search of Certainty," which offers a deeper treatment of promises in distributed systems.

Stay focused: Optimize for speed

A microservices architecture will bring value only if you've focused on the right thing: optimizing for speed. But not everything in an organization needs to be optimized.

A microservices architecture is an evolving system you build to help you figure out and deliver business value through experiments, iteration, and feedback. In so doing, you must be mindful of explicit dependencies, as well as implicit dependencies that can make it difficult to make changes to your system. But if you build your systems with the full knowledge that uncertainty exists, and make that assumption right up front, you're much more likely to be successful.
