Best practices for scaling with DevOps and microservices

Wix is a highly successful cloud-based web development platform that has scaled rapidly. We now support more than 70 million sites in more than 190 countries, and we're adding customer sites at a rate of about 1.5 million a month. This translates into approximately 2 petabytes of user media files and adds about 1.5 terabytes per day. So how did we get there? The 400-strong Wix engineering team used a microservices architecture and MySQL, MongoDB, and Cassandra. We host our platform in three data centers, as well as on the cloud using Amazon Web Services (AWS) and the Google Cloud Platform.

I’ve been working with Wix on and off since its founding in 2006, and full time since 2010. In this time, I oversaw the engineering team’s transition from a traditional waterfall development-based approach to agile methodologies, helped introduce DevOps, and initiated the Wix SDK. Here's what I learned about using microservices and MySQL to effectively support a fast-scaling environment.

How Wix defines microservices

Wix currently has around 200 microservices, but we didn't start out on this path. During our days supporting 1 million sites back in 2008, we used a single-monolith approach with Java, Hibernate, Ehcache, Tomcat, and MySQL. This typical scale-up approach was useful in some aspects, but ultimately, we couldn’t tolerate the downtime caused by poor code quality and interdependencies inside the monolith. So, we gradually moved to a service-level-driven architecture and broke down our monolith.

By our definition, —(a single team can manage a few microservices), and the team must be able to describe each microservice’s responsibility in one clear sentence.

Specifically, a microservice is a single application deployed as a process, with one clear responsibility. It does not have to be a single function or even a single class. Each microservice writes only to its own database, to keep things clean and simple. The microservice itself has to be stateless to support frequent deployments and multiple instances, and all persistent states are stored in the database.

Wix’s four sets of microservices

Our architecture involves four main groups of services:

Wix Editor: This set of microservices supports creating a website. The editor is written in JavaScript and runs in a browser. It saves a JSON representation of the site to one of the editor services, which in turn stores the JSON in MySQL and then into the Wix Media Platform (WixMP) file system. The editor back-end services also use the Jetty/Spring/Scala stack.
Wix Public: This set of microservices is responsible for hosting and serving published Wix sites. It uses mostly MySQL and Jetty/Spring/Scala applications to serve the HTML of a site from the data that the Editor has created. Wix sites are rendered on a browser from JSON using JavaScript (React), or on the Wix Public server for bots.
WixMP: This is an Internet media file system that was built and optimized for hosting and delivering images, video, music, and plain files, integrated with CDNs, SSL, etc. The platform runs on AWS and the Google Cloud Platform, using cloud compute instances and storage for on-the-fly image manipulation and video transcoding. We developed the compute instances software using Python, Go, and C, where applicable.
Verticals: This is a set of applications that adds value to a Wix site, such as eCommerce, Shoutout, or Hotels. The verticals are built using an Angular front end and the Jetty/Spring/Scala stack for the back end. We selected Angular over React for verticals because Angular provides a more complete application framework, including dependency injection and service abstraction.

Why MySQL is a great NoSQL

Our microservices use MySQL, so scaling them involves scaling MySQL. We don’t subscribe to the opinion, prevalent in our industry, that a relational database can't perform as well as a NoSQL database. In our experience, engineers who make that assumption often ignore the operational costs, and don’t always think through production scenarios, uptimes, existing support options, knowledge base maintenance, and more.

We’ve found that, in most cases, we don’t need a NoSQL database, and that MySQL is a great NoSQL database if used appropriately. Relational databases have been around for more than 40 years, and there is a vast and easily accessible body of knowledge on how to use and maintain them. We usually default to using a MySQL database, and use NoSQL only in the cases where there’s a significantly better solution to the problem, such as if we need a document store or a solution for a high data volume that MySQL cannot handle.

Scaling MySQL to support explosive growth

Using MySQL in a large-scale system can present performance challenges. Here is a top 5 list of things we do to get great performance from MySQL:

Whenever possible, we avoid database-level transactions, since they require databases to maintain locks, which in turn have an adverse effect on performance. Instead, we use logical, application-level transactions, which reduce loads and extract better performance from the databases.
We do not use sequential primary keys because they introduce locks. Instead, we prefer client-generated keys, such as UUIDs. Also, when you have master-master replication, auto-increment causes conflicts, so you have to create key ranges for each instance.
We do not have queries with joins, and only look up or query by primary key or index. Any field that is not indexed has no right to exist. Instead, we fold such fields into a single text field (JSON is a good choice).
We often use MySQL simply as a key-value store. We store a JSON object in one of the columns, which allows us to extend the schema without making database schema changes. Accessing MySQL by primary key is extremely fast, and we get sub-millisecond read time by primary key, which is excellent for most uses. MySQL is a great NoSQL that’s ACID compliant.
We are not big fans of sharding because it creates operational overhead in maintaining and replicating clusters inside and across data centers. In terms of database size, we’ve found that a single MySQL instance can work perfectly well with hundreds of millions of records. Having a microservices architecture helps, as it naturally splits the data into multiple databases for each microservice. When the data grows beyond a single instance capacity, we either choose to switch to a NoSQL database that can scale out (Cassandra is our default choice), or try to scale up the database and have no more than two shards.

Takeaways

It's entirely possible to manage a fast-growing, scale-out architecture without being a cloud-native, two-year-old startup. It's also possible to do this while combining microservices with relational databases. Taking a long, hard look at both the development and operational pros and cons of tooling options has served us well in creating our own story and in managing a best-in-class, SLA-oriented architecture that drives our business growth.

Keep learning

Take a deep dive into the state of quality with TechBeacon's Guide. Plus: Download the free World Quality Report 2022-23.
Put performance engineering into practice with these top 10 performance engineering techniques that work.
Find to tools you need with TechBeacon's Buyer's Guide for Selecting Software Test Automation Tools.
Discover best practices for reducing software defects with TechBeacon's Guide.
Take your testing career to the next level. TechBeacon's Careers Topic Center provides expert advice to prepare you for your next move.

Read more articles about: App Dev & Testing, App Dev

You are here