What dev and app sec teams can learn from Dropbox's move to Python 3

Robert Lemos Freelance writer

In 2015, Dropbox embarked on a massive project to migrate its software infrastructure and development teams from Python 2 to the more modern Python 3. While Python's core developers had announced their intention to sunset Python 2 seven years earlier, it was only in 2015 that the payoff of converting all of Dropbox's software to the latest version of Python made sense.

Python 3 offered significant improvements, including fewer inefficiencies than Python 2, whose compilers and tool chains had become obsolete. In addition, after January 1, 2020, Python 2 would become a security risk, with newly reported vulnerabilities no longer fixed by the maintainers.

The benefits of moving to Python 3 simply seemed right at that point, said Damien DeVille, a staff engineer at Dropbox who worked on the migration from Python 2.

"Now, with every major version that comes out, we are brought up to date, so we get security and safety automatically—that, right there, is worth it. But it's also a huge boost in productivity. Every time there is a new version, it's an easier process than in the past to upgrade our infrastructure."
Damien DeVille

Other companies are facing the same decision: After nearly two decades, Python 2 is officially unsupported as of January 1, 2020. Third parties have pledged to provide support, for a fee, but leaving Python 2 code in production will be a greater risk as time goes on and new vulnerabilities come to light. Those issues won't be patched by Python's maintainers, and most popular packages won't provide patches either.

If your company is facing Python 2's end of life, or any other major migration, here are key lessons from Dropbox’s effort to port its code to Python 3.

1. Upgrade when it makes sense for your shop

The calculus for a massive migration to a new platform can be complex.

For example, in 2008, Python's core development team announced they would phase out what was at that time an eight-year-old implementation of Python 2 in favor of the latest major version of the language.

Yet major frameworks and libraries were slow to migrate their code, and without the commitment of those projects, moving to the latest version of Python did not make sense for many companies, especially because Python 3 is not source-code compatible with the previous version.

Python's core developers initially set the deadline for 2015, but later moved the end of life to Jan. 1, 2020.

"The main reason for delay is simply that it took quite a while for the relative benefits of the Python 3 series to become sufficiently compelling to justify the switching costs," said Nick Coghlan, inaugural Python Steering Council member and author of the Python 3 Q&A. "While for some projects, the switching costs were associated with lots of minor changes, one common major cost was updating their code to support the new text model in Python 3."
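The text-model change Coghlan mentions can be illustrated with a minimal sketch. In Python 3, bytes and str are distinct types, and moving between them requires an explicit encode or decode. The helper name to_text is hypothetical, and the sketch assumes the data is UTF-8.

```python
# Python 3 separates raw bytes from text; Python 2 let the two mix implicitly.

def to_text(value, encoding="utf-8"):
    """Return value as str, decoding bytes explicitly (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


raw = b"caf\xc3\xa9"   # UTF-8 bytes, as they might arrive from a file or socket
text = to_text(raw)    # the same content as a str

# Mixing the two types raises TypeError in Python 3 instead of guessing.
try:
    raw + "!"
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Code that relied on Python 2's implicit conversions has to decide, at every such boundary, which of the two types it actually means. That is why this change tended to be expensive to port.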

This time, there is no turning back. While Dropbox's migration is a fait accompli at this point, the benefits of migration were not initially a slam dunk, said Max Bélanger, a software engineer with Dropbox and author of a recent blog post on the company's migration.

"You had this weird decision to make between the existing stable language, where everything worked, and a new version of the language where most things didn't work, where probably things would run slower, and, oh yes, you had to relearn how to do some things because of the newer syntax," Bélanger said.

"It was not a no-brainer in the early days."
Max Bélanger

It still isn't clear-cut for some. About 13% of all Python developers continue to use Python 2 as their go-to language for development.

2. Get ready to remove dependencies

Not all of Dropbox's projects have moved to Python 3, and that means that some libraries and components that have not made the shift need to be replaced or jettisoned, said Bélanger.

A key consideration in migrating platforms is support among the most critical libraries for the new language or framework. Conversely, a Python 2 library that's no longer maintained presents a security risk that necessitates upgrading to Python 3.

"As you consider moving a project to [Python] 3, you get a critical mass, where enough dependencies have upgraded that it now makes sense. But get ready to remove dependencies that are not upgrade-compatible."
—Max Bélanger

"We removed a bunch of dependencies in our case, and most teams will have a mix of upgrading and removing dependencies, if they want to move quickly," he added.
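One rough way to take stock of dependencies is to read the metadata of installed packages and list those that do not advertise Python 3 support. The sketch below is a heuristic, not the process Dropbox used: it assumes Python 3.8 or later (for importlib.metadata), the function names are hypothetical, and a missing classifier is only a hint, since some packages support Python 3 without declaring it.

```python
from importlib import metadata  # standard library in Python 3.8+

PY3_PREFIX = "Programming Language :: Python :: 3"


def declares_python3(classifiers):
    """True if any trove classifier advertises Python 3 support."""
    return any(c.startswith(PY3_PREFIX) for c in classifiers)


def packages_without_py3_classifier():
    """Names of installed distributions that do not declare Python 3 support."""
    missing = []
    for dist in metadata.distributions():
        classifiers = dist.metadata.get_all("Classifier") or []
        if not declares_python3(classifiers):
            # Name can be absent in damaged metadata, so fall back to a marker.
            missing.append(str(dist.metadata.get("Name") or "<unknown>"))
    return sorted(missing)
```

Each name returned by packages_without_py3_classifier() is a candidate for a closer look: upgrade it if a compatible release exists, or plan to replace or remove it.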

3. Write code to work on both old and new versions

While Dropbox modified its Python code for a particular use case, the company did not use proprietary code in its migration. Instead, it wrote the code to work with both Python 2 and Python 3.

"The key thing is [working] incrementally and not forking; that never works," said Bélanger. "We used a syntax model where you could run both and, when you are ready, you can flip a switch and move everything to Python 3."
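A minimal sketch of what code that runs on both versions can look like, using only the standard library. This illustrates the general pattern rather than Dropbox's actual code, and the helper names ensure_text and host_of are hypothetical.

```python
import sys

PY2 = sys.version_info[0] == 2  # True only on a Python 2 interpreter

# Modules that moved between versions: try the Python 3 name first.
try:
    from urllib.parse import urlparse
except ImportError:  # Python 2 fallback
    from urlparse import urlparse

# `unicode` exists only on Python 2, so it is referenced only on that branch.
text_type = unicode if PY2 else str  # noqa: F821


def ensure_text(value, encoding="utf-8"):
    """Return value as text on either interpreter (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


def host_of(url):
    """Extract the host name from a URL given as bytes or text."""
    return urlparse(ensure_text(url)).netloc
```

Because the version-specific branches are confined to a few lines at the top, the rest of the module stays identical on both interpreters, and the fallbacks can be deleted once Python 2 is gone.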

A key part of the process is finding every instance of code running on the old platform. For Python 2—and many other languages and frameworks—that can be tackled in a standard way, said Python developer Coghlan.

"For finding internal projects that are unexpectedly based on Python 2, I recommend treating it like any other 'How do we know what open-source dependencies we're relying on?' question, and exploring service providers in the software composition-analysis space," he said.

4. Use migration tools

Dropbox used several migration tools to make its code compatible, including Six, a compatibility library that provides utilities for writing code that runs on both Python 2 and Python 3. Such an approach has become more popular, Python's Coghlan said.

"One particularly notable area of activity has been the significant growth in static analysis of Python code, where one of the major use cases has been to analyze Python 2 codebases as part of getting them ready for migration to Python 3."
—Nick Coghlan
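As a small illustration of this kind of static analysis, the sketch below uses the standard-library ast module to flag a few identifiers that exist in Python 2 but not in Python 3. It is a heuristic only: the lists of names are a hypothetical starting point, it can report false positives (for example, a local variable that happens to be called unicode), and it only handles source that already parses under Python 3.

```python
import ast

# Names and methods that exist in Python 2 but were removed in Python 3.
PY2_NAMES = {"unicode", "xrange", "raw_input", "basestring"}
PY2_METHODS = {"iteritems", "itervalues", "iterkeys", "has_key"}


def find_py2_idioms(source, filename="<string>"):
    """Return sorted (line, identifier) pairs for suspected Python 2 idioms."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Name) and node.id in PY2_NAMES:
            findings.append((node.lineno, node.id))
        elif isinstance(node, ast.Attribute) and node.attr in PY2_METHODS:
            findings.append((node.lineno, node.attr))
    return sorted(findings)


sample = (
    "for key, value in settings.iteritems():\n"
    "    label = unicode(key)\n"
    "for i in xrange(3):\n"
    "    pass\n"
)
findings = find_py2_idioms(sample)
```

For the sample above, findings holds one entry for each of lines 1, 2, and 3, naming iteritems, unicode, and xrange respectively.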

This is exactly what Dropbox did, using type annotations and other static analysis techniques so that the development environment flagged problems before the code ran, said Dropbox's DeVille.

"Before even trying to run the code, we would have a place to compile it and find obvious issues. That was a huge boost for productivity for us, because we could go through our code and compile it and catch errors before they run."
—Damien DeVille
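One literal reading of "compile it and catch errors before they run" can be sketched with Python's built-in compile(), which parses source without executing it. This is an assumption-laden illustration, not a description of Dropbox's tooling. The function name syntax_errors and the sample file names are hypothetical.

```python
def syntax_errors(sources):
    """Map each file name to its syntax error message, without running code.

    `sources` is a dict of {filename: source text}.
    """
    problems = {}
    for filename, source in sources.items():
        try:
            compile(source, filename, "exec")  # parse and compile only
        except SyntaxError as exc:
            problems[filename] = "line {}: {}".format(exc.lineno, exc.msg)
    return problems


sources = {
    "legacy.py": "print 'hello'\n",    # Python 2 print statement
    "ported.py": "print('hello')\n",   # valid on Python 3
}
problems = syntax_errors(sources)
```

Run under Python 3, only legacy.py is reported. The standard-library compileall module performs a similar check across whole directory trees. A type checker goes further by catching errors that are syntactically valid, such as passing bytes where text is expected.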

5. If all else fails, break glass ...

Developers who are truly facing down an end-of-support deadline and who don't have a handle on whether they have excised all old code from their environment can usually find third-party support. ActiveState, for example, is supporting some of the most popular Python 2 frameworks and will backport Python 3 fixes to the older platform, said Jeff Rouse, vice president of product at the firm.

"We recognize you cannot just have a gap in your ability to service your applications. We are trying to make sure that there is a good transition for organizations."
Jeff Rouse

Weigh up the risk

In the end, you must balance the efficiencies of remaining on the well-understood platform of Python 2 against the security and performance benefits of moving your code to Python 3. Yet, as more and more popular libraries and open-source frameworks end their support for Python 2, the clock is ticking as to how long you'll have the choice to remain with Python 2.

Note: The Python Software Foundation does not recommend specific solutions, but it does have its own guide with Python-specific advice, The Conservative Python 3 Porting Guide.
