The insider's guide: 40 must-follow IT ops leaders

Nicole Forsgren Partner, Microsoft Research, Microsoft

There has been much ado lately about the DevOps movement—the software development and delivery methodology that is revolutionizing the way that organizations deliver value as software eats the world. And yet so much attention is heaped onto the traditional rock stars of the movement: the developers who change their coding practices to embrace continuous integration, who understand their code needs to be scalable, who embrace failure and fast feedback...

Don't get me wrong. Every movement needs its rock stars out front, but what about the superheroes that are often overlooked, because they do their work so well they remain in the background, largely unnoticed, while keeping our increasingly complex infrastructures up and running? After all, this movement would just be Dev___ without some visionary, pioneering, enterprising ops people, willing to partner and experiment and improve right along with the devs. Without cooperation from both camps, this work would fail. (Sure, there are a handful of places that can do NoOps, but many would argue those are special cases and not applicable to all organizations, such as those with certain regulatory requirements.)

And so I bring you a list of 40 fantastic and influential IT operations professionals to follow online. Sure, there was another list earlier, and the individuals on that list are more than worthy of recognition, but many of them are not in IT operations. In my mind, this distinction is important, because IT operations professionals should be recognized and celebrated for the work they do—not forgotten as we celebrate the rock stars out front.

It may be worth clarifying exactly what IT operations is, and relatedly, what it means to be on this list. IT operations is the work used to keep infrastructure and services up and running—so we're talking about the people who currently manage, or at some point have managed, infrastructures (including public, private, or hybrid clouds), virtualization, containers, likely including configuration management, networking, and several *aaSs. Anyone who has not managed a serious infrastructure, no matter how smart and influential he or she may be, will not show up on this list. I also only included people who are relatively active online (so you can actually follow them), and people who largely tweet about ops, though many of them drift into DevOps at times.

The timing is also perfect, as this week is the 40th anniversary of the USENIX LISA (Large Installation System Administration) Conference, which is being held in Washington, D.C., this year. (@usenix #lisa15).

In alphabetical order by first name:

1. * * * * *

Your own unsung or invisible IT ops person or team. To quote Nathen Harvey, someone who is connected to innumerable IT operations communities, "Unfortunately, it's very difficult to follow the best IT Ops people on twitter....if they were doing less tweeting, they'd be better IT Ops people." So let's not forget the people keeping our own systems up and running. In fact, let's all mark July 29 on our calendars—it's Sysadmin Day! ("Sysadmin" is the old-fashioned term for IT operations, doncha know.) If they aren't on Twitter, go say hi on occasion. Make sure your latest deploy didn't set something on fire.

2. Adam Jacob

Adam Jacob is co-founder and creator of Chef, one of the major configuration management solutions in the market. Prior to Chef (nee Opscode), he founded HJK Solutions, a consulting practice specializing in automating infrastructure. Lover of metal. (Full disclosure: He is my boss.)

3. Adrian Cockcroft

Adrian Cockcroft is a technology fellow at Battery Ventures. He was previously responsible for Netflix's migration to its highly scalable platform. Prior to Netflix, he was a founding member of eBay Research Labs and a Sun Microsystems distinguished engineer and chief architect in high-performance computing. He speaks at various conferences around the world and advises startups in IT infrastructure.

4. Anna Shipman

Anna Shipman is technical architect at Government Digital Service, UK. She has a strong development background, but cut her teeth in ops by supporting large infrastructure at gov.uk. Blogs at http://www.annashipman.co.uk/.

5. Avleen Vig

Avleen Vig is operations engineer at Etsy. He's a regular speaker at conferences like USENIX LISA and Velocity about operations or culture. Blogs at http://silverwraith.com/blog/.

6. Ben Rockwood

Ben Rockwood. Director of Operations at Chef, previously rocked Operations at Joyent. Gave what is largely considered the definitive DevOps for Ops talk at LISA in 2011. Blogs at http://cuddletech.com/blog. Lover of scotch.

7. Brendan Gregg

Brendan Gregg is currently a computer performance analyst at Netflix, and he also spent some time at Joyent. He does some incredible work with performance, monitoring, and visualization (pro tip: check out his work with flame graphs), and presents on his work regularly at conferences like Velocity and USENIX LISA. Shares cool stuff at http://www.brendangregg.com/.

8. Bridget Kromhout

Bridget Kromhout is principal technologist for Cloud Foundry at Pivotal, with prior ops experience (running Docker in production, natch) at DramaFever. Co-organizer of Velocity and DevOps Days, and regular speaker at several DevOps conferences and events. Blogs at http://bridgetkromhout.com/. (Currently?) has cool pink streaks in her hair.

9. Carolyn Rowland

Carolyn Rowland is IT ops and dev manager at NIST, with a background in ops. Board member at USENIX. Chair of USENIX LISA in 2012, Co-Chair of USENIX Women in Advanced Computing, with Nicole Forsgren, in 2012 and 2013. Speaks at various conferences about the intersection of technology and management, education, and culture.

10. Caskey Dickson

Caskey Dickson is Azure SRE at Microsoft, and previously was SRE at Google. Known for SRE, operations, and monitoring expertise. Speaks at various conferences about operations and monitoring of large, complex infrastructures.

11. Charity Majors

Charity Majors is production engineering manager at Parse/Facebook. She has experience scaling both ops infrastructures and ops teams. Prior to Parse/FB, she was at Linden Lab, Shopkick, and Cloudmark. She is especially passionate about resilient systems and self-healing architectures.

12. Christopher Webber

Christopher Webber is engineering manager at Chef, with previous large ops experience at Demand Media and UC Riverside. Known in both Puppet and Chef communities. Co-hosts the podcast Ops All the Things. Blogs at http://cwebber.net/.

13. Dave Zwieback

Dave Zwieback is head of engineering at Next Big Sound (Pandora) and CTO at Lotus Outreach. Known for his work with complex, mission-critical systems and teams. Author of Beyond Blame: Learning from Failure and Success.

14. Ernest Mueller

Ernest Mueller is product manager at Idera, with a background in UNIX system administration, programming, and architecture. Experienced with release engineering and web operations, he brings a holistic view to DevOps. Blogs at http://theagileadmin.com/.

15. Gareth Rushgrove

Gareth Rushgrove is a software developer at Puppet Labs with a strong sysadmin background. Previously at Government Digital Service, UK. Curator of DevOps Weekly. Blogs at http://www.morethanseven.net/.

16. James Turnbull

James Turnbull is CTO at Kickstarter and advisor at Docker. Author of seven (yes, seven) technical books, including The Docker Book and a fantastic introductory book on monitoring (The Art of Monitoring). He is on the program committee for OSCON and speaks regularly at several conferences. He blogs at http://www.kartar.net/.

17. Jason Dixon

Jason Dixon is director of integrations at Librato. Previously at Dyn, GitHub, Heroku, and Circonus, he speaks on monitoring and visualizations of monitoring data at conferences like Monitorama. Author of Monitoring with Graphite (O'Reilly Media). Blogs at http://obfuscurity.com/.

18. Jeffrey Snover

Oh, man, where to start with Jeffrey Snover? He's a Microsoft technical fellow. Lead architect for Enterprise Cloud Group. PowerShell architect and inventor. Prior to Microsoft, he was at Tivoli. Awarded over 40 patents and the 2012 Outstanding Achievement in System Administration award. Speaks at various conferences and events on automation and ops in Windows environments. Blogs at http://www.jsnover.com/blog/.

19. Jennifer Davis

Sparkly DevOps princess Jennifer Davis is an engineer at Chef. Previously senior production systems engineer at Yahoo. Speaker at various conferences and events, and involved in meetups. Agile Conference DevOps Track chair. Co-author of Effective DevOps, with Katherine Daniels.

20. John Allspaw

John Allspaw is CTO at Etsy. Previously at Flickr, Friendster, InfoWorld, and others. Expert on heuristics used in troubleshooting highly complex, scalable systems. Gave the seminal talk on DevOps with Paul Hammond at Velocity 2009, titled "10+ Deploys per Day: Dev and Ops Cooperation at Flickr." Author of Web Operations (with Jesse Robbins) and Capacity Planning. Co-chair of Velocity Conferences. Blogs at http://www.kitchensoap.com/.

21. John Willis

John Willis is evangelist at Docker, previously VP of solutions for Socketplane (acquired by Docker) and Enstratius (acquired by Dell), and VP of training for Opscode (Chef). Founder and chief architect of Chain Bridge Systems. Authored six IBM Redbooks on enterprise systems management. Co-host of the DevOps Cafe podcast.

22. Jon Cowie

Jon Cowie is staff operations engineer at Etsy. Author of several tools to automate workflow and visualizations (including Oculus, Jawbone Up to Graphite, and several Chef Knife plugins), and the book Customizing Chef. Keeps track of things online at http://jonliv.es/.

23. Katherine Daniels

Katherine Daniels is web operations engineer at Etsy, where she does super cool things. Often purple-haired. Speaker at various conferences and events, and co-organizer of DevOpsDays NYC 2015. Co-host of The Ship Show podcast. Co-author of Effective DevOps, with Jennifer Davis. Blogs at http://beero.ps/.

24. Kelsey Hightower

Kelsey Hightower is staff developer advocate for Google Cloud Platform. Previously with CoreOS, New Relic, Monsoon Commerce, and Puppet Labs. Known for his work with Puppet, Kubernetes, and other open source systems, he is authoring Kubernetes: Up & Running.

25. Luke Kanies

Luke Kanies is CEO and founder of Puppet Labs, and author of Puppet. Involved in configuration management for years, contributing to several tools in addition to Puppet (such as Cfengine). Speaker at various conferences and events around the world. Board member of the Technology Association of Oregon. Blogs at http://madstop.com/.

26. Leslie Carr

Leslie Carr is taking a well-deserved break. Previously DevOps engineer at Cumulus Networks, Wikimedia Foundation, Twitter, Craigslist. Network engineer who speaks at various conferences about automation. Blogs (blogged?) at https://cumulusnetworks.com/blog/author/leslie-carr/.

27. Mandi Walls

Mandi Walls is an engineer and consultant at Chef. Previously at Admeld, Intent Media, and AOL. Wrote Building a DevOps Culture. Speaker at several conferences and events around the world.

28. Mark Burgess

Mark Burgess is considered by many the father of configuration management. He is also the creator of Cfengine and Promise Theory. Emeritus professor of computing at Oslo University College. Received the 2003 SAGE Professional Contribution Award "For groundbreaking work in systems administration theory and individual contributions to the field." Author of several academic articles, as well as the books In Search of Certainty, Thinking in Promises, and Handbook of Network and System Administration (among others). Blogs at http://markburgess.org/blog.html.

29. Mark Imbriaco

Mark Imbriaco is co-founder and CEO at Operable, Inc. Previously VP of TechOps at DigitalOcean, ops at GitHub (a place without hierarchy or managers), VP of technical operations at LivingSocial, and other ___ operations at Salesforce, Heroku, 37signals, AOL, and other companies. The man knows his ops, architectures, and resiliency, and even team-building, mentoring, and culture.

30. Matt Simmons

Matt Simmons is Linux system administrator at SpaceX, previously at Northeastern University's College of Computer and Information Science. Known for representing the mid-tier crowd, and a known and loved blogger. (Case in point: Selected to blog at NASA for its summer launch.) Chairing USENIX LISA next year. Blogs at http://www.standalone-sysadmin.com/blog/.

31. Michael Nygard

Michael Nygard is VP at Cognitect, the company that makes Datomic and Clojure. Prior to Cognitect, he was at Relevance (merged into Cognitect), N6 Consulting, and Verizon. Author of Release It! Design and Deploy Production-Ready Software and contributor to Beautiful Architecture (O'Reilly Media). Blogs at http://www.michaelnygard.com/.

32. Michael Rembetsy

Michael Rembetsy is VP of technical operations at Etsy, currently on sabbatical. Previously at Corsis Technology and LDD. Speaker at several conferences and events, known for encouraging his staff to speak and blog themselves.

33. Mike Fiedler

Mike Fiedler is director of technical operations at Datadog. Previously at Magnetic, 10gen, and Wireless Generation. Speaks at conferences, and blogs on Datadog's site at https://www.datadoghq.com/blog/author/michael/. Also blogs at http://www.miketheman.net/.

34. Nigel Kersten

Nigel Kersten is CIO at Puppet Labs. Previously at Google, where he designed and implemented the largest Puppet deployment in the world.

35. Pat Cable

Pat Cable is currently in a research role at Lincoln Secure Resilient Systems and Technologies Group at MIT, with prior experience at MIT with network design, simulation, and management. Current work focuses on visibility of cloud computing and its intersection with resilient system and network design. Involved in USENIX LISA and Velocity communities. Occasionally blogs at http://www.pcable.net/.

36. Pete Cheslock

Pete Cheslock is currently senior director of operations and support at Threat Stack. (Honorable mention: Pete Chesbot.) Previously at Dyn, Sonian, and McGladrey. Speaks and writes about systems, monitoring, and security in ops. Co-host of The Ship Show podcast. Blogs at https://pete.wtf/.

37. Seth Vargo

Seth Vargo is currently a jack of all trades at HashiCorp. Previously at Chef and CustomInk. Part developer, part evangelist, his skills and interests include a solid background in system administration, automation, security, and usability. Wrote Learning Chef. Occasionally blogs at https://sethvargo.com/.

38. Theo Schlossnagle

Theo Schlossnagle is CEO of Circonus and founder of Message Systems, Fontdeck, and OmniTI. On the advisory board for ACM Queue. An expert in distributed systems, monitoring, and data analysis. Currently on sabbatical touring the world. A highly sought-after speaker, his talks about how to analyze, operationalize, and visualize the operations and monitoring space are among the best and often worth multiple views.

39. Thomas Uphill

Thomas Uphill is senior Puppet developer at Wells Fargo, with previous experience running large infrastructures at Costco. A regular speaker and trainer at USENIX LISA on topics like Puppet and Red Hat. Wrote Mastering Puppet. Blogs at http://ramblings.narrabilis.com/.

40. Tom Limoncelli

Tom Limoncelli wrote the book on system administration, literally. First, it was The Practice of System and Network Administration, then The Practice of Cloud System Administration. Also wrote Time Management for System Administrators (among others). Currently site reliability engineer at Stack Exchange; previously at Google and Bell Labs. Regular speaker at national and international conferences. Blogs at http://everythingsysadmin.com/.

And last but certainly not least ... we get a Plus 1:

41. Taylor Swift, a.k.a. SecuriTay

SecuriTay preaches about security and computers in hilarious and often too-true ways. Oh, how we love her (?), even if we don't want to admit it.

Is there someone I have missed? Please let us know in the comments section below who else we should be following.

