With the task of software development becoming increasingly complex, it may be time for dev and ops specialists to part ways again. But is it possible without repeating the mistakes of the past?
Devops emerged alongside agile methodologies and cloud computing in the late 2000s, when software started eating the world. A neat portmanteau of “development” and “operations,” devops attempted to bring together the two previously separate groups responsible for building and deploying software. It also coincided with, or perhaps even unintentionally hastened, the need for software engineers to tighten their user feedback loops and push updates to production more often.
While many organizations seized this opportunity to bring together two sets of specialists to solve common problems at previously impossible speeds, others took the emergence of devops as a license for developers to take responsibility for operational tasks and attempted to build a super team of semi-mythical full stack developers.
“For the most part, developers don’t want to concern themselves with operational concerns,” tweeted Devops For Dummies author and head of community engagement at Amazon Web Services, Emily Freeman.
Freeman clearly struck a chord, with hundreds of comments pouring in from developers who didn’t want to do ops either.
“I’m a developer and I don’t want to deal with operational concerns,” replied Scott Pantall, a software engineer at the fast food company Chipotle.
“Devs and ops should work closely with differentiated roles. The empathy between teams is the real point,” commented Andrew Gracey, a developer evangelist at SUSE.
While the concept of moving more operational and security issues to the ‘left’ and into the domain of software developers clearly has its advantages, it also has the potential to create a dangerous bottleneck.
“If you pull developers into too many different areas, you’re shooting yourself in the foot. It’s different skills,” James Brown, product lead at Kubernetes storage specialist Ondat, told InfoWorld.
Or as Nick Durkin, field CTO at Harness, put it, “People are starting to realize we wouldn’t hire an electrician to do our plumbing.”
A ‘huge’ increase in responsibilities
While the number of enterprise software developers has never been greater, specialists in technical operations have faded somewhat into the background, even as their workload has increased.
As devops engineer and former systems administrator Mathew Duggan wrote last year, while operators “still had all the responsibilities we had before, to make sure the application was available, monitored, secure and compliant,” they were also tasked with building and maintaining software delivery pipelines, to “lay the groundwork for enabling development to release code quickly and securely without our involvement.”
These growing responsibilities brought about a massive retraining effort, with cloud engineering and infrastructure-as-code skills becoming paramount.
“In my opinion, the situation has never been so bleak,” Duggan wrote. “Development has been completely overwhelmed with a massive increase in the scope of their responsibilities (RIP QA), but also with unrealistic management expectations regarding speed.”
That pressure may be starting to tell.
“It’s incredibly challenging to build an organization that achieves this level of iterative harmony that lasts for a sustained period of time,” Tyler Jewell, managing director at Dell Technologies Capital, wrote in a research note. “As systems become more complex and end-user feedback increases, it becomes increasingly difficult for a human to reason about the impact a change could have on the system.”
Recognizing the problem
The situation may not be as hopeless as Duggan and others think, although it may require a significant reshuffling of technical teams and their responsibilities.
“It’s not about burdening the developer, it’s about providing developers with the right information at the right time,” Harness’s Durkin said. “They don’t want to configure everything, but they do want the information from those systems at the right time so that operations, security and infrastructure teams can run smoothly. Developers shouldn’t care unless something breaks.”
Nigel Simpson, ex-director of corporate technology strategy at the Walt Disney Company, wants companies to “recognize this problem and work on it to make sure developers don’t have to worry about how the machines work — and go back to building software, that’s what they do best.”
It is important to remember that devops is a continuum and its implementation will vary from organization to organization. Just because developers can do some ops now doesn’t mean they always should.
“Developer control over infrastructure is not an all-or-nothing proposition,” wrote Gartner analyst Lydia Leong. “Responsibility can be split across the application lifecycle so you can take advantage of ‘you build it, you run it’ without necessarily dropping your developers into an untamed and uncharted wilderness and wishing them luck in survival because it’s ‘not an infrastructure and operations team problem’ anymore.”
In other words, “It’s fine to give your developers full self-service access to development and test environments, and the ability to build infrastructure-as-code templates for production, without making them fully responsible for production,” Leong wrote.
As Ondat’s Brown sees it, container orchestration with Kubernetes is emerging as the layer between these two teams, separating concerns so developers can focus on their code, and operations can ensure the underlying infrastructure and pipelines are optimized to run it. “Let’s not rewind to those teams that don’t talk to each other,” Brown said.
According to VMware’s “State of Kubernetes in 2022” report, 54% of 776 respondents said better developer efficiency was a major reason for adopting Kubernetes, and more than a third (37%) said they wanted to improve operator efficiency.
“Don’t fall into the misconception of trying to make everyone an expert,” Kaspar von Grunberg, founder of Humanitec, wrote in his email newsletter. “In high-performing teams, there are few high-profile experts on Kubernetes, and there is a high level of abstraction to keep the cognitive load on everyone else low.”
Devops is dead
If the era of devops is indeed coming to an end, or even if its shine is just starting to wear off, what’s next?
Site reliability engineering (SRE), which grew out of Google as it suffered from its own devops-related growing pains, has proven to be a popular solution.
“Basically, this is what happens when you ask a software engineer to design an operations function,” Ben Treynor, vice president of engineering at Google and the godfather of SRE, often says.
Take two major financial institutions, Vanguard and Morgan Stanley, both of which have found it difficult to balance dev and ops responsibilities as they move to more cloud-native practices.
By implementing an SRE safety net at both the central operational level and within individual developer teams, both companies have built the confidence that they will strike the right balance between developer speed and operational stability.
However, the SRE function has also been criticized. Establishing SRE principles is “sometimes misunderstood as an ops team rebrand,” noted Trevor Brosnan, head of devops and enterprise technology architecture at Morgan Stanley.
“It’s a nuanced problem to solve,” said Christina Yakomin, a site reliability engineer at Vanguard. “Introducing SRE makes people feel like we’re locking operations back into that role.” Instead, Yakomin wants to encourage Vanguard developers and operations specialists to share responsibility for security and ensure that teams with shared platforms take full operational responsibility for them.
Long live platform engineering
The idea of the in-house developer platform, or the discipline of platform engineering, has also emerged as a way for organizations to give developers the tools they need, complete with the right organizational guardrails, to do their best work.
An in-house developer platform typically consists of the APIs, tools, services, knowledge, and support developers need to get their code into production, combined into an enterprise-standard platform maintained by a dedicated team of specialists or product owners.
“Devops is dead, long live platform engineering,” tweeted software engineer and devops commentator Sid Palas. “Developers don’t like infrastructure, companies need control over their infrastructure as they grow. Platform engineering allows these two facts to coexist.”
Brandon Byars, chief of technology at software consulting firm Thoughtworks, says he often “sees that division working well in platform engineering teams, looking to take the friction out for developers while giving them the buttons to turn.” However, he adds, “Where it doesn’t work well is asking developers to do all that work without centralized expertise and tooling support.”
The balancing act between software development and operations teams will be familiar to any organization that has worked to implement devops principles in its engineering teams. It’s also a balancing act that is becoming increasingly difficult in the cloud-native era.
Copyright © 2022 IDG Communications, Inc.