Is there a paved road toward cloud native resiliency?
What does maturity mean in the world of software? Especially in the cloud-native world, where progress is extraordinarily fast, new frameworks and projects appear all the time, and developers, engineers, and leaders are constantly asked to learn the next new thing?
The CTO Summit took up this topic at KubeCon + CloudNativeCon North America 2022, just a stone’s throw from the Detroit River. This tight-knit group of technology leaders comes from some of cloud native’s biggest end users. A special thank-you goes to event sponsor Uptycs, which helps with container registry scanning and other security features.
The CTO Summit focused on the intersection between provisioning and the Cloud Native Maturity Model; leaders openly discussed the challenges, cultural shifts, and technical solutions they’ve seen on their journey to improve the maturity and resiliency of their cloud-native infrastructure.
The goal of the CTO Summit was to learn from each other and uncover new insights to share with the entire cloud-native community — which was the spirit of the entire KubeCon conference, as stated by Priyanka Sharma, the executive director of the Cloud Native Computing Foundation, in her opening remarks.
“We’ve started this CTO Summit where end users, like Fidelity and Intuit, and many others can come together and have a private conversation, in a secure space, about how they are handling these challenges in their organizations,” she said. “We all have each other to learn from.”
What is the Cloud Native Maturity Model?
A project from the Cartografos Working Group, with contributing members from Fairwinds, Stackegy, and Accenture, the Cloud Native Maturity Model provides a framework for organizations working with cloud-native software from inception to full adoption of the CNCF Landscape.
The Model includes different criteria for People, Policy, Technology, and Business Outcomes, with tangible goals to meet before an organization, or even a small team, can consider itself mature enough to move on to the next stage. The stages are:
- Build: A baseline cloud-native implementation is in place.
- Operate: A cloud-native foundation is established, and you’re moving toward production.
- Scale: A production environment is running, and you’re defining processes for scale.
- Improve: You’re improving security, policy, and governance across the environment.
- Optimize: Decisions in earlier stages are being revisited and improved based on application and infrastructure monitoring.
The Model’s goal is to simplify the scope of cloud-native improvements and standardize the KPIs technical teams should shoot for on their path. When it works, it works remarkably well.
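As a rough illustration of how the stages and criteria above fit together, here is a minimal Python sketch. The stage names and the four criteria come from the description above; the `overall_stage` helper and its "least mature area wins" rule are assumptions made for this example, not part of the Model itself.

```python
from enum import IntEnum


class MaturityStage(IntEnum):
    """The five stages listed above, ordered from earliest to latest."""
    BUILD = 1
    OPERATE = 2
    SCALE = 3
    IMPROVE = 4
    OPTIMIZE = 5


# The four criteria named in the article.
AREAS = ("people", "policy", "technology", "business_outcomes")


def overall_stage(area_stages):
    """Return the lowest stage across all areas.

    Assumption for illustration only: a team is treated as being at the
    stage of its least mature area.
    """
    missing = [area for area in AREAS if area not in area_stages]
    if missing:
        raise ValueError(f"missing assessment for: {', '.join(missing)}")
    return min(area_stages[area] for area in AREAS)


# Hypothetical assessment of a single team.
team = {
    "people": MaturityStage.SCALE,
    "policy": MaturityStage.OPERATE,
    "technology": MaturityStage.IMPROVE,
    "business_outcomes": MaturityStage.SCALE,
}
print(overall_stage(team).name)
```

Assessing each team or application separately, rather than the organization as a whole, would let different parts of the same infrastructure sit at different stages.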
Making progress, but challenges remain
During the CTO Summit panel, Pratik Wadher, senior vice president of product development at Intuit, said, “We started measuring [cloud-native maturity] using development velocity, which we define as what gets shipped to our customers. We measure every release. As we’ve been taking on this journey and moving all our capabilities [to cloud native], we’ve noticed a 6X increase in development velocity.”
Still, CTO Summit participants said other challenges continue to exist.
One participant noted that different applications, even within the same infrastructure or the same cluster, can operate at different maturity levels, and growth wasn’t linear.
Another participant elaborated on the varied cultural blockers that affected each maturity stage and how their team unlocked entirely unexpected levels of technical complexity once they reached the final optimization stage.
In the breakout sessions and subsequent panel discussion, the end-user participants also made the most of the “Chatham House Rule” to openly discuss their challenges and solutions on their unique cloud-native maturity journeys.
Tool choice: A ‘paved road’ from the top down, or a free-for-all?
Each participant has already faced (or will soon face) an important decision: Will they dictate which cloud-native tools their teams build with, or will they let developers and engineers develop with whatever works best for them?
In breakout rooms, participants bonded over the often-negative response to their “paved road” tool kits, which they’ve carefully crafted with long-term development, security, and governance in mind. Those who recently stepped into their role or organization faced pushback and claims of tech stack dictatorship.
But the Log4j vulnerability, which affected nearly every software company at the end of 2021, offered common ground for framing this discussion. One participant with a large cloud-native infrastructure, dozens of applications, and many governance headwinds allowed teams to build with either the paved road stack or an artisanal tool kit. When Log4j hit, their infrastructure teams, using the approved tools, patched and rebooted their clusters in 30 minutes. The application teams, on the other hand, needed 10 days to fix and improve their security posture in the middle of the holiday season.
It’s a powerful illustration of the importance of bumpers and waypoints on any organization’s journey into cloud native maturity.
Container registries: A ‘simple’ problem with many concerns
When one breakout room transitioned into talking about container registries, one participant quipped that there was “not much” to talk about.
Everyone agreed that carelessly pulling containers from public registries like Docker Hub or Artifact Hub carries too much vulnerability risk. But that participant’s technical solution wasn’t exactly received as a solved problem: a self-hosted Artifactory instance that runs security and policy scans every time their team adds a new container, whether developed in-house or sourced from a public registry.
A tech leader with lots of cloud-native maturity across their infrastructure warned others about the technical complexity of self-hosting a private container registry service. That organization considers its registry a Tier Zero service of mission-critical importance, which means a lot of thought and effort goes into its maturity stages. It’s surprisingly easy to DDoS your own container registry, and your cluster can’t schedule pods if the service crashes.
Running a registry at that tier creates entirely new and not-easily-solved problems all on its own.
These concerns are driving companies to push robust container registry security features like threat detection for the Kubernetes control plane and scanning container images in registries for vulnerabilities, malware, secret keys, and more.
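To make the scan-on-add idea from this discussion concrete, here is a minimal Python sketch of an admission gate for a private registry. Everything in it is hypothetical: the `ScanResult` fields, the `admit` function, and the thresholds are invented for illustration and do not reflect any participant's setup or the API of Artifactory or any scanner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    """Hypothetical summary a scanner might report for one image."""
    image: str
    critical_vulns: int
    high_vulns: int
    malware_found: bool
    secrets_found: bool


def admit(result, max_critical=0, max_high=5):
    """Decide whether an image may enter the private registry.

    Returns (allowed, reasons). The thresholds are assumed example
    values, not recommendations.
    """
    reasons = []
    if result.critical_vulns > max_critical:
        reasons.append(f"{result.critical_vulns} critical vulnerabilities")
    if result.high_vulns > max_high:
        reasons.append(f"{result.high_vulns} high vulnerabilities")
    if result.malware_found:
        reasons.append("malware detected")
    if result.secrets_found:
        reasons.append("embedded secret keys detected")
    return (not reasons, reasons)


# The same gate applies to in-house and public images alike.
clean = ScanResult("internal/app:1.4", 0, 2, False, False)
risky = ScanResult("public/base:latest", 1, 0, False, True)

print(admit(clean))
print(admit(risky))
```

Returning the reasons alongside the decision gives the team that pushed the image something actionable, which matters when the gate sits on a paved road that developers are expected to use.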
The spirited conversations within this year’s CTO Summit went well beyond paved roads and registries. They offered important reminders that every organization’s cloud-native maturity moves at its own pace, both as a whole and in its many constituent parts of people, applications, and cultures, and that there are many important lessons for the community to learn and codify.
While the cloud native landscape continues to grow with more tools, projects, and solutions for all these challenges, the CTO Summit’s participants valued community above almost all else — a willingness to share, learn from one another, and mature together. They did so not as competitors, but as peers, on an industrywide odyssey into the vast sea of cloud native.