
The Essence of Platform Engineering

  • omar s
  • Oct 10, 2024
  • 5 min read

Introduction

Platform engineering is a term that many people misunderstand. A lot of folks can't tell the difference between a platform engineer and a DevOps engineer. After spending more than eight years building platforms for enterprises and startups, I think I've developed a natural understanding of what a platform should be. Let me share my simple definition of a platform.

Most of us have heard the expression, "You can't have your cake and eat it too." Well, that isn't always true. For me, having a platform means you get to have the cake and still eat it.


The Journey to Platform Engineering

Let me elaborate by taking you through some recent history. Before the DevOps movement, companies had dedicated teams managing networking, infrastructure (often in data centers), and occasionally security. These teams operated in silos, each working through a ticketing system to manually fulfill requests from development teams. This setup led to friction, delays, and even paralysis.


The Rise of the Cloud

Then came the cloud, with all its self-service capabilities. Developers felt liberated—no longer needing these other teams. With management's blessing (under the guise of moving faster), each dev team began clicking their way out of traditional infrastructure into the vast and colorful world of the cloud. "Clicking their way out" is quite literal, as most teams used the cloud UI to create their infrastructure without records, version control, cost control, or access control—essentially without any control. In many cases, an engineer could spin up a VM and start mining cryptocurrency in a different cloud region, and no one would ever notice. This lack of governance led to enormous financial costs, security breaches, and a completely unsupportable, fragmented infrastructure across organizations.

This is where they had the cake but lost it. The "cake" was centralized infrastructure governance, and what was lost included security, cost control, cohesion, and sanity.


Enter Kubernetes

Fortunately, Google (in an effort to challenge AWS) brought us Kubernetes. With Kubernetes came an ever-expanding universe of free and commercial products, tools, plugins, add-ons, and sometimes even just a bunch of bash scripts to meet every need related to Kubernetes.

Everyone who had to automate anything in the cloud instantly fell in love with Kubernetes. It was like a "mini cloud," and since it was driven by APIs, everyone was forced to learn and understand it. Without a UI to click around, people grew tired of typing and began writing scripts. Scripts turned into tools (mostly in Go), and soon everyone was obsessed with showing off their latest tool that made their lives easier. We wrote integration after integration, only to realize that someone else had written the same thing, and now it was an open-source project we could install with a single "helm" command.


The Beginning of Platform Engineering

We moved on to using Helm charts, which allowed us to deploy and update complex distributed applications with a single click, and operators that automatically managed day-two concerns like zero-downtime upgrades, backups, disaster recovery, and auto-scaling. In a few minutes, you could have a complete infrastructure of complex systems, which, to be honest, we didn't fully understand—but since it seemed to work, we were satisfied.
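
To make the operator part concrete, here is a minimal sketch of what such a declarative resource can look like, using the CloudNativePG (CNPG) operator that appears in the stack listed later in this post. The cluster name, storage size, and backup bucket are hypothetical, the operator must already be installed, and credentials are omitted for brevity.

# Minimal sketch; assumes the CloudNativePG operator is installed. Names and bucket are hypothetical.
# The operator reads this spec and takes care of day-two work: replication, failover, and backups.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: orders-db                # hypothetical cluster name
spec:
  instances: 3                   # the operator keeps three replicas running and handles failover
  storage:
    size: 20Gi
  backup:
    barmanObjectStore:           # object-store credentials omitted for brevity
      destinationPath: s3://example-backups/orders-db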

At that time, platform engineers spent most of their time trying to get the basics together—some level of security, access control, and automation—but everything was just a "some sort of" solution, lacking depth. Without really being ready, we forced teams onto our shiny new platform and told them, "Hey, this is Kubernetes, and this is kubectl—good luck!" Unfortunately, unlike us, who had spent the last couple of years poking every hole in Kubernetes, developers were terrified (with good reason) of the amount they had to learn to use this new system: deployments, services, disks, configurations, secrets, network rules, security profiles, logs, metrics, traces, dashboards, alerts—the list went on and on.

We knew it was difficult, and we knew there was a better way—by building simple self-service abstractions around all those concerns. But at the time, building those abstractions was complex and time-consuming, and we still didn't have our own setup fully figured out. So, developers struggled and eventually came to hate it.


The Evolution of Platform Engineering

For most companies, management thought this was already a great place to be, as they had regained the missing infrastructure governance. However, the teams were not moving faster—certainly not faster than before using the cloud. Additionally, since so many things were left for developers to figure out, we still had a fragmented (though less so) internal Kubernetes infrastructure.

Almost nine years have passed since my first KubeCon in Berlin. Since then, the Kubernetes ecosystem has grown into its full potential. Most enterprises now use Kubernetes for production workloads, and, amazingly, most of the tools are open-source and free to use. We now have amazing tools, mainly operators, that deal with monitoring, network security, policies and compliance, data services (like databases), cloud integrations, scaling, and more. With that, we finally have the right toolset to build complete, simple-to-use self-service platforms.

Developers can now simply declare their resource needs, and everything, from the GitHub repository and security policies to data services and monitoring dashboards, can be provisioned automatically with no human intervention, while all the required controls are applied at the same time.
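
As an illustration of what "declaring resource needs" can look like, here is a hypothetical developer-facing claim in the Crossplane style (Crossplane is part of our stack below). The API group, kind, and fields are invented for this sketch; in a real platform the team defines them through a Crossplane CompositeResourceDefinition and Composition, which then provision the repository, database, dashboards, and policies behind the scenes.

# Hypothetical developer-facing claim; the group, kind, and fields are illustrative only.
# The platform team defines the matching Crossplane XRD and Composition behind it.
apiVersion: platform.example.org/v1alpha1
kind: WebServiceClaim
metadata:
  name: checkout-service
  namespace: team-payments
spec:
  parameters:
    language: go              # used to pick a repository template and CI pipeline
    replicas: 3               # desired capacity; autoscaling limits come from policy
    database:
      engine: postgresql      # provisions a managed PostgreSQL cluster behind the scenes
      size: small
    exposure: internal        # network policies and ingress are derived from this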

Development teams can now focus on building features and services needed by the business, without worrying about infrastructure setup, security, or monitoring. The platform abstracts away this complexity while ensuring that critical non-functional requirements—like security, capacity management, and monitoring—are consistently handled across the board.
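
One concrete way those requirements stay consistent is policy-as-code. As a small sketch, the OPA Gatekeeper constraint below (assuming the K8sRequiredLabels ConstraintTemplate from the open-source gatekeeper-library is installed) rejects any namespace created without a team label, so ownership and cost attribution are enforced everywhere rather than left to each team.

# Sketch of an OPA Gatekeeper constraint; assumes the K8sRequiredLabels
# ConstraintTemplate from the gatekeeper-library is already installed.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: namespaces-must-have-team-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels:
      - key: team             # every namespace must declare an owning team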

This is what I call "having your cake and eating it too."


The Kubenoops Tech Stack

To show this isn't just theory or an article generated by AI, I'll share our tech stack at Kubenoops (the company I founded). This will give you the ingredients; as for the recipes, I will be sharing them regularly. My team at Kubenoops and I will be writing posts about self-service, security, compliance, monitoring, and making applications (which is what businesses really care about) production-ready (a state for which I have my own definition, too).


Our Stack

  • Platform: Kubernetes, EKS, AWS, GCP, Bare Metal, OPA Gatekeeper, KEDA autoscaler, Cluster Autoscaler, Node Termination Handler.

  • Infrastructure as Code: Terraform, Terragrunt, Crossplane, Helm.

  • Monitoring: Prometheus Operator, Grafana, Mimir/Cortex, Loki, Logging operator, Fluentd/Fluent Bit.

  • DevSecOps: Cilium, CrowdSec, Harbor, Trivy, Vault, Snyk, SUSE NeuVector.

  • Data Management Services: Kafka (Strimzi Operator), PostgreSQL (CNPG Operator).


Conclusion

When I founded Kubenoops, my vision was to empower small and medium-sized companies with platforms like the one I've explained in this article, allowing them to enjoy enterprise-grade features without the enterprise price tag.

The journey from manually managing infrastructure to creating a fully automated platform has been challenging, but the rewards are immense. Platform engineering is about building a solid foundation that empowers development teams to innovate and create without being overwhelmed by infrastructure complexities, while still fulfilling business requirements for security, cost control, and compliance.


 
 
 
