In my last blog exploring the three architectural layers of platforms, I argued that an effective platform must support an organisation in three ways:
Go faster: Platform teams need to provide “everything as a service” to help rapidly and sustainably deliver value to end-users
Decrease risk: Teams need to automate key business processes in reusable components
Increase efficiency: You need to manage and scale your digital platform and resources as a fleet
Another way of saying this is that a platform must enable you to deliver software that adds value to your customers with speed, safety, and at scale. This may appear obvious at first glance, but my experience is that it’s all too easy to forget one of these pillars when designing a platform – or to get sold on a solution that misses one.
An interesting thought experiment is to imagine what a platform that doesn’t support one of speed, safety, or scalability might look like. After much debate in the Syntasso team, we believe the result is building a platform that can be “slow, low (level) or just for show”…

Common platform engineering pitfalls: Tickets, Raw Cloud/K8s, and Templates-as-a-Service
Several common antipatterns hinder speed, safety, and scalability in platform engineering. Three major pitfalls are:
Ticket Systems: Reliance on ticket-based workflows often leads to bottlenecks and delays, reducing developer autonomy and slowing value delivery.
Raw Cloud/Kubernetes Access: Providing unrestricted access increases risks related to compliance, security, and maintainability, leading to unsafe practices.
Templates-as-a-Service: Over-reliance on rigid templates dispensed via portals and pipelines can limit flexibility and lead to platform decay as teams struggle to adapt templates to evolving needs.
Put another (punchier) way: tickets are slow, raw resources are too low-level, and templates-via-portals are great to demo but often end up “just for show.” Everyone in the business suffers. Leadership is angry that features aren’t being shipped fast enough, developers are annoyed at the infrastructure and compliance plumbing, and operators struggle with the day-two experience of maintaining and upgrading the platform components.
Recognising these antipatterns is essential to building platforms that empower developers, operators, and, critically, everyone else in your organisation to align with business goals. After all, platform engineering is a multiplayer game.
Ticket Systems: More haste, less speed
Ticket systems are commonly used to provision, change, or upgrade platform resources. Enterprises implement ticket systems for platform engineering to manage complexity, ensure compliance, and maintain control over critical workflows. However, the tradeoff is that developers lose autonomy and the ability to deploy software rapidly and deliver value to end-users.
An all too familiar story: It doesn’t exist until it’s in Jira
Need to provision a new environment to test the deployment of a new app that is sure to deliver business value? File a Jira ticket and check back in a month. Want to upgrade Log4j across your apps to address a zero-day issue? Raise a Zendesk ticket, and we’ll happily reply in a week and ask lots of questions about version compatibility. Need a database configuration change to keep up with Black Friday demand? Add something to ServiceNow, and we’ll email you when it’s fixed.
Developers really want a self-service platform. They want to call APIs, not raise tickets or send emails. They also want a clear contract. Knowing the platform service’s parameters and default configuration saves all the back-and-forth of ticket systems. Developers also want versioned and supported systems. I may need to upgrade my databases to support Black Friday traffic and accept the tradeoffs this brings, but another team may want to stick with their current database version.
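To make the idea of a clear, versioned contract concrete, here is a minimal Python sketch. It is illustrative only: the DatabaseRequest type, its field names, the default values, and the supported version list are all assumptions made up for this example, not the API of any particular platform or product.

```python
from dataclasses import dataclass, asdict

# Hypothetical contract for a self-service database offering.
# Versions, sizes, and defaults are illustrative assumptions.
SUPPORTED_VERSIONS = {"14", "15", "16"}
SUPPORTED_SIZES = {"small", "medium", "large"}


@dataclass(frozen=True)
class DatabaseRequest:
    team: str
    name: str
    version: str = "15"          # platform default; teams opt in to upgrades
    size: str = "small"
    backups_enabled: bool = True

    def __post_init__(self):
        # Fail fast with a clear message instead of a ticket back-and-forth.
        if self.version not in SUPPORTED_VERSIONS:
            raise ValueError(
                f"unsupported version {self.version!r}; "
                f"choose one of {sorted(SUPPORTED_VERSIONS)}"
            )
        if self.size not in SUPPORTED_SIZES:
            raise ValueError(f"unknown size {self.size!r}")


def submit(request: DatabaseRequest) -> dict:
    """Stand-in for an API call: returns the payload the platform would receive."""
    return {"kind": "database", "spec": asdict(request)}


# One team upgrades for Black Friday; another stays on the default version.
checkout = submit(DatabaseRequest(team="checkout", name="orders", version="16", size="large"))
search = submit(DatabaseRequest(team="search", name="index"))
```

The point of the sketch is that the parameters, defaults, and supported versions are visible up front, so each team can make its own trade-off without asking anyone.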
Speed: What does good look like for access to platform services?
What are the signs that this is hampering your organisation?
Entrepreneurial developers will typically attempt to overcome ticketing system limitations by embracing shadow IT to keep delivering at pace. The challenge is that the enterprise loses visibility into these platform components, and organisational policy and compliance are not applied uniformly. Other developers will attempt to game the system by raising speculative tickets, such as requesting environments before a project is greenlit. Others still will resign themselves to moving slowly (“that’s just how things are done here”).
Raw Cloud and Kubernetes: Unsafe abstractions
Enterprises sometimes give developers full cloud account access or admin-level control of a raw Kubernetes cluster when they deploy software to promote agility and reduce bottlenecks. However, this approach comes with trade-offs, chiefly regarding ensuring consistent application of the organisation’s safety and compliance goals.
Swapping DR/BC for YOLO
Need to ensure that all customer data is encrypted at rest and easily accessible in a disaster recovery/business continuity (DR/BC) scenario? Good luck figuring out if all of your credit card-powered as-a-service offerings are doing this. Want to prove to the auditors that all services comply with PCI DSS requirements? I’m sure the Kubernetes cluster that was provisioned on a third-party K8s-aaS last year will be easy to verify (less YAML, more YOLO). Want to ensure that all cloud accounts follow standardised data access policies? It will take weeks to verify the thousand accounts that have bloomed manually.
Platform teams want to provide services that are safe and compliant by default. They also want to provide services that are appropriately abstracted and customised to their organisation’s requirements. To do this, platform teams must act more like “platform groups” and enable all stakeholders in the organisation to contribute to the platform services.
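One way to picture “safe and compliant by default” is a provisioning step that starts from safe defaults and runs every request through policies contributed by different stakeholders. The Python sketch below is a simplified illustration under that assumption: the policy functions, the spec keys, and the provision helper are hypothetical names invented for this example.

```python
from typing import Callable

# A policy takes a resource spec and returns a list of violations
# (an empty list means the spec is compliant).
Policy = Callable[[dict], list[str]]


def require_encryption_at_rest(spec: dict) -> list[str]:
    # Hypothetical rule, imagined as owned by the security team.
    return [] if spec.get("encrypted_at_rest") else ["storage must be encrypted at rest"]


def require_backups(spec: dict) -> list[str]:
    # Hypothetical rule, imagined as owned by the DR/BC team.
    return [] if spec.get("backups_enabled") else ["backups must be enabled"]


# Each stakeholder group contributes its own policy to the shared list.
POLICIES: list[Policy] = [require_encryption_at_rest, require_backups]

SAFE_DEFAULTS = {"encrypted_at_rest": True, "backups_enabled": True}


def provision(requested: dict) -> dict:
    """Merge the request over safe defaults, then reject anything non-compliant."""
    spec = {**SAFE_DEFAULTS, **requested}
    violations = [v for policy in POLICIES for v in policy(spec)]
    if violations:
        raise ValueError("; ".join(violations))
    return spec
```

A developer who asks only for a name gets an encrypted, backed-up resource without thinking about it, while a request that explicitly turns encryption off is rejected before anything is created.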
Safety: What does good look like for safely consuming platform services?
What are the signs that this is hampering your organisation?
Security breaches often become more frequent due to misconfigured resources, leading to vulnerabilities. Compliance failures can emerge as teams struggle to pass audits because policies are applied inconsistently. Operational overhead increases as teams spend excessive time troubleshooting unique configurations. Shadow IT grows when developers bypass official platforms to gain flexibility, resulting in unmanaged resources. Finally, incident frequency rises, leading to more outages or degraded services due to mismanaged infrastructure.
Templates-as-a-Service: Unable to scale past day one
Templates-as-a-Service is intended to accelerate development by providing pre-built scaffolding and configurations for common patterns. This approach is often combined with a portal (and related plugin), for easy access or vending, and a continuous delivery pipeline to ensure the service can be redeployed easily. However, over-reliance can lead to several issues.
“One-size-fits-all” issues arise when teams struggle to adapt rigid templates to specific needs, leading to workarounds or shadow IT. With unclear ownership of portal plugins past day two, updating services at scale becomes impossible, as any link between the initial templated code and what is running now has vanished. The portal acts in an analogous way to a dashboard in a car, but this is only useful when paired with an engine; a portal without an orchestration engine can act simply as a facade over infrastructure or pipeline API calls.
Building on this, we often see pipelines turn into snowflakes, which in turn deploy snowflake services. Over time, this causes platform decay as templates drift from actual usage patterns, degrading platform coherence and increasing operational complexity.
Scalability: Fleet management is not just for shipping and cars
Want to modify a template and update all previously initialised services that used it? The best-case solution is searching for template patterns in your version control system and making pull requests against all the current snowflake systems (now unique and hand-crafted). Need to roll out a security upgrade across your entire IT estate? Good luck orchestrating this without a lot of manual effort.
Platform teams want to provide services that are scalable and fleet-manageable by default. They recognise that platform services are not produced like washing machines rolling off an assembly line. Instead, they are more like modern mobile phones that accept over-the-air updates that can impact software and hardware.
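The difference from one-shot templating is that the platform keeps a record of which running instance came from which version of which service definition. A minimal Python sketch of that idea follows. The Instance record and the plan_fleet_upgrade function are hypothetical names for illustration, and a real platform would reconcile actual infrastructure rather than update an in-memory list.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instance:
    name: str
    template: str          # which service definition this instance came from
    template_version: int  # which version of that definition it currently runs


def plan_fleet_upgrade(fleet, template, target_version):
    """Return the fleet with every older instance of `template` moved to
    `target_version`, plus the names of the instances that changed."""
    upgraded, changed = [], []
    for inst in fleet:
        if inst.template == template and inst.template_version < target_version:
            upgraded.append(replace(inst, template_version=target_version))
            changed.append(inst.name)
        else:
            upgraded.append(inst)
    return upgraded, changed


fleet = [
    Instance("orders-db", "postgres", 1),
    Instance("search-db", "postgres", 2),
    Instance("events", "kafka", 1),
]
new_fleet, changed = plan_fleet_upgrade(fleet, "postgres", 2)
```

Because the link between definition and instance is retained, a security upgrade becomes one operation over the fleet instead of a hunt through repositories for hand-edited copies.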
Scalability: What does good look like for scaling the management of platform services?
What are the signs that this is hampering your organisation?
Scalability challenges often manifest as inconsistent environments, where frequent bugs arise due to differences between development, staging, and production. Operational overload is common, with platform teams bogged down managing unique deployments instead of focusing on innovation. Delayed time-to-market becomes a reality when lengthy provisioning times and manual processes slow releases. Finally, burnout among platform engineers becomes a growing concern as they face constant firefighting rather than strategic improvements.
Platform speed, safety, and scalability are possible!
At the start of the article, we talked about running a thought experiment to imagine what a platform that doesn’t support all of speed, safety, or scalability might look like. When we’ve done this with teams, the discussion can lead to the conclusion that building a platform that supports all three properties is nearly impossible. Some folks have proposed that building a platform follows the project management “iron triangle” dilemma of “Good, fast, cheap. Choose two.” However, following in the footsteps of the DORA team (responsible for the State of DevOps report), we know it’s possible to “have your cake and eat it too,” in that an effective platform can support speed, safety, and scalability.
We’ve baked our experiences into the Kratix framework. Please contact us to learn how Syntasso Kratix Enterprise (SKE) can help your organisation build a platform that supports speed, safety, and scalability.