Stack8s is a Kubernetes automation platform built around data sovereignty. It deploys vanilla Kubernetes clusters on hardware that customers own and control, whether that is a bare metal server in their own data centre, a private cloud, or any combination of the two. Customers link their own hardware into clusters, control which workloads run in which environment, and are never locked into a managed cloud provider's Kubernetes offering. The vision was right. The infrastructure holding it together was not. Eprecisio joined as the founding engineering partner, rebuilt the entire platform from the infrastructure layer up, and has been the core delivery team through the product's growth to its current position as a recognised player at KubeCon.

How the relationship started

The engagement did not start with a Kubernetes platform. It started with a healthcare project.

Ehtisham joined Dr. Jeremy Murray's team to work on a healthcare compliance project. The team was small, the stack was complex, and there were in-house challenges managing the infrastructure to the standard that healthcare compliance demands. Ehtisham stepped in individually to address those blockers.

That engagement built the trust that led to Stack8s. When Dr. Murray started building his vision for a Kubernetes automation product, Eprecisio was the partner he turned to. The relationship that started with one engineer on a healthcare project is now a team of 5 to 6 engineers working full-time on a product that is being presented at KubeCon.

Stack8s is a commercially ambitious product. Its differentiator is data sovereignty. Customers bring their own hardware, register it into the platform, and get production-grade Kubernetes without handing their workloads to a cloud provider's managed service. They control what runs where. The platform also ships a marketplace of plugins that teams can deploy directly into their clusters: AI Architect for AI workflow orchestration, Kubeflow for ML pipelines, Laravel stack integrations, and 100+ other open source tools. All of this is available through the Stack8s interface without the customer needing Kubernetes expertise. To deliver that experience credibly, the platform itself has to be faultless.

The state of the platform when active development began

Seven months ago, when the current active engagement began in earnest, the platform was failing repeatedly. Not occasionally. Continuously.

The core problem was that the architecture had accumulated instability at every layer. Networking was unreliable when customers connected their own hardware from different environments. State management did not exist in any meaningful form, so the platform had no consistent picture of what was running, what had failed, or what needed attention.

Area	State at the start	Impact on customers
Platform stability	Continuously failing, no root cause tracking	Customers could not trust clusters they provisioned
State management	No unified state layer	Node status, provisioning state, and cluster health were inconsistent across views
Networking	Unreliable when customers connected hardware from different environments	Workload connectivity failed silently when hardware was registered from mixed environments
GPU management	No operator-level control over GPU allocation	ML teams could not rely on GPU provisioning
Marketplace	Charts deployed inconsistently, no deployment framework	100+ open source charts had no reliable install path
Customer onboarding	Node registration and network ACL (NACL) creation unreliable	New customer setup required manual intervention
Alerting	No structured alerting or status notifications	Failures went undetected until customers reported them
Pricing	Connectivity issues with external cloud provider billing APIs	Cost data was inaccurate or unavailable

When funding came in and the product needed to scale, the architecture underneath it was not ready. The decision was made to stop patching and do a full rebuild.

The team and how the engagement evolved

The engagement grew the way most of our strongest relationships do. It started with one person, proved its value, and expanded as the scope became clear.

Role	What Eprecisio owns
Platform engineering lead	Infrastructure architecture, Kubernetes operator design, cross-cloud networking
DevOps engineers (x2)	CI/CD, cluster lifecycle management, ArgoCD GitOps, Terraform modules
Full-stack engineer	Platform UI, customer-facing APIs, marketplace frontend
Product manager	Roadmap, PRDs, delivery process, sprint management
AI-native development	AI-assisted feature development and code quality processes

This is not a vendor relationship. Eprecisio owns the roadmap process, manages delivery, writes the PRDs, and makes architecture decisions. Dr. Murray focuses on business development, partnerships, and product vision. The engineering execution is ours.

The rebuild: what we actually did

The 2-month rebuild was not a rewrite of features. It was a reconstruction of the foundation the features run on.

Infrastructure and state management layer. The platform had no consistent state model. We designed and implemented a state management architecture that tracks every cluster, node, and workload across all three cloud environments in real time. Every provisioning operation now has defined state transitions with persistence and recovery paths.
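To make "defined state transitions with persistence and recovery paths" concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not the Stack8s implementation: the state names, the `ALLOWED` table, and the `NodeRecord` class are all hypothetical, and the in-memory `history` list stands in for whatever durable store the real platform uses.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class NodeState(Enum):
    # Hypothetical lifecycle states for a customer-registered node.
    REGISTERED = "registered"
    PROVISIONING = "provisioning"
    READY = "ready"
    FAILED = "failed"
    REMOVED = "removed"


# Every legal transition is listed explicitly; anything else is rejected.
ALLOWED = {
    NodeState.REGISTERED: {NodeState.PROVISIONING, NodeState.REMOVED},
    NodeState.PROVISIONING: {NodeState.READY, NodeState.FAILED},
    NodeState.READY: {NodeState.FAILED, NodeState.REMOVED},
    # Recovery path: a failed node can be retried or retired.
    NodeState.FAILED: {NodeState.PROVISIONING, NodeState.REMOVED},
    NodeState.REMOVED: set(),
}


class InvalidTransition(Exception):
    """Raised when a caller asks for a transition the table does not allow."""


@dataclass
class NodeRecord:
    name: str
    state: NodeState = NodeState.REGISTERED
    # Stand-in for a persisted transition log (an assumption for this sketch).
    history: List[Tuple[NodeState, NodeState]] = field(default_factory=list)

    def transition(self, target: NodeState) -> None:
        if target not in ALLOWED[self.state]:
            raise InvalidTransition(
                f"{self.name}: {self.state.value} -> {target.value}"
            )
        # Record the change before applying it, so the log is never behind.
        self.history.append((self.state, target))
        self.state = target
```

The point of the explicit table is that an unexpected request, such as moving a `READY` node straight back to `PROVISIONING`, fails loudly instead of leaving the platform with an inconsistent picture of the node.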

Networking for bring-your-own-hardware. Stack8s does not provision managed Kubernetes services. It deploys vanilla Kubernetes clusters on hardware that customers register from wherever that hardware lives. That means the networking layer has to handle arbitrary hardware from arbitrary environments connecting into a single control plane. We rebuilt the networking layer to handle hardware registration from any environment, normalise the connectivity model across mixed infrastructure, and maintain stable cluster networking as customers add or remove nodes from different physical or virtual locations.

GPU operator and compute management. We integrated the NVIDIA GPU Operator with custom resource allocators that give the platform real control over GPU scheduling, allocation, and monitoring across customer clusters.
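As a rough illustration of what an allocator layered on top of GPU inventory might decide, the sketch below picks a node for a GPU request using a best-fit rule. It is hypothetical throughout: the `allocate` function, the node names, and the best-fit policy are assumptions for the example, and it does not call any NVIDIA GPU Operator or Kubernetes API.

```python
from typing import Dict, Optional


def allocate(free_gpus: Dict[str, int], requested: int) -> Optional[str]:
    """Reserve `requested` GPUs on one node and return its name, or None.

    Best fit: choose the node with the fewest free GPUs that still satisfies
    the request, so larger nodes stay available for larger jobs.
    """
    candidates = [
        (count, node) for node, count in free_gpus.items() if count >= requested
    ]
    if not candidates:
        return None
    count, node = min(candidates)
    free_gpus[node] = count - requested  # update the inventory in place
    return node
```

For example, with `{"a": 4, "b": 2, "c": 8}` free, a request for 2 GPUs lands on `b` rather than fragmenting the larger nodes.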

Service mesh. We designed and implemented the service mesh layer for inter-cluster communication, traffic management, and observability, resolving the connectivity issues that had made the platform unpredictable.

Marketplace and plugin framework. Stack8s ships a marketplace of plugins that customers deploy directly into their clusters from within the platform. This includes AI Architect for AI workflow orchestration, Kubeflow for ML pipelines, Laravel stack integrations, and 100+ other open source Helm charts. We rebuilt the framework that governs how plugins are packaged, versioned, deployed, and updated across customer clusters, so every chart in the marketplace installs reliably regardless of what hardware the cluster is running on.

Customer onboarding infrastructure. Node registration and NACL creation for new customers were manual and error-prone. We automated the full onboarding flow so new customer environments provision without manual intervention.

Component	What we rebuilt	Technology
State management	Unified state layer across all cloud providers	Custom Kubernetes operators, etcd
Networking for BYOH	Stable cluster networking across hardware registered from any environment	Vanilla Kubernetes networking, custom node registration layer
GPU management	Operator-level GPU provisioning and allocation	NVIDIA GPU Operator, custom allocators
Service mesh	Fast, stable inter-cluster communication	Custom service mesh implementation
Plugin marketplace	Deployment framework for AI Architect, Kubeflow, Laravel stack, 100+ charts	Helm, ArgoCD, custom chart operator
Customer onboarding	Automated node registration and NACL creation	Terraform, Kubernetes admission controllers
Alerting	Structured cluster and node health alerting	Prometheus, Alertmanager
GitOps pipeline	Automated cluster lifecycle management	ArgoCD, GitHub Actions
Pricing integration	Reliable connectivity to cloud provider billing APIs	CAST AI integration, custom billing adapters

The hardest parts

Redesigning the infrastructure layer without taking the product offline. Stack8s had paying customers during the rebuild. The platform could not simply go dark for 2 months. The approach was to build the new infrastructure layer in parallel, migrate workloads incrementally, and cut over component by component.
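The component-by-component cutover can be pictured as a routing table that flips one entry at a time and flips it back if a health check fails. The sketch below is a simplified model under stated assumptions: the `Cutover` class, the `"legacy"` and `"new"` labels, and the health check callback are hypothetical names for illustration, not the tooling used in the rebuild.

```python
from typing import Callable, Dict, Iterable, List


class Cutover:
    """Tracks which layer ("legacy" or "new") currently serves each component."""

    def __init__(self, components: Iterable[str]) -> None:
        # Everything starts on the legacy layer.
        self.backend: Dict[str, str] = {name: "legacy" for name in components}

    def cut_over(self, component: str, healthy: Callable[[str], bool]) -> bool:
        """Switch one component to the new layer; roll back if the check fails."""
        self.backend[component] = "new"
        if healthy(component):
            return True
        self.backend[component] = "legacy"
        return False

    def remaining(self) -> List[str]:
        """Components still served by the legacy layer, in sorted order."""
        return sorted(n for n, side in self.backend.items() if side == "legacy")
```

Because each component is switched independently, a failed check affects only that component, and the rest of the platform keeps serving customers from whichever layer it was already on.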

Networking for arbitrary hardware configurations. Because Stack8s registers customer-owned hardware rather than provisioning managed cloud nodes, the networking layer has to handle a much wider range of physical and virtual configurations. Customers were registering nodes from bare metal, from private clouds, from VMware environments, and from various provider setups. Getting the control plane to maintain stable connectivity across all of these took significantly longer than it would have under a more constrained networking model.

State recovery for existing clusters. When we introduced the new state management layer, existing customer clusters had no state history. Building a reconciliation process that reconstructed accurate state for live clusters without disrupting them was the most technically delicate work of the rebuild. A single error would have made existing deployments unmanageable.
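One way to think about reconstructing state for live clusters is as a read-only diff between what the state layer has recorded and what is actually observed, followed by updates to the record only. The sketch below shows that shape. It is an assumption-laden illustration, not the reconciliation process described above: the function names, the action labels, and the `"unknown"` status are hypothetical.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, str]  # (action, node name, status)


def plan(recorded: Dict[str, str], observed: Dict[str, str]) -> List[Action]:
    """Compute the record changes needed to match what is actually running.

    Nothing here touches the cluster; it only compares two snapshots.
    """
    actions: List[Action] = []
    for node, status in sorted(observed.items()):
        if node not in recorded:
            actions.append(("adopt", node, status))  # live node with no history
        elif recorded[node] != status:
            actions.append(("update", node, status))
    for node in sorted(recorded.keys() - observed.keys()):
        actions.append(("mark_missing", node, recorded[node]))
    return actions


def apply(recorded: Dict[str, str], actions: List[Action]) -> Dict[str, str]:
    """Return a new record with the planned changes applied."""
    updated = dict(recorded)
    for action, node, status in actions:
        if action in ("adopt", "update"):
            updated[node] = status
        elif action == "mark_missing":
            updated[node] = "unknown"  # keep the entry; never delete on a guess
    return updated
```

Running `plan` again after `apply` against the same observation returns an empty list, which is the property that makes this kind of reconciliation safe to repeat on live clusters.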

Dead code and architectural debt. The AI-assisted rebuild process surfaced a significant amount of duplicate and dead code. Removing it required working out which code was genuinely unused and which was reached through uncommon paths that static analysis did not make obvious. This took longer than it would have on a clean codebase, but it was the right call.

Results

Metric	Before	After
Platform stability	Continuously failing	Stable. No recurring systemic failures since the rebuild.
State management	No consistent state	Real-time state tracking across all clusters and nodes
Customer onboarding	Manual intervention required	Fully automated node registration and environment setup
ML setup time	Weeks of manual GPU cluster configuration	Hours with automated GPU provisioning
Release velocity	Blocked by instability	Regular feature releases on structured sprint cadence
Chart deployment	Inconsistent, manual troubleshooting	Reliable across all 100+ open source charts
Team model	1 embedded engineer	5 to 6 engineers, PM, roadmap ownership
Product positioning	Pre-funding, unstable product	KubeCon presence, CAST AI partnership

"I've been working with Ehtisham personally since 2020, and when he brought the Eprecisio team in to rebuild and stabilize our platform, the quality of work matched everything I'd come to expect. The team hit the ground running, shipping features and cleaning up the platform without constant oversight. What I appreciate most is that the trust built over years of working together translated directly into how the team operates. Met Ehtisham a few times in person in London and the professionalism is consistent across the board. If you're a technical founder who needs a team that takes real ownership, Eprecisio delivers."
Dr. Jeremy Murray, Founder at Stack8s

Where the product is now

Stack8s is no longer a product that is struggling to be stable. It is a product built on a clear and defensible position in the market: organisations that need production-grade Kubernetes without surrendering data sovereignty to a managed cloud provider. That means your hardware, your environment, your rules on where workloads run. Dr. Murray is now taking that product to KubeCon, presenting at Kubernetes automation working groups, and building partnerships with infrastructure players like CAST AI around it.

The Eprecisio team is not winding down. The engagement is actively growing. Dr. Murray has explicitly asked to scale the Pakistan-based engineering team further rather than continuing to hire in the UK, where previous direct hires did not work out.

For how we structure and manage production Kubernetes infrastructure at this scale, see our InfraOps service.

If you are building an infrastructure platform and need a team that can work at this level of technical depth and own the delivery process, book a free 30-minute call.

Stack8s: Rebuilding a Kubernetes Platform That Actually Works