The Fine Art of Breaking My Own System (On Purpose)

Building CodexMCP is a daily exercise in controlled chaos. I know how to set up automation, I know how to make systems hum, and I certainly know how to push hardware to its limits. But there’s a difference between knowing something in theory and making it work in practice, at scale, without setting my Proxmox host on fire.

Lately, I’ve been refining CodexMCP’s full ISP infrastructure deployment, where 9 core VMs now auto-provision like clockwork:

  • os-core-1, os-core-2, os-core-3 – OpenSearch backbone
  • interfaces-core-1 – Development interfaces, OpenSearch Dashboards, Grafana
  • mariadb-core-1 – SQL data store
  • dns-core-1 – DNS services
  • dhcp-core-1 – Kea DHCP
  • apt-cache-1 – Package caching for efficiency
  • logs-core-1 – System-wide logging

This stack spins up, installs its software, and configures itself automatically. That’s the dream, right? Well…

What Happens When You Automate Too Well?

Once everything was automated, I did what any reasonable person would do—I tried to make it even faster. The original plan? Multi-thread the software installs across all VMs after cloning. The reality? I/O bottlenecks punched my system in the face.

It turns out that running 9 VMs and 4 VyOS routers on a single Proxmox host with spinning disks on ZFS is the equivalent of forcing a medieval town to build skyscrapers overnight using only wooden ladders.

With just two installs running at the same time, Proxmox started gasping for air. At four? Everything locked up like an overzealous firewall rule. It wasn’t CPU or RAM. The spinning disks simply ran out of IOPS under ZFS’s copy-on-write behavior.
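One way to confirm that the disks, not CPU or RAM, are the problem is to watch how much CPU time the Proxmox host spends in iowait while installs run. The sketch below is only an illustration, not part of CodexMCP: the function names are made up, and it assumes a Linux host where the first line of /proc/stat is the aggregate "cpu" line. Iowait is a rough proxy for disk pressure, not a direct IOPS measurement.

```python
def read_cpu_sample(path="/proc/stat"):
    """Return the aggregate 'cpu' line from /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into a list of integer tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples.

    Field order: user nice system idle iowait irq softirq steal ...
    Only the first eight fields are summed, because guest time is
    already counted inside user and nice.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0
```

To use it, take one sample with parse_cpu_line(read_cpu_sample()), wait a few seconds while an install runs, take a second sample, and pass both to iowait_percent. A number that climbs sharply as soon as a second install starts would point at storage rather than compute.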

Lessons in System Design (A.K.A. “Yes, I Should Have Seen That Coming”)

I know ZFS well. I know spinning disks aren’t built for this level of punishment. And I knew, at least in theory, that parallel installs would hammer them, but it still seemed like a great idea right up until the whole system started choking on metadata updates. This is the kind of thing you only truly appreciate when it happens to you in real time.

So, after watching my Proxmox instance flail around like a DSL modem at max line attenuation, I rolled back and made the call:

  • Single-threaded installs only, for now.
  • Performance tuning comes later, after baseline automation is stable.
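
The "one install at a time" rule can be treated as a concurrency cap rather than a rewrite. Here is a minimal sketch of that idea, not the actual CodexMCP code. The VM names come from the list earlier in this post. Everything else (install_software, run_installs, MAX_PARALLEL_INSTALLS) is a hypothetical name invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# VM names taken from the core VM list earlier in this post.
CORE_VMS = [
    "os-core-1", "os-core-2", "os-core-3",
    "interfaces-core-1", "mariadb-core-1", "dns-core-1",
    "dhcp-core-1", "apt-cache-1", "logs-core-1",
]

# Hypothetical knob: 1 means strictly one install at a time.
MAX_PARALLEL_INSTALLS = 1


def install_software(vm_name):
    """Placeholder for the real install step (hypothetical)."""
    print(f"installing on {vm_name}")
    return vm_name


def run_installs(vms, install=install_software,
                 max_parallel=MAX_PARALLEL_INSTALLS):
    """Call install(vm) for every VM, never more than max_parallel at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in the same order as the input list.
        return list(pool.map(install, vms))


if __name__ == "__main__":
    run_installs(CORE_VMS)
```

Keeping the cap as a single number means that later performance tuning can be a matter of raising it to 2 and watching the disks, instead of restructuring the install loop.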

The takeaway? You don’t just need to know how to automate. You need to understand deeply what you’re automating and what the system underneath can handle. Otherwise, you’re just adding speed to a train that’s already derailing.

What’s Next?

With the provisioning process locked down, the next phase is optimizing install efficiency, storage performance, and AI-driven automation. The system works, but I want it to work faster, smarter, and without turning Proxmox into a black hole of I/O wait times.

Until then, the lesson remains:
Sometimes, the best way to move fast is to slow down first.

-It's Always Something
--Bryan