In the ever-widening gyre of technology, we often find ourselves tinkering, adjusting, and pivoting. Today, I invite you on a journey through the layers of my monitoring setup, an expedition as illuminating as it is complex. We’ll explore why I shifted from a Kubernetes-based infrastructure, unravel the many roles of my assorted proxies, and delve into the realm of Elasticsearch. This is not just about technology; it’s about decision-making under constraints and the ceaseless quest for efficiency.
Why this particular setup, you might wonder? The answer lies partially in the requirement for agility, but mostly in the necessity for an intimate understanding of my external sites’ metrics. Though they might be low-traffic, these sites are more than mere placeholders on the web. One serves as a vital component of my résumé—a digital fingerprint in a world teeming with anonymous connections. So, it becomes paramount to not only know but also understand the traffic it attracts, a function served elegantly by monitoring and metrics.
This post aims to offer a comprehensive guide to my recent architectural decisions. It’s a deep dive, a detailed account for those keen on grasping both the larger picture and the minute, intricate details of a system designed for the curious, the questioners, and those unsatisfied with surface-level explanations.
So sit back, pour yourself a cup of something warm—or perhaps something a bit stronger—and let’s embark on this technological odyssey together.
The Starting Point: Kubernetes
Kubernetes is like a celestial dance of containers—precise, scalable, and almost poetic in its ability to manage a fleet of applications. Designed for high availability and fault tolerance, it offers seamless scaling and an array of features that, for a time, made it my go-to solution for monitoring my sites. Its orchestration abilities are undoubtedly powerful, automating deployments and balancing loads like a ringmaster in a digital circus. Its automated handling of Let’s Encrypt SSL certificates was especially convenient.
However, all dances come with their missteps. In Kubernetes’ case, these were its resource-hungry nature and complexity that only magnified as my needs became simpler. The layered abstractions it offered, though intellectually alluring, became burdensome for a small-scale operation like mine. It was akin to bringing a sledgehammer to tap in a nail; effective, yes, but overkill nonetheless.
And so, the decision to part ways was made not out of dissatisfaction but out of pragmatism. I found myself yearning for a leaner architecture, one that would allow me the freedoms I sought without the overhead and the maintenance demands. The purpose was to reserve computational power for tasks that truly required it, and to set up a more tailored monitoring system without the weight of a Kubernetes cluster hanging over it.
Thus, I turned my gaze toward a more custom-tailored solution, piecing together various components like a digital Frankenstein, each serving its unique function in the grand scheme of things. It was time to bid Kubernetes adieu, at least for this chapter of my technological journey.
The Role of OpnSense NAT
In the architecture of this digital symphony, OpnSense plays the first violin. It’s the initial point of contact for any traffic destined for my internal network. Unlike commercial-grade solutions that often come with a horde of features many will never touch, OpnSense provides a lean yet robust interface for network management, making it an ideal choice for my needs.
At its core, OpnSense functions as the guardian of the castle gates, deciding who gets through and who stays out. It filters incoming web traffic and routes it to the appropriate internal addresses. You might think of it as the initial triage nurse in an emergency room, evaluating cases and sending them where they’ll receive the best care—or in our case, the appropriate service.
However, one has to ponder where to place the load of computational tasks. While OpnSense is more than capable of handling domain-based routing, the strategy was to keep it as unburdened as possible. The thought process here was akin to reserving your best swordsman for the fiercest battles; why exhaust your frontline resource with tasks better delegated elsewhere?
Therefore, I made the calculated decision to offload domain-based routing to HAProxy, with OpnSense NAT simply forwarding web traffic its way. This lets OpnSense focus primarily on its security functionalities. The separation of duties allows each component to excel at what it does best, thereby increasing overall efficiency and stability. OpnSense continues to serve as the critical first stop, but freed of the intricate decision-making regarding domain traffic, it can perform its core duties with maximum efficiency.
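In practice, this amounts to two simple port-forward rules on OpnSense (configured in the GUI under Firewall > NAT > Port Forward). A hypothetical sketch, with a placeholder address for the HAProxy VM:

```text
WAN  TCP  *:80   ->  192.168.1.10:80    (HTTP, handed to HAProxy)
WAN  TCP  *:443  ->  192.168.1.10:443   (HTTPS/TCP, handed to HAProxy)
```

Everything else—rule evaluation, intrusion detection, the usual firewall duties—stays on OpnSense, untouched by domain logic.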
The Bridge: HAProxy
Stepping beyond the guarded gates of OpnSense, we encounter HAProxy—the maestro directing the flow of data like a conductor guiding an orchestra. Stationed in its own realm—a Virtual Machine situated outside the OpnSense purview—it’s here that the real alchemy of domain-based routing takes place.
HAProxy performs a dual role, managing both HTTP and TCP traffic. For the HTTP dance, it listens carefully to the domain names requested and routes them accordingly. In a parallel tune, it directs TCP traffic, particularly for SSL, ensuring encrypted data finds its way to the right end. Imagine it as a seasoned jazz musician, effortlessly switching between the trumpet and sax, giving each song its distinct melody.
So why anoint HAProxy as the chosen traffic cop amidst a universe of alternatives? The decision rests on its ability to wear multiple hats without breaking a sweat. It’s capable of complex routing decisions based on domain names for HTTP traffic, while also allowing the flexibility to handle plain TCP, essential for SSL traffic. By being incredibly configurable and resource-efficient, it suits the philosophical backbone of the system: lean, mean, and purpose-driven.
The idea here is not just to move packets from Point A to Point B, but to move them wisely. Routing decisions are often the unseen heroes in a network’s performance and reliability, and HAProxy excels in making those decisions quickly and efficiently. It is a critical linchpin in the grander scheme, bridging the austere simplicity of OpnSense with the specialized roles of Nginx and Elasticsearch. It’s the negotiator, the interpreter between what the world wants and what my internal network is prepared to offer.
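To make the dual HTTP/TCP role concrete, here is a minimal, hypothetical haproxy.cfg sketch; the domain and backend addresses are placeholders, and SNI inspection is the standard way to route TLS traffic without decrypting it:

```haproxy
# Route plain HTTP by Host header
frontend http_in
    bind *:80
    mode http
    use_backend nginx_http if { hdr(host) -i elastic.example.com }

# Route TLS by SNI, without terminating SSL here
frontend https_in
    bind *:443
    mode tcp
    tcp-request inspect-delay 5s
    tcp-request content accept if { req.ssl_hello_type 1 }
    use_backend nginx_tcp if { req.ssl_sni -i elastic.example.com }

backend nginx_http
    mode http
    server nginx 192.168.1.20:80

backend nginx_tcp
    mode tcp
    server nginx 192.168.1.20:443
```

Note that the 443 frontend runs in TCP mode: HAProxy never sees inside the encryption. It merely reads the SNI hint from the TLS handshake and passes the still-encrypted stream onward, leaving decryption to Nginx.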
The Gatekeeper: Nginx
On the very same virtual machine that hosts Elasticsearch, there resides another entity—Nginx. Think of it as the gatekeeper, the final arbiter that decides who gets an audience with the inner sanctum of data. It’s as if the VM itself were a bustling marketplace, and Nginx is the seasoned vendor who knows just where to place the fresh produce and how to deal with wandering eyes.
Nginx is a master of disguises, presenting two distinct faces. The first is that of the humble notary, verifying the credentials presented at the gates. This is where the Let’s Encrypt certificates come into play. Port 80 stands ever vigilant, open to the winds of the internet but solely for the purpose of confirming that, yes, the domain you are trying to reach is indeed in safe hands. Nginx handles the initial handshake, procuring and renewing these certificates so that any data that flows in and out is draped in a cloak of legitimacy.
The second guise is that of the stern bouncer at an exclusive club, operating under the shroud of port 443. Here it not only unwraps the SSL encryption, peeling off the layers to reveal the core data, but also carries out a final vetting through Basic Authentication. It asks for a password, a secret handshake before ushering you into Elasticsearch’s chambers. Because Nginx and Elasticsearch share the same living quarters—i.e., they reside on the same VM—this interaction is as seamless as it is secure.
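A minimal sketch of this dual-port arrangement might look like the following nginx.conf fragment; the domain, certificate paths, and credential file are hypothetical, and 9200 is Elasticsearch’s default HTTP port:

```nginx
server {
    listen 80;
    server_name elastic.example.com;

    # Answer Let's Encrypt HTTP-01 challenges; everything else goes to HTTPS
    location /.well-known/acme-challenge/ {
        root /var/www/letsencrypt;
    }
    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl;
    server_name elastic.example.com;

    ssl_certificate     /etc/letsencrypt/live/elastic.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/elastic.example.com/privkey.pem;

    # Final vetting: Basic Authentication before anything reaches Elasticsearch
    auth_basic           "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;

    # Same VM, so the final hop stays on localhost
    location / {
        proxy_pass http://127.0.0.1:9200;
        proxy_set_header Host $host;
    }
}
```

Because the proxy_pass target is 127.0.0.1, the decrypted traffic never leaves the machine—the last leg of the journey happens entirely within the VM’s own walls.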
But why place this burden on Nginx when Elasticsearch is capable of handling some of these tasks? Because specialization, my friend, is the mother of efficiency. By offloading SSL decryption and Basic Authentication to Nginx, Elasticsearch is free to do what it does best: sift through data at breakneck speeds, unburdened by the pleasantries of initial greetings and farewells.
In the end, Nginx is the multi-faceted jewel in this architecture, complementing the ensemble by fulfilling roles that help each of the other components stay true to their primary objectives. It keeps the certificates fresh, the data encrypted, and the unauthorized at bay, all while living harmoniously beside the very data it protects.
The Core: Elasticsearch
At the heart of this epic setup resides Elasticsearch, the core where all trails converge. If the other components are the limbs, the conduits and the gatekeepers, then Elasticsearch is the mind—analyzing, sorting, and delivering insights. Its primary role is that of the wise librarian, meticulously cataloging logs and metrics so they can be called upon when needed. Imagine a well-tended garden of data, each log a unique flower, each metric a rare herb; Elasticsearch is the gardener who knows every species by heart.
But the question arises—why version 7, especially when the tech world is ever-advancing, and version 8 is out there boasting newer capabilities? The answer lies not in what is possible, but in what is practical. You see, I’m not building a space shuttle; I’m crafting a well-oiled, functional cart. Version 7 brings to the table a simplicity and directness in the setup process that is particularly compatible with my choice of using Nginx as a reverse proxy and SSL handler. Version 8 enables built-in TLS and authentication by default, which would largely duplicate the work already delegated to Nginx; its newer features, though enticing, are simply not essential for the nature of this project.
Setting up certificates and utilizing an Nginx reverse proxy turns out to be smoother with Elasticsearch 7, as if the technology itself knows its place within the grand design. The setting sun is as beautiful as the midday sun; it doesn’t have to be high noon to appreciate the light. Similarly, the capabilities of Elasticsearch 7 are fully sufficient for my needs; the version is mature, stable, and like a vintage wine, it has aged well for the purpose it serves in this specific ensemble.
Thus, Elasticsearch 7 is my choice—a choice rooted in the soil of practicality, optimized for a very specific constellation of needs and parameters. It’s a reminder that the latest is not always the greatest for every situation, and that sometimes the wisdom lies in choosing the tool that fits the hand, not the one that merely shines the brightest.
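Configuration-wise, little is needed on the Elasticsearch 7 side when Nginx fronts it. A hypothetical elasticsearch.yml sketch for this single-node arrangement:

```yaml
# Bind only to localhost: the co-located Nginx is the sole doorway in
network.host: 127.0.0.1
http.port: 9200
# A one-node "cluster" is all this setup calls for
discovery.type: single-node
```

Binding to localhost is what makes the Nginx gatekeeper meaningful: port 9200 is simply unreachable from anywhere but the VM itself.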
Mapping the Data Flow
In a realm where bits and bytes are the currency, understanding the flow of data is akin to tracing a river’s path from its mountainous origin to its estuary. Let’s embark on this digital voyage, following the life cycle of a single request as it traverses through our carefully architected system.
- The Entry: The journey begins with Filebeat and Metricbeat requesting access to Elasticsearch on port 443. This initial contact with the digital world is captured by OpnSense NAT, acting much like a harbor master guiding ships into a busy port.
- The Handoff: OpnSense quickly ushers this request to the HAProxy VM. Think of HAProxy as a well-placed bridge that knows the lay of the land, directing HTTP and HTTPS traffic based on domain names like a seasoned traffic cop.
- SSL and Certificates: If it’s an HTTPS request, HAProxy sends it straight to Nginx on port 443. Here, Nginx wears two hats: it’s the gatekeeper verifying the SSL certificate and the handler for Let’s Encrypt renewals on port 80.
- Basic Authentication: Before proceeding, Nginx checks for Basic Auth credentials. If everything checks out, it prepares the request for its final leg of the journey.
- Final Destination: Nginx takes this authenticated and encrypted data and reverse proxies it from port 443 to Elasticsearch’s port 9200. Now, the request has reached the inner sanctum—the mind, the core, the beating heart of the system—Elasticsearch.
- Data Processing: Elasticsearch takes the incoming data, be it a log or a metric, and adds it to its organized repository. Queries are processed, insights are generated, and the information becomes a part of a larger, ever-evolving tapestry.
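On the sending side, the Beats only need to know the public doorway. A hypothetical filebeat.yml output sketch (Metricbeat’s is analogous); the host and credentials are placeholders matching the flow above:

```yaml
output.elasticsearch:
  hosts: ["https://elastic.example.com:443"]
  username: "beats_writer"
  password: "${BEATS_PASSWORD}"
```

From the Beats’ perspective, all the machinery in between—NAT, HAProxy, Nginx—is invisible; they simply speak HTTPS with Basic Auth to a single hostname.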
It may be helpful to visualize this complex dance with a diagram, providing a bird’s eye view of this intricate choreography of data. The lines of flow would look like the veins of a leaf, each playing its part in delivering lifeblood to the system.
So there it is—a map, a blueprint of the winding roads that every request travels. It’s a testament to the thought invested in each component, each turn and crossroad, and the seamless harmony they achieve together. This isn’t just a stack of technologies; it’s a symphony of purposeful choices, each note resonating with the ones that came before and after.
In the caverns of code and the valleys of virtual machines, we’ve undertaken a journey through a landscape molded by choices. Choices that reflect not just a need for operational efficiency but also a desire for architectural elegance. We’ve seen how one can build a fortress of logic, piece by piece, drawing from various tech realms to craft a monitoring system tailored to individual needs.
We began this odyssey with the dissolution of our alliance with Kubernetes, recognizing that sometimes the path toward simplicity requires the courage to depart from complexity. From there, we constructed bridges and gates—OpnSense, HAProxy, and Nginx—to usher our data safely and efficiently to its destination: the robust, reliable Elasticsearch 7.
What should you, dear reader, glean from this tapestry of configurations and solutions? Firstly, that no tool is an island. Each exists in a larger ecosystem, and knowing how to weave them together is a craft unto itself. Secondly, understand the weight of choice. We opted for Elasticsearch 7 over 8, not out of technical incapability but because it suited our needs without the unnecessary layers of complexity.
In the end, this post is less about the technology and more about the philosophy of choice. Whether it’s in deciding which firewall to use or how to funnel your traffic, each decision stems from a fundamental understanding of your own needs. And as those needs evolve, so too will this architecture, ever mutable like the river it resembles, carrying with it the marks of choices past and future possibilities.
Thus, our digital narrative finds its current rest, not as an end, but as a waypoint on the everlasting road of technological evolution. May this post serve as a map for your own journey, a catalyst for your own questions, and perhaps, an inspiration for your own solutions.