vulcanridr

Monitoring vs observability...And getting my brain around it

As I said in an earlier post, I sought to replace Zabbix as my monitoring solution (which I have accomplished), because it was entirely too heavy for a 20-30 host homelab environment, and because it had, um, less-than-stellar support for BSD and ZFS (like reporting "pool down" when a single drive errors, which is not a good look to management).

So I looked around for other options, and found Monit and M/Monit, which looked like a winner...Until the company told me that a license for my 20 node homelab is $250/year to get updates (the license is at least currently perpetual, but you only get updates to the software for the first year). So in my mind, I was like "yeah, that's not happening." I pointed out that I had introduced other open source tools to work where we have several hundred nodes, but their response was radio silence. I will find out what happens to the demo license in 10 days.

So it was back on the hunt. I am basically looking for a monitoring tool for my homelab that doesn't look like something from 1998, doesn't make you configure each host by hand for every monitored service (lookin' at you, Nagios...), and is aware of the unique features of FreeBSD and ZFS (and maybe at some point the unique features of the other BSDs), instead of a monitor that is full of linux-isms.

I'm learning a lot on the fly...Like the fact that there is a difference between "monitoring" and "observability." Monitoring is the status of hosts, gear, and services: up/down, and other functional features of the network. Observability gives you a second-by-second timeline of how a host or piece of gear is performing, with graphs over time. Coming from a monitoring sensibility, it is like drinking from a firehose, and at this point in my research, it seems that alerting is a bolted-on afterthought.
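To get the difference straight in my own head, here is a toy sketch in Python. It is not how any of these tools actually work, and every name in it (is_up, Series, the 60 second timeout) is made up for illustration. The monitoring side asks one yes/no question about right now; the observability side keeps a history you can ask questions of later.

```python
from collections import deque
from statistics import mean


def is_up(last_seen: float, now: float, timeout: float = 60.0) -> bool:
    """Monitoring-style question: did the host check in recently enough?"""
    return (now - last_seen) <= timeout


class Series:
    """Observability-style data: timestamped samples kept over time."""

    def __init__(self, maxlen: int = 3600):
        # A bounded deque drops the oldest samples once maxlen is reached.
        self.points = deque(maxlen=maxlen)

    def add(self, ts: float, value: float) -> None:
        self.points.append((ts, value))

    def average_since(self, since: float):
        """Average of all samples with a timestamp at or after `since`."""
        values = [v for ts, v in self.points if ts >= since]
        return mean(values) if values else None
```

The first function only ever gives you a single bit of status. The second gives you something to graph, which is where the firehose comes from.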

So next up on my list of tools to try is grafana + influxdb + telegraf. One of the problems with this solution is that it seems to be a rapidly moving target that the documentation has a hard time keeping up with. In addition, the tool set seems like a monitoring erector set. Or, at the risk of showing my age, one of those Radio Shack project kits where you get a piece of breadboard and a pile of components and you are supposed to build a radar detector or a crystal radio or some such...With instructions written in Taiwan in broken English.

What I have figured out so far is that you have one or more applications (collectors, or "exporters" in prometheus-speak) that run on the end piece of gear (host, switch, application, etc), e.g. node_exporter or telegraf. These collect data from the host and get it to what I refer to (for lack of a better term) as the middleware. This can be something like influxdb or prometheus. It appears that prometheus pulls (scrapes) data from the endpoint (e.g. node_exporter), whereas influxdb has data pushed to it from the endpoint (e.g. telegraf). In addition, the grafana website has a metric crapton of plugins and application-specific end apps, like loki for logging, jaeger, etc.
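The pull versus push difference is easier for me to see as a toy model. This Python sketch is not real prometheus, influxdb, node_exporter, or telegraf code; the class names (Exporter, PullStore, PushStore, Agent) and the "load" metric are invented for illustration. The only point is who initiates the transfer.

```python
from dataclasses import dataclass, field


@dataclass
class Exporter:
    """Pull model: sits on the host and waits to be asked (node_exporter style)."""
    host: str
    load: float

    def metrics(self) -> dict:
        return {"host": self.host, "load": self.load}


@dataclass
class PullStore:
    """prometheus-style store: it knows its targets and fetches from them."""
    targets: list
    samples: list = field(default_factory=list)

    def scrape(self) -> None:
        # The store initiates the transfer, one target at a time.
        for target in self.targets:
            self.samples.append(target.metrics())


@dataclass
class PushStore:
    """influxdb-style store: it accepts whatever is written to it."""
    samples: list = field(default_factory=list)

    def write(self, sample: dict) -> None:
        self.samples.append(sample)


@dataclass
class Agent:
    """Push model: the agent on the host knows where the store is (telegraf style)."""
    host: str
    load: float
    store: PushStore

    def flush(self) -> None:
        # The agent initiates the transfer.
        self.store.write({"host": self.host, "load": self.load})
```

In the pull case the list of hosts lives on the store side; in the push case each host has to be told where the store is. That seems to be the practical difference when you are configuring 20-30 hosts.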

The last layer of this observability game of pick-up sticks is grafana itself, which is the display engine. You can plug in one or more "data sources" (the middleware that is getting feeds from the collectors). On top of that are hundreds or thousands of dashboards that will display your data for you. Many of them are specific: there are several for vmware, many, many for linux, cisco switches, etc. It appears most of the dashboards that are BSD related are specifically for BSD appliances, and the generic server dashboards are all full of linux-isms. I haven't had time to learn how to create dashboards of my own, let alone add the bits and bobs to make all of the functionality work to get the information I wish to see. It feels very much like, as my boss is fond of saying, trying to assemble the plane while you are doing your takeoff roll.

To complicate things even further, grafana has three tiers: Open Source, Cloud, and Enterprise, and I'm presuming that some things work in only a subset of their offerings, specifically only in the paid offerings.

So a couple of years ago I had set up grafana with prometheus and node_exporter, and it worked reasonably well, until one of the data points it scraped for ZFS stats was taken away in a ZFS update. At that point, node_exporter started spamming my logs with errors, enough so that the logs were rolling over every couple of hours. However, I have read that the search for that data point was fixed in node_exporter, so I may try setting up prometheus + node_exporter on the jail on which influxdb + telegraf lives, and feeding them both to grafana.

I will keep plugging along, and try to become more familiar with it. At the moment, I'm getting enough basic data from the dashboard I found to at least keep an eye on the systems, even though a lot of the graphs say "no data" because they are looking for linux-specific data points, like kernel-specific items from linux's /proc.

Hopefully, when this is all said and done, I'll be able to work it like a guy I knew in the army that could rebuild an engine, end up with a 3 gallon coffee can of leftover parts, and the engine ran better than before.


Comments
  1. Fabian Ritzmann — Nov 7, 2025:

    Stumbled over your blog via the BSD Now podcast and I have been thinking exactly along your line of thought. I had actually been using Zabbix up until I recently reinstalled Debian on my monitoring server and I couldn't bring myself to reinstall Zabbix. It's too much effort and updates are difficult. Previously, I had run Prometheus + Grafana but the effort to maintain Grafana and the inconsistencies between platforms had been what drove me to Zabbix.

    All the alternatives have major drawbacks as you know but then I recently rediscovered Monitorix: https://www.monitorix.org/ . It is easy to set up and additional hosts can simply be added by the monitoring server polling the Monitorix web service on the host. I have yet to install it on OpenBSD but it claims that FreeBSD and OpenBSD are supported.

  2. vulcanridr — Nov 8, 2025:

    Thank you for your post. I will look at Monitorix. It does have a lot of monitors already set up, including many for FreeBSD as well as modern hardware. Looks like I'm going to be spinning up another jail to test it...

    Another thing that really irritated me about Zabbix was that the documentation for 7.x is nearly 2200 pages, and has a very meandering, yet repetitive style to it...

  3. Fabian Ritzmann — Nov 8, 2025:

    Actually, I mentioned this to a friend this morning and he pointed me to https://beszel.dev/ . It works great and looks much better and more modern than Monitorix. It's just a single Go binary and claims to support BSD as well. It was really quick to install on two Linux machines.

    The documentation talks a lot about Docker and agents connecting to the hub (server) but the agent binary comes with a built-in SSH server that makes it just as simple to have the hub connect to the agent.

  4. vulcanridr — Nov 9, 2025:

    Thanks, Fabian. I actually ran across Beszel, and it is on my "check later" list. I admit that all of the references to installing in/as a docker container kind of put it in the "later" category, but reading up on it now...Actually looks promising.

    Though, so far, it looks as if the hub/server piece is geared to linux only, from what I am seeing. Even the agent doesn't seem to have had a FreeBSD-specific version since 0.15.2, and the hub appears to be linux only.

    Also tried compiling. Cloned the 0.15.4 repo as #compiling in the guide states. Not overly familiar with go, but compiling did nothing:

    First attempt after cloning repo:

    go clean
    go: downloading go1.25.3 (freebsd/amd64)
    go: downloading github.com/blang/semver v3.5.1+incompatible
    rm -rf ./build

    Subsequent attempts give

    go clean
    rm -rf ./build

    My sense is that it is a linux monitoring system that has some limited support for the BSDs.

  5. Fabian Ritzmann — Nov 9, 2025:

    True, I overlooked that you want to run the server on BSD as well. I am running it on Debian and just need agents for OpenBSD.

    I don't have a proper build environment on my OpenBSD machines but looking at the instructions, make build should build the hub and agent. go clean will just wipe the build results.

    Monitorix would certainly be easier to install since it is pure Perl.

  6. vulcanridr — Nov 9, 2025:

    I will re-read the docs, but my understanding was make build-hub builds the hub, make build-agent builds the agent, and make builds both.

    Doesn't seem to matter. The hub is apparently unbuildable on FreeBSD. When I try to make build-hub, it reports:

    go mod tidy
    GOOS= GOARCH= go build -o ./build/beszel__ -ldflags "-w -s" ./internal/cmd/hub
    internal/site/embed.go:9:12: pattern all:dist: no matching files found
    *** Error code 1

    Stop.
    make: stopped in /tmp/beszel

    If I specify the OS and architecture:

    # make build hub OS=freebsd ARCH=make OS=amd64

    go mod tidy
    GOOS=amd64 GOARCH=make go build -o ./build/beszel-agent_amd64_make -ldflags "-w -s" ./internal/cmd/agent
    go: unsupported GOOS/GOARCH pair amd64/make
    *** Error code 2

    Stop.
    make: stopped in /tmp/beszel

    However,

    # make build-agent
    go mod tidy
    GOOS= GOARCH= go build -o ./build/beszel-agent__ -ldflags "-w -s" ./internal/cmd/agent

    No apparent errors. I'll contact the devs and see if a FreeBSD hub is in the cards, or if it is a dead end.

  7. vulcanridr — Nov 9, 2025:

    Opened a github discussion link. We'll see what the dev has to say...