Starting to explore OpenTelemetry

This is the first entry in what I hope will become a long thread of contributions to OpenTelemetry. Before any code lands, I wanted to write down why I'm starting, because the reasons matter to me and because they are what will keep me going when the work gets unglamorous.

Why OpenTelemetry, and why now

My existing work — on ESA Pyxel, on GitLab core, on the LLM Workflow Router — keeps circling the same instinct: systems should be legible to the humans who depend on them. Validation catches invalid states at definition time. Schema-driven architecture makes structure explicit instead of implicit. Deterministic workflows make AI behavior auditable. These are all observability problems wearing different clothes.

OpenTelemetry is the same instinct, applied at the runtime layer. Where my Pyxel Config Lab work catches problems before a simulation runs, OTel makes it possible to see what happened while a system was running: across services, across vendors, across the messy reality of production. It's the missing runtime leg of the observability stool I've been quietly building.

It's also, practically, one of the most important open source projects in the world right now. Distributed systems are the substrate that almost everything runs on, including AI. And as AI systems grow more agentic — chaining tool calls, routing between models, retrieving context, falling back across providers — the ability to trace what actually happened becomes less a "nice to have" and more a precondition for trust.

Where I'm starting

I'm beginning slowly and on purpose. My plan for the first few months is to:

Read the OTel specification carefully, particularly around the Collector, semantic conventions, and the emerging GenAI conventions.
Set up a local Collector and instrument a small Python service end-to-end so I understand the pipeline from the inside before I try to improve it.
Look for "good first issue" tickets that overlap with my existing strengths — schema design, configuration validation, error messaging clarity.
Write up each contribution here, with the same care I've tried to give my Pyxel work.

I'd rather make one well-considered contribution than ten rushed ones. That's a value I've held throughout my career, and it's not something I want to drop just because the project is bigger.

What this article is not

This isn't a contribution write-up yet; there's no PR to point to. It's a marker. The next article in this folder will be the first real piece of work, and I want the throughline from why to what to be visible to anyone reading.

The cybernetic argument, expanded

On the homepage of this site I introduced second-order cybernetics — the move Heinz von Foerster made when he insisted that the observer is always inside the system they study. I want to use this article to say more about why I keep coming back to that idea, and why I think the second order is where the most important work has to happen.

No system operates in isolation

The premise that grounds everything else, for me, is this: no system operates in isolation. Not a piece of software, not a research lab, not an AI model, not a person, not a society. Everything we build is embedded in something larger, and the act of building is itself an intervention into that larger thing. A Python service emits signals into a network of other services. A scientific simulation produces results that researchers act on. An AI model trained on data produced by humans is then deployed back into the lives of those same humans, who change as a result, and whose changed behavior becomes the training data for the next generation.

The clean separation between "the system" and "its environment" is always an analytical convenience — useful, but never quite true. Cybernetics, at its best, is the discipline that takes this seriously.

The human–system layer is not optional

If systems are never isolated, then the layer where humans and systems interact is not an edge case to be handled — it is the most important layer of all. It's where the loop actually closes. It's where consequences become feedback. It's where designers, users, operators, and bystanders all show up as participants in something they only partly control.

And yet that layer is so often the least carefully designed part of any system I've encountered. We pour engineering effort into the parts the computer touches and treat the human interface as a thin film stretched over the top. That's backwards. The human–system interaction layer should be the most carefully planned part, because it's the layer where everything else's consequences become real.

This is what draws me to validation, error messaging, and developer experience as a serious engineering discipline. They're not garnish. They're the parts of the system that actually meet a human being — and they deserve the same rigor we reserve for performance and correctness.
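
As a small illustration of what that rigor might look like, here is a sketch of definition-time validation whose error messages are written for the person reading them. The ExporterConfig class and its fields are hypothetical, invented for this example; this is not a real Pyxel or OpenTelemetry schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExporterConfig:
    """Hypothetical config used only for this sketch."""

    endpoint: str
    timeout_seconds: float

    def __post_init__(self) -> None:
        # Validate at construction time, so an invalid config can never exist.
        if not self.endpoint.startswith(("http://", "https://")):
            raise ValueError(
                "endpoint must start with http:// or https://, "
                f"got {self.endpoint!r}. "
                "Example: endpoint='http://localhost:4318'"
            )
        if self.timeout_seconds <= 0:
            raise ValueError(
                "timeout_seconds must be greater than 0, "
                f"got {self.timeout_seconds}. Example: timeout_seconds=10"
            )


def describe_error(**fields) -> str:
    """Return the message a user would see, or 'ok' if the config is valid."""
    try:
        ExporterConfig(**fields)
    except ValueError as err:
        return str(err)
    return "ok"


print(describe_error(endpoint="localhost:4318", timeout_seconds=10))
print(describe_error(endpoint="http://localhost:4318", timeout_seconds=10))
```

Each message names the field, shows the value that was rejected, and gives one valid example. That is a design choice, not a rule, but it is the kind of detail that decides whether the message helps the person who hits it.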

Why the second order carries the biggest learning curve

There's a tempting move available to anyone who reads about second-order cybernetics: jump straight to the third order. Third-order cybernetics studies mutually observing systems — networks of observers watching each other, societies of agents whose observations shape each other's worlds. It sounds more advanced, more total, more philosophically complete.

I think this is a mistake, and here's why.

The third order is more philosophical than empirical, precisely because when you apply the model to a whole society, you can no longer measure or generalize cleanly. Every society is composed of irreducibly unique individuals, with different histories, different bodies, different cognitions, different ways of being in the world. Any third-order model that tries to capture "society observing itself" has to either smooth over those differences (which betrays the cybernetic commitment to taking the observer seriously) or admit that the model can only ever describe a particular constellation of particular people at a particular time. The third order is real and interesting, but it tends toward poetry more than engineering.

The second order, by contrast, is where the hard practical work lives. It says: you, the designer, are inside the system you are building, and your choices shape what counts as a successful outcome. That's specific enough to act on. It applies to one engineer writing one piece of validation code. It applies to one research team setting up one benchmark. It applies to one person deciding what error message a user will see at 2 a.m. when their config breaks. The second order is where the biggest learning curve falls because it's where individual practitioners have to change how they work, and changing how you work is harder than changing what you believe.

Third-order conversations are valuable, and I don't want to dismiss them. But I think a lot of people use third-order framing to avoid doing second-order work. It's easier to talk about how society as a whole should relate to AI than to sit with the harder question of how I, specifically, in this specific piece of code I'm writing right now, am embedded in the system I'm trying to make safer.

What this means for OpenTelemetry, and for me

OpenTelemetry is a second-order project whether or not it knows it. It takes seriously that the engineers maintaining a system need to be able to see into it: that the observers are participants, and that their ability to perceive what's happening is what makes correction possible at all. It's infrastructure for making feedback loops legible. And it's been built, deliberately, as a vendor-neutral commons, so that the loop doesn't get captured by any single company's interests.

That's the kind of work I want to be part of. Quiet, careful, foundational — the kind of contribution that doesn't go viral but does, very slowly, make the world of software more humane.

More entries will follow as the actual work begins. I'm glad you're here at the start. 🌙

Status: Opening note · first concrete contribution in progress · updates will appear in this folder as work is shipped.