Friday, February 6, 2026

CHERI & Fil-C: are they the C memory-safety solution?

Looking at the past

Pointers Were Never Meant to Be Just Numbers

(And C Didn’t Kill That Idea… We Did)

Ask a C programmer what a pointer is, and you’ll almost certainly hear:

“It’s just a memory address.”

That statement is true in practice, but historically wrong.

The idea that a pointer is “just a number” is not a law of computing. It is the result of a long chain of economic and engineering decisions that happened to align in the 1970s and 1980s.

Before that, and increasingly again today, a pointer was understood as something richer: a reference with rules, a capability, a guarded object.

And crucially: C did not invent flat memory. It merely adapted itself to it extremely well.

Before C: When Pointers Carried Meaning

Early computer systems had no illusion that memory access was harmless.

Machines such as the Burroughs B5000, the Unisys 1100/2200 series, and later the Lisp Machines all treated pointers as structured entities:

  • bounds-checked
  • tagged
  • validated by hardware
  • often unforgeable

A pointer was not an integer you could increment freely. It was closer to a capability, a permission to access a specific object.

This wasn’t academic purity. It was a necessity:

  • multi-user systems
  • shared memory
  • batch scheduling
  • safety over speed

These machines enforced correctness by design.

C Did Not “Flatten” Memory… It Adapted to It

It’s tempting to say:

“C introduced flat memory and unsafe pointers.”

That’s not quite true.

C was designed on the PDP-11, a machine that already had:

  • a flat address space
  • no hardware memory tagging
  • no segment or capability protection visible to the program

C didn’t invent this model… It embraced it.

But here’s the key point that often gets missed:

C was explicitly designed to be portable across architectures with very different memory models.

And that includes machines that did not have flat memory.

C on Non-Flat Architectures: The Forgotten Chapter

C was successfully implemented on:

  • segmented machines
  • descriptor-based systems
  • capability-like architectures

Including Unisys systems, where pointers were not simple integers.

As documented in historical work (and summarized well in begriffs.com – “C Portability”), early C compilers adapted to the host architecture rather than forcing a universal memory model.

On Unisys systems:

  • pointers were implemented using descriptors
  • arithmetic was constrained
  • bounds and access rules were enforced by hardware
  • the compiler handled translation

This worked because the C standard never required pointers to be raw addresses.

It required:

  • comparability
  • dereferenceability
  • consistent behavior

Not bit-level identity.

Henry Rabinowitz warned in Portable C that assuming pointer arithmetic behaves like integer arithmetic was already non-portable, and that was in the late 1980s.
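To make that concrete, here is a minimal C sketch (illustrative only) of the assumption in question; it happens to work on flat-memory machines and is guaranteed nowhere else:

#include <stdio.h>

int main(void) {
    int a[4] = {0, 1, 2, 3};
    int *p = &a[1];

    /* Portable: arithmetic on a pointer within the same array. */
    printf("%d\n", *(p + 2));              /* prints a[3] */

    /* Not portable: treating the pointer as a plain integer.
     * On descriptor- or segment-based machines the cast may lose
     * information, and the integer arithmetic is meaningless. */
    long raw = (long)p;
    int *q = (int *)(raw + sizeof(int));   /* undefined in general */
    (void)q;

    return 0;
}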

So what changed?

The Real Shift: Economics, Not Language Design

The shift didn’t come from C.

It came from:

  • cheap RAM
  • fast CPUs
  • simple pipelines
  • RISC philosophy
  • UNIX portability

Flat memory was faster to implement and easier to optimize.

Once x86 and similar architectures dominated, the hardware stopped enforcing:

  • bounds
  • provenance
  • validity

And since C mapped perfectly onto that model, it became the dominant systems language.

From that point on:

  • pointers became integers
  • safety became a software problem
  • memory bugs became a security industry

Not because C demanded it, but because the hardware no longer helped.

The Long Detour We Are Now Undoing

For decades, the industry tried to patch this with:

  • ASLR
  • stack canaries
  • DEP
  • sanitizers
  • fuzzers

All useful. None fundamental.

They treat symptoms, not causes.

Which brings us back, full circle, to the idea that started this story:

A pointer should carry metadata.

A Short Detour: Even x86 Wasn’t Always Flat

Before moving forward, it’s worth correcting one more common simplification.

Even the architecture most associated with “flat pointers”, x86, did not start that way.

In real mode, x86 used segmented addressing:

physical_address = segment × 16 + offset

This meant:

  • pointers were effectively split into two components
  • address calculation wasn’t trivial
  • different segments could overlap
  • the same physical memory could be referenced in multiple ways

It wasn’t a capability system (there were no bounds or permissions), but it was a reminder that pointer arithmetic was never universally “just an integer add.”
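As a quick sanity check of the formula above, here is the aliasing in a few lines of C (addresses chosen purely for illustration):

#include <stdio.h>
#include <stdint.h>

/* Real-mode 8086 translation: segment * 16 + offset. */
static uint32_t phys(uint16_t seg, uint16_t off) {
    return (uint32_t)seg * 16 + off;
}

int main(void) {
    printf("0x%05X\n", (unsigned)phys(0x1000, 0x0010));  /* 0x10010 */
    printf("0x%05X\n", (unsigned)phys(0x1001, 0x0000));  /* 0x10010 again */
    return 0;
}

Two different segment:offset pairs, one physical byte.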

What changed wasn’t the hardware’s ability to support structure.

What changed was that:

  • segmentation was seen as inconvenient
  • flat addressing was faster
  • compilers and operating systems optimized for simplicity

By the time protected mode and later 64-bit mode arrived, segmentation had been mostly sidelined. The industry standardized on:

Flat memory + software discipline

That decision stuck.

And that’s the world CHERI and FIL-C are now challenging.

Are CHERI and FIL-C fixing the problem?

The Common Idea Behind CHERI and FIL-C

At first glance, CHERI and FIL-C look very different.

One changes the CPU. The other changes the compiler.

But conceptually, they start from exactly the same premise:

A pointer is not an address. A pointer is authority.

Everything else follows from that.

The Shared Heritage: Capability Thinking

Both CHERI and FIL-C descend from the same historical lineage:

  • Burroughs descriptors
  • Lisp machine object references
  • Capability-based operating systems
  • Hardware-enforced segmentation

The core idea is simple:

A program should only be able to access memory it was explicitly given access to.

That means a pointer must carry:

  • where it points
  • how far it can go
  • what it is allowed to do
  • whether it is still valid

In other words: metadata.

The only real disagreement between CHERI and FIL-C is where that metadata lives.

CHERI: Making Capabilities a Hardware Primitive

CHERI takes the most direct route possible.

It says:

“If pointers are capabilities, the CPU should understand them.”

So CHERI extends the architecture itself.

A CHERI pointer (capability) contains:

  • an address (cursor)
  • bounds
  • permissions
  • a validity tag
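Conceptually, and only conceptually (real CHERI compresses all of this into a 128-bit capability plus a separate 1-bit validity tag), a capability looks something like this:

#include <stdint.h>

/* Illustrative layout; not the actual CHERI encoding. */
struct capability {
    uint64_t address;      /* the cursor: where it currently points   */
    uint64_t base;         /* lower bound of the referenced object    */
    uint64_t length;       /* how far the pointer is allowed to reach */
    uint32_t permissions;  /* load / store / execute / ...            */
    /* plus the validity tag, maintained by the hardware */
};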

The tag is critical:

  • it is stored out-of-band
  • it cannot be forged
  • it is cleared automatically if memory is corrupted
  • the CPU refuses to dereference invalid capabilities

This means:

  • no buffer overflows
  • no out-of-bounds accesses
  • no forged pointers
  • no accidental privilege escalation

And all of this happens without software checks.

The hardware enforces it.

This is not “fat pointers” in the C++ sense. This is architectural memory safety.

Importantly, CHERI preserves C semantics:

  • pointers still look like pointers
  • code still compiles
  • performance is predictable

But the machine simply refuses to execute illegal memory operations.

It’s the return of the capability machine, this time built with modern CPUs, caches, and toolchains.

By design, CHERI enforces only what can reasonably belong to the instruction set, leaving higher-level memory semantics to software.

FIL-C: Capability Semantics Through Compiler and Runtime

FIL-C starts from the same premise:

“C pointers need metadata.”

But instead of changing the hardware, it changes the compiler and runtime.

This choice allows FIL-C to enforce stronger guarantees than CHERI, but at the cost of changing not only pointer representation, but also object lifetime and allocation semantics.

In FIL-C:

  • pointers become InvisiCaps
  • bounds are tracked invisibly
  • provenance is preserved
  • invalid accesses trap

From the programmer’s point of view:

  • it’s still C
  • code still compiles
  • the ABI mostly stays intact

From the runtime’s point of view:

  • every pointer has hidden structure
  • every access is validated
  • dangling pointers are detected
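As a purely illustrative sketch (hypothetical names and layout, not FIL-C’s actual InvisiCap representation), the kind of check such a runtime performs on every access looks like this:

#include <stddef.h>
#include <stdlib.h>

/* Hypothetical software capability: the raw pointer plus hidden metadata. */
typedef struct {
    char   *ptr;    /* the address the program thinks it has    */
    char   *base;   /* start of the object it was derived from  */
    size_t  size;   /* object size, for bounds checking         */
    int     alive;  /* cleared when the object is freed         */
} checked_ptr;

static void check_access(const checked_ptr *p, size_t nbytes) {
    if (!p->alive ||
        p->ptr < p->base ||
        (size_t)(p->ptr - p->base) + nbytes > p->size) {
        abort();  /* trap deterministically instead of corrupting memory */
    }
}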

FIL-C and CHERI start from the same idea (pointers as capabilities) but deliberately apply it at very different semantic depths.

At this point, the similarity ends.

While CHERI limits itself to enforcing spatial safety and capability integrity, FIL-C necessarily goes further. In order to provide temporal safety, FIL-C must change the object lifetime model itself.

In FIL-C, deallocation is decoupled from object lifetime: freed objects are often quarantined, delayed, or kept alive by a garbage collector. Memory reuse is intentionally conservative, because temporal safety cannot coexist with eager reuse.

This is not an implementation choice but a semantic requirement, and it has consequences that go well beyond pointer representation.

The Reality Check: It’s a Runtime, Not Just a Compiler

It is important to distinguish FIL-C from a simple “safe compiler”. Because it enforces temporal safety through its object lifetime model, it must bypass standard allocators such as glibc’s malloc. That means the high-performance concurrency optimizations developers expect from those allocators (per-thread arenas, caches) are gone, replaced by a managed runtime.

Furthermore, because FIL-C requires this complex runtime support, it is currently a user-space tool. Using it to build something like the Linux kernel is architecturally infeasible.

For standard userspace applications, the trade-offs can be summarized as follows:

Aspect             CHERI                    FIL-C
Enforcement        Hardware                 Software
Pointer metadata   In registers & memory    In runtime structures
Performance        2%–5% overhead           50%–200% overhead
Deployment         Requires new hardware    Works today

The higher cost of FIL-C is not primarily due to pointer checks, but to the changes required in allocation, lifetime management, and runtime semantics to enforce temporal safety.

CHERI makes the CPU safe. FIL-C makes the language safe.

Same Idea, Two Execution Models

CHERI and FIL-C are not competitors. They are two implementations of the same philosophy.

They both assert that:

  • pointer provenance matters
  • bounds must be enforced
  • safety must be deterministic
  • memory errors are architectural, not stylistic

They differ only in where that logic lives.

You can think of it this way:

  • CHERI -> capabilities in silicon
  • FIL-C -> capabilities in software
  • MTE (Arm Memory Tagging Extension) -> capabilities with probabilities

Different tradeoffs. Same destination.

Why This Matters Now: The ELISA Perspective

At first glance, CHERI and FIL-C may look like academic exercises or long-term research projects. But their relevance becomes much clearer when viewed through the lens of ELISA.

ELISA exists for a very specific reason: Linux is increasingly used in safety-critical systems.

That includes:

  • automotive controllers
  • industrial automation
  • medical devices
  • aerospace and avionics
  • robotics and energy infrastructure

And Linux, for all its strengths, is still fundamentally:

A large C codebase running on hardware that does not enforce memory safety.

The Core Tension ELISA Faces

ELISA’s mission is not to redesign Linux.

It is to:

  • make Linux usable in safety-critical contexts
  • support certification efforts (ISO 26262, IEC 61508, etc.)
  • improve predictability, traceability, and robustness
  • do this without rewriting the kernel

That creates a fundamental tension:

  • Linux is written in C
  • C assumes unsafe pointers
  • Safety standards assume bounded, analyzable behavior

Most current ELISA work focuses on:

  • process isolation
  • static analysis
  • restricted subsets of C
  • coding guidelines
  • runtime monitoring
  • testing and verification

All valuable. All necessary.

But none of them change the underlying truth:

The C memory model is still unsafe by construction.

Why CHERI and FIL-C Enter the Conversation

CHERI and FIL-C do not propose rewriting Linux.

They propose something more subtle:

Making the existing C code mean something safer.

This matters because they address a layer below where most safety work happens today.

Instead of asking:

  • “Did the developer write correct code?”

They ask:

  • “Can the machine even express an invalid access?”

That’s a fundamentally different approach.

CHERI in the ELISA Context

CHERI is interesting to safety engineers because:

  • It enforces memory safety in hardware
  • Violations become deterministic faults, not undefined behavior
  • It supports fine-grained compartmentalization
  • It aligns well with safety certification principles

But CHERI is also realistic about its scope:

  • It requires new hardware
  • It requires a new ABI
  • It is not something you “turn on” in existing systems

Which means:

CHERI is not a short-term solution for ELISA, but it is a reference model for what “correct” looks like.

It provides a concrete answer to the question: “What would a memory-safe Linux look like if we could redesign the hardware?”

FIL-C in the ELISA Context

FIL-C sits at the opposite end of the spectrum.

It:

  • runs on existing hardware
  • keeps the C language
  • enforces safety at runtime
  • integrates with current toolchains

This makes it immediately relevant as:

  • a verification tool
  • a debugging platform
  • a reference implementation of memory safety
  • a way to experiment with safety properties on real code

But it also comes with trade-offs:

  • performance overhead
  • increased memory usage
  • reliance on runtime checks

So again, not a drop-in replacement, but a valuable experimental lens.

The Direction Is Clear (Even If the Path Is Long)

The real contribution of CHERI and FIL-C is not that they make C safer by rewriting it.

It is that they show memory safety can be improved by changing the semantics of pointers, while leaving existing code largely untouched.

This distinction matters.

Large systems like Linux cannot realistically be rewritten. Their value lies in the fact that they already exist, have been validated in the field, and continue to evolve. Any approach that requires wholesale code changes, new languages, or a redesigned programming model is unlikely to be adopted.

CHERI and FIL-C take a different approach. They act below the source level:

  • redefining what a pointer is allowed to represent
  • enforcing additional semantics outside application logic
  • turning undefined behavior into deterministic failure

In doing so, they demonstrate that memory safety can be introduced beneath existing software, rather than imposed on top of it.

That insight is more important than either implementation.

It shows that the path forward for Linux safety does not necessarily run through rewriting code, but through reintroducing explicit authority, bounds, and permissions into the way memory is accessed, even if this is done incrementally and imperfectly.

Looking Forward

Neither CHERI nor FIL-C is something Linux will adopt tomorrow.

CHERI depends on hardware that is not yet widely available and will inevitably influence ABIs, compilers, and toolchains. FIL-C works on current hardware, but with overheads that limit its use to specific contexts.

What they offer is not an immediate solution, but a reference direction.

They suggest that meaningful improvements to Linux safety are possible if we focus on:

  • enforcing memory permissions more precisely
  • narrowing the authority granted by pointers
  • moving checks closer to the hardware or runtime
  • minimizing required changes to existing code

This leaves room for intermediate approaches: solutions that do not redefine the language, but instead use existing mechanisms, such as the MMU, permission models, and controlled changes to pointer usage, to incrementally reduce the scope of memory errors.

In that sense, CHERI and FIL-C are less about what to deploy next and more about what properties future solutions must have.

They help clarify the goal: make memory access explicit, bounded, and enforceable… without rewriting Linux to get there.

Sunday, January 25, 2026

When Clever Hardware Hacks Bite Back: A Password Keeper Device Autopsy

Or: how I built a USB password keeper that mostly worked, sometimes lied, and taught me more than any success ever did.

I recently found these project files buried in a folder titled “Never Again.” At first, I thought they didn’t deserve a blog post, mostly because the device has a mind of its own: it works perfectly when I’m just showing it off, but reliably develops stage fright the moment I actually need to log in. This little monster made it all the way to revision 7 of the PCB. I finally decided to archive the project after adding a Schmitt trigger: the component that was mathematically, logically, and spiritually supposed to solve the debouncing issues and save the day.

Spoiler: it didn’t.

Instead of a revolutionary security device, I ended up with a zero-cost, high-frustration random number generator built from whatever was lying in my junk drawer. It occasionally types my password correctly, provided the moon is in the right phase and I don’t look at it too directly. And yet… here we are.

The Idea That Seemed Reasonable at the Time

A long time ago, when “password manager” still meant a text file named passwords.txt, I had what felt like a good idea:

Build a tiny device that types passwords for me.

No drivers. No software installation. Just plug it in, press a button, and it types the password like a keyboard. From a security point of view, it sounded brilliant:

  • The OS already trusts keyboards
  • No clipboard
  • No background process
  • No software attack surface

If it only types, it can’t be hacked… right?

(Yes. That sentence aged badly.)

Constraints That Created the Monster

This was not a commercial project. This was a “use what’s on the desk” project.

So the constraints were self-inflicted:

  • MCU: ATtiny85 (cheap, tiny, limited)
  • Display: HD44780 (old, everywhere, slow)
  • USB: bitbanged (no hardware USB)
  • GPIOs: basically none
  • PCB: single-sided, etched at home
  • Budget: close to zero

The only thing I had plenty of was optimism.

Driving an LCD With One Pin (Yes, Really)

The first problem: The ATtiny85 simply does not have enough pins to drive an HD44780 display.

Even in 4-bit mode, the display wants more pins than I could spare. So I did what any reasonable person would do:

I multiplexed everything through one GPIO using RC timing.

By carefully choosing resistor and capacitor values, I could:

  • Encode clock, data, and select signals
  • Decode them on a 74HC595
  • Drive the display using time-based signaling

It worked. Mostly. But it was also:

  • Sensitive to temperature
  • Sensitive to component tolerances
  • Sensitive to how long the board had been powered on
  • Fundamentally analog pretending to be digital
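A quick back-of-the-envelope example (values hypothetical) shows why component tolerance alone is enough to hurt:

τ = R × C = 10 kΩ × 100 nF = 1 ms
with ±10% parts: 9 kΩ × 90 nF = 0.81 ms … 11 kΩ × 110 nF = 1.21 ms

Any threshold tuned around 1 ms now has to survive a ±20% swing, before temperature or supply voltage even enter the picture.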

Lesson #1:

If your protocol depends on analog behavior, you don’t really control it.

Abusing USB HID for Fun and (Almost) Profit

This is the part I still like the most.

The problem

A USB keyboard is input-only. You can’t send data to it.

So how do you update the password database?

The bad idea

Use the keyboard LEDs.

  • Caps Lock
  • Num Lock
  • Scroll Lock

They’re controlled by the host. And yes: you can read them from firmware.

The result

I implemented a synchronous serial protocol over HID LEDs.

  • Clocked
  • Deterministic
  • Host-driven
  • No timing guessing
  • No race conditions

And surprisingly: This was the most reliable part of the whole project. It was slow, sure. But passwords are small. And since the clock came from the host, it was rock solid.
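To make the idea concrete, here is a rough sketch of the receiving side (hypothetical helper name, not the original firmware), assuming the host drives Scroll Lock as the clock and Caps Lock as the data line:

#include <stdint.h>

/* HID boot-keyboard LED report bits: Num=0x01, Caps=0x02, Scroll=0x04. */
#define LED_CAPS   0x02   /* data line, set by the host tool */
#define LED_SCROLL 0x04   /* clock line                      */

extern uint8_t hid_led_state(void);   /* hypothetical: last LED report seen */

static uint8_t receive_byte(void) {
    uint8_t value = 0;
    for (uint8_t bit = 0; bit < 8; bit++) {
        while (!(hid_led_state() & LED_SCROLL)) ;  /* wait for clock high */
        if (hid_led_state() & LED_CAPS)
            value |= (uint8_t)(1 << bit);
        while (hid_led_state() & LED_SCROLL) ;     /* wait for clock low */
    }
    return value;
}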

Lesson #2:

The ugliest hack is sometimes the most reliable one.

The Part Nobody Warns You About: Scancodes

The update tool was a small Linux application that sent password data to the device.

Here’s the catch:

Keyboards don’t send ASCII. They send scancodes. And:

  • PS/2 scancodes ≠ USB HID scancodes
  • Layout matters
  • Locale matters
  • Shift state matters

So the database wasn’t a list of characters. It was a list of HID scancodes.
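For example, on a US layout the letter “P” is not one byte: it is HID usage 0x13 (“p”) plus a Left Shift modifier. So a stored entry is really a list of (modifier, usage) pairs, something like:

#include <stdint.h>

/* "Pa$$" as HID (modifier, usage) pairs on a US layout; not ASCII. */
static const uint8_t entry[][2] = {
    { 0x02, 0x13 },  /* Left Shift + p  -> 'P' */
    { 0x00, 0x04 },  /* a                      */
    { 0x02, 0x21 },  /* Left Shift + 4  -> '$' */
    { 0x02, 0x21 },  /* Left Shift + 4  -> '$' */
};

Send the same pairs to a machine with a French AZERTY layout and the “$” characters silently come out as “4”.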

That means:

  • The device was layout-dependent
  • The database was architecture-dependent
  • Portability was not free

This is one of those details nobody tells you until you trip over it yourself.

Lesson #3:

Text is an illusion. Keyboards don’t speak ASCII.

The USB Problem I Couldn’t Outsmart

Now for the real failure. The ATtiny85 has no USB hardware.

So USB had to be:

  • Bitbanged
  • Cycle-perfect
  • Timed in software
  • Extremely sensitive to clock drift

Sometimes it worked. Sometimes it didn’t enumerate. Sometimes it worked once and never again. Sometimes it depended on the USB host.

This wasn’t a bug. This was physics.

Lesson #4:

USB is not forgiving, and bitbanging it is an act of optimism.

The Hardware (Yes, It Actually Exists)

Despite everything:

  • I built two physical units
  • I etched the PCB myself (single-sided)
  • I assembled them by hand
  • They worked... Most of the time.

I still have them. They still boot. Sometimes.


Repository Structure (For the Curious)

The project is split into three parts:

PCB & Schematics

  • Single-layer board
  • Home-etched
  • All compromises visible
  • No hiding from physics

Host-Side Tool

  • Linux-based
  • Sends HID scancodes
  • Talks to the device via LED protocol
  • No ASCII anywhere

Firmware

  • Arduino-based
  • Third-party bootloader
  • USB bitbanging
  • Display driving
  • HID handling

What Actually Failed (and What Didn’t)

Failed

  • USB reliability
  • Display robustness
  • Timing assumptions
  • Environmental tolerance

Worked

  • HID LED protocol
  • Password logic
  • Conceptual design
  • Learning value

The irony

  • The part that looked insane... worked.
  • The part that looked standard... didn’t.

Wednesday, January 14, 2026

hc: an agentless, multi-tenant shell history sink (because you will forget that command)

For a long time, my daily workflow looked like this:
SSH into a server… do something clever… forget it… SSH into another server… regret everything.

I work in an environment where servers are borrowed from a pool. You get one, you use it, and sooner or later you give it back. This sounds efficient, but it creates a very specific kind of pain: every time I landed on a “new” machine, all my carefully crafted commands in the history were gone.

And of course, the command I needed was that one. The long one. The ugly one. The one that worked only once, three months ago, at 2 a.m.

A configuration management tool could probably handle this. In theory. But my reality is a bit messier.

The servers I use are usually borrowed, automatically installed, and destined to disappear again. I didn’t want to “improve” them by leaving behind shell glue and half-forgotten tweaks. Not because someone might reuse them, but because someone else would have to clean them up.

On top of that, many of these machines live behind VPNs that really don’t want to talk to the outside world or the collector living in my home lab. If SSH works, I’m happy. If it needs anything more than that, it’s already too much.

I wanted something different:

  • no agent
  • no permanent changes
  • no files left behind
  • no assumptions about the remote network

In short: leave no trace.

How hc was born

This is how hc (History Collector) started.

The very first version was a small netcat hack in 2023. It worked… barely. But the idea behind it was solid, so I kept iterating. Eventually, it grew into a proper Go service with a SQL backend (Postgres, for now).

The core idea of hc is simple:

The remote machine should not need to know anything about the collector.

No agent. No configuration file. No outbound connectivity.
Instead, the trick is an SSH reverse tunnel.

From my laptop, I open an SSH session like this:

  • a reverse tunnel exposes a local port on the remote machine
  • that port points back to my hc service
  • from the remote shell’s point of view, the collector is just 127.0.0.1
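In practice it is a single flag; the port and hostnames below are illustrative:

ssh -R 9099:hc.home.lab:9099 root@borrowed-server

On the remote side, anything sent to 127.0.0.1:9099 lands on the collector back home.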

This was the “aha!” moment.

Because the destination is always localhost, the injected logging payload is always identical, no matter which server I connect to. The shell doesn’t know it’s talking to a central service… and it doesn’t care.


Injecting history without leaving scars

When I connect, I inject a small shell payload before starting the interactive session. This payload:

  • generates a session ID
  • defines helper functions
  • installs a PROMPT_COMMAND hook
  • forwards command history through the tunnel

Nothing is written to disk. When the SSH session ends, everything disappears.
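A simplified sketch of that payload (the real one does a bit more bookkeeping; the port is illustrative):

HC_SESSION=$(head -c4 /dev/urandom | od -An -tx1 | tr -d ' \n')
hc_log() {
  printf '%s - %s - %s [cwd=%s] > %s\n' \
    "$(date +%Y%m%d.%H%M%S)" "$HC_SESSION" "$(hostname -f)" "$PWD" \
    "$(fc -ln -1 | sed 's/^[[:space:]]*//')" | nc -w1 127.0.0.1 9099
}
PROMPT_COMMAND=hc_log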

A typical ingested line looks like this:

20240101.120305 - a1b2c3d4 - host.example.com [cwd=/root] > ls -la

This tells me:

  • when the command ran
  • from which host
  • in which directory
  • and what I actually typed

It turns out this is surprisingly useful when you manage many machines and your memory is… optimistic.

Minimal ingestion, flexible transport

hc is intentionally boring when it comes to ingestion… and I mean that as a compliment.

On the client side, it’s just standard Unix plumbing:

  • nc for plaintext logging on trusted networks
  • socat for TLS when you need encryption

No custom protocol, no magic framing. Just lines over a pipe.
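Both ends are plain Unix tools; the addresses and CA path below are illustrative:

# plaintext, trusted network
printf '%s\n' "$line" | nc -w1 127.0.0.1 9099

# TLS, when the path is less friendly
printf '%s\n' "$line" | socat - OPENSSL:127.0.0.1:9099,verify=1,cafile=/etc/hc/ca.pem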

This also makes debugging very easy. If something breaks, you can literally cat the traffic.

Multi-tenancy without leaking secrets

Security became more important as hc grew.

I wanted one collector, multiple users, and no accidental data mixing. hc supports:

  • TLS client certificates
  • API keys

For API keys, I chose a slightly unusual format:

]apikey[key.secret]

The server detects this pattern in memory, uses it to identify the tenant, and then removes it immediately. The stripped command is what gets stored, both in the database and in the append-only spool.

This way:

  • secrets never hit disk
  • grep output never leaks credentials
  • logs stay safe to share

Searching is a different problem (and that’s good)

Ingestion and retrieval are intentionally separate.

When I want to find a command, hc exposes a simple HTTP(S) GET endpoint. I deliberately chose GET instead of POST because it plays nicely with the Unix philosophy.

Example:

wget \
  --header="Authorization: Bearer my_key" \
  "https://hc.example.com/export?grep1=docker&color=always" \
  -O - | grep prune

This feels natural. hc becomes just another tool in the pipeline.

Shell archaeology: BusyBox, ash, and PS1 tricks

Working on hc also sent me down some unexpected rabbit holes.

For example: BusyBox ash doesn’t support PROMPT_COMMAND. Last year, I shared a workaround on Hacker News that required patching the shell at source level.

Then a user named tyingq showed me something clever:
you can embed runtime-evaluated expressions inside PS1, like:

PS1="\$(date) $ "

That expression is executed every time the prompt is rendered.
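Applied to hc, the hook itself can live in the prompt; a hypothetical sketch, assuming an hc_log helper that prints nothing and forwards the last command through the tunnel:

PS1="\$(hc_log)$ "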

I’m currently experimenting with this approach to replace my previous patching strategy. If it works well enough, hc moves one step closer to being truly zero-artifact on every shell.

Where to find it (and what’s next)

You can find the source code and my BusyBox research notes.

Right now, I’m working on:

  • a SQLite backend for single-user setups
  • more shell compatibility testing
  • better documentation around injection payloads

If you have opinions about:

  • the ]apikey[ stripping logic
  • using PS1 for high-volume logging
  • or weird shells I should test next

…I’d genuinely love to hear them.