Claes Adamsson

Rebuilding GNU ls in Koka

2026-05-06T00:00:00+00:00

In a previous post, I introduced Koka and why I started porting ls to it. But I never wrote about the actual backstory.

I stumbled onto Koka in my GitHub feed and it immediately got me hooked. This sentence alone is golden: “Koka is a strongly typed functional-style language with effect types and handlers that transpiles to C11”.

Right around then, I saw a LinkedIn post where someone had built a parallel ls in modern C++ and tagged John Crickett, which led me to his coding challenges.

If you really want to learn a language, you should be hands-on and do exercises. John has listed several fun and challenging exercises, but ls wasn’t on the list, so that was the one I picked 🙂

The Ambition

The goal: Rebuild GNU ls in Koka. 100% byte-for-byte compatible output.

That sounded like “a weekend project, maybe two” until I opened ls.c… The quick command I run a hundred times a day is a 5,000-line beast of C, with 83 CLI flags and decades of accumulated edge cases. It’s a masterpiece of over-engineering, and it’s rock solid.

Getting Into the Guts

I’m working against the 9.10 codebase, and the architecture is a four-stage gauntlet: Setup → Parsing → Execution → Output.

The setup alone is a 1,000-line jungle of globals and structs. decode_switches(), the monster that parses options, is nearly 600 lines long!

I have created a couple of notes if people want to learn more about the internals and tips on how to use GNU ls (and my version of course):

Decoded GNU ls – The architecture walkthrough.
Column Layout – The math behind -C.
Exit Codes – The three states of “oops.”
Usage Guide – Practical tips for GNU ls.
Recursive Listing – How -R traverses the tree.

The official docs and the man page are fine for users, but studying the source code is the only way to see how it really works.

The Roadmap

I drafted a plan of six phases, plus the infrastructure needed, such as a testing framework and proper CI using GitHub Actions.

Phase 1 — Foundation
Phase 2 — Which files are listed
Phase 3 — What information is listed
Phase 4 — Sorting the output
Phase 5 — General output formatting
Phase 6 — Formatting the file names

To keep myself in check, I have built a test framework (kunit) that diffs my output against the original GNU binary. CI fails immediately if I’m out of line…
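
The byte-for-byte comparison can be illustrated with a minimal sketch. It is written in Python rather than Koka and is not kunit's actual code; the function name first_difference and the two sample outputs are hypothetical.

```python
from typing import Optional


def first_difference(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return offset
    # One output is a prefix of the other: they differ where the shorter ends.
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


# Hypothetical captured outputs. In a real run these would come from
# executing both binaries with the same flags in the same directory.
gnu_output = b"a.txt  b.txt\n"
port_output = b"a.txt b.txt\n"
print(first_difference(gnu_output, port_output))  # prints 6
```

Comparing raw bytes instead of decoded text means a single missing padding space or a different newline counts as a failure, which is the point of a byte-for-byte goal.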

Why Koka?

ls is actually a perfect stress test for Koka’s unique features:

The Effect System: ls is a mix of pure logic (sorting/formatting) and messy side effects (disk I/O, stat() calls). Koka’s effects make that boundary visible in the types.
Data Modelling: File types, sort modes, format styles, indicator kinds. The enums in ls.c map naturally to Koka’s algebraic types.
Perceus: This is Koka’s secret weapon. It’s a reference-counting system that allows functional code to perform like C.
The FFI: Koka’s standard library is still growing, so I’ve had to write C shims for things like symlink detection. The FFI (Foreign Function Interface) has been surprisingly painless and fun.

Learning in Public

I’ll be the first to admit: I’m a decent but not pro programmer and a total Koka novice. GNU ls is not “beginner friendly” territory. The C is dense and Koka’s documentation is still thin compared to mainstream languages.

I’ve spent a lot of time “rubber-ducking” with a Genie to get through the nuts and bolts, figuring out why stat() and lstat() treat symlinks differently, or how -l is supposed to silently override -C.

This project doesn’t have a delivery date. It’s just me, a 40-year-old C program, and a language that really got me hooked. It’s been a blast.

I’m not going to rebuild all of Coreutils, no way. ls is more than enough. But the process has triggered a different curiosity: I want to build a small language that transpiles to Koka. 🤓 Follow along at koka-labs.

A Time Machine for Your Working Directory

2026-04-19T00:00:00+00:00

In a previous post, I introduced the Intent Log, a way to capture the why alongside the what during development. The Intent Log targets and solves a piece of the comprehension problem. But there’s a related problem it didn’t address: the safety problem.

In Trunk-Based Development, you sync with the remote constantly. That’s the whole point. tbdflow runs git fetch and git rebase --autostash under the hood, and originally I trusted that uncommitted files would stay untouched, but there are occasions when this assumption breaks. If a genie (or even you) runs a git pull, the work would be gone unless you do the git stash, git pull, git stash pop loop.

I’ve been building a feature I’m calling WIP Guard: (semi-)continuous, invisible safety snapshots that give you a time machine for your working directory.

The three ways work gets lost

There are three scenarios that keep developers anxious in a high-frequency integration workflow:

The buried stash. A rebase fails. Git auto-stashed your changes, but the stash is now tangled in a half-applied rebase. Less experienced developers might panic and even experienced ones will lose minutes untangling it.
The accidental overwrite. A developer runs git stash pop or git pull manually after a failed tbdflow sync. The previous working state is gone.
The context gap. You wrote a breadcrumb with tbdflow + "trying trait-based approach" twenty minutes ago. But the code that inspired that thought? It only exists in your working directory, which has since moved on. The note is orphaned from the state it describes.

WIP Guard solves all three.

How it works: `git stash create`

The key insight is that Git already has a mechanism for creating immutable snapshots of your working directory: git stash create. Unlike a regular git stash, this command:

Creates a commit object representing your index and working tree
Does not add it to the stash reflog (so it can’t interfere with your manual stashes)
Produces an immutable hash that stays in the Git object store until garbage collection (typically 14–30 days)

WIP Guard calls git stash create silently at strategic moments and stores the resulting hash in the Intent Log. It doesn’t introduce any new concepts; it just uses standard Git in a much more manageable way.
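
As a rough illustration of the mechanism, here is a minimal Python sketch of capturing a snapshot hash. It is not tbdflow's implementation; the names run_git and create_snapshot are hypothetical, and the runner is passed in as a parameter only so the example can be exercised without a repository.

```python
import subprocess
from typing import Callable, Optional


def run_git(args: list[str]) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def create_snapshot(run: Callable[[list[str]], str] = run_git) -> Optional[str]:
    """Return the hash of a new snapshot commit, or None for a clean tree.

    `git stash create` prints the new commit's hash, and prints nothing
    when there are no local changes to snapshot.
    """
    snapshot_hash = run(["stash", "create"]).strip()
    return snapshot_hash or None


# Demonstration with a stand-in runner, so no repository is needed.
print(create_snapshot(lambda args: "a7b8c9d0e1\n"))  # prints a7b8c9d0e1
print(create_snapshot(lambda args: ""))              # prints None
```

A hash captured this way can later be handed to git stash apply, which accepts any commit created by git stash create.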

The four hooks

WIP Guard is “continuous” because it hooks into commands you’re already running. I didn’t want to add a new command, but rather to integrate it as helpful magic.

Hook A: The Intent Hook

Every time you record a breadcrumb with tbdflow +, the CLI now captures a snapshot alongside the note:

$ tbdflow + "factory pattern is too complex, switching to traits"
Note recorded: "factory pattern is too complex, switching to traits"
WIP snapshot: a7b8c9d0e1

That hash is stored inside the note object in .tbdflow-intent.json. The breadcrumb is a thought linked to the exact code that inspired it. You can go back to the code as it was at the moment of realisation.

Hook B: The Sync Hook

Before tbdflow sync initiates a rebase, it now does two things:

Anti-collision check. It verifies that .git/REBASE_HEAD, .git/MERGE_HEAD, and .git/CHERRY_PICK_HEAD don’t exist. If a git operation is already in progress, tbdflow halts and tells you to resolve it first. This alone prevents a whole class of “Git ghost” states.
Pre-sync snapshot. Regardless of whether you’ve added notes, tbdflow captures a safety snapshot and stores it as last_sync_snapshot in the intent log. If the rebase goes sideways, your pre-rebase state is one command away.
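
The anti-collision check can be sketched in a few lines of Python. This is an illustration under assumptions, not tbdflow's code: the function name operations_in_progress is hypothetical, and a temporary directory stands in for the real .git directory.

```python
import tempfile
from pathlib import Path

# Marker files named in the post.
IN_PROGRESS_MARKERS = ("REBASE_HEAD", "MERGE_HEAD", "CHERRY_PICK_HEAD")


def operations_in_progress(git_dir: Path) -> list[str]:
    """Return the marker files that currently exist inside the .git directory."""
    return [name for name in IN_PROGRESS_MARKERS if (git_dir / name).exists()]


# Demonstration against a throwaway directory standing in for .git.
with tempfile.TemporaryDirectory() as tmp:
    fake_git_dir = Path(tmp)
    print(operations_in_progress(fake_git_dir))  # prints []
    (fake_git_dir / "MERGE_HEAD").write_text("0" * 40 + "\n")
    print(operations_in_progress(fake_git_dir))  # prints ['MERGE_HEAD']
```

A caller would halt when the returned list is non-empty and report which operation needs resolving first.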

Hook C: The Radar Hook

tbdflow radar already scans for overlapping work on remote branches. Now it also silently captures a snapshot if your working directory is dirty and the last snapshot is more than 30 minutes old. This catches the case where a developer is heads-down coding and hours pass between syncs.
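
The radar rule boils down to a small decision function. The Python sketch below is illustrative only; should_snapshot is a hypothetical name, and treating "no previous snapshot" as stale is an assumption the post does not spell out.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_SNAPSHOT_AGE = timedelta(minutes=30)  # threshold stated in the post


def should_snapshot(
    is_dirty: bool, last_snapshot: Optional[datetime], now: datetime
) -> bool:
    """Decide whether a radar scan should capture a new snapshot."""
    if not is_dirty:
        return False  # nothing uncommitted, nothing to protect
    if last_snapshot is None:
        return True  # assumption: no previous snapshot counts as stale
    return now - last_snapshot > MAX_SNAPSHOT_AGE


now = datetime(2026, 4, 18, 15, 0)
print(should_snapshot(True, datetime(2026, 4, 18, 14, 15), now))  # 45 minutes old: prints True
print(should_snapshot(True, datetime(2026, 4, 18, 14, 42), now))  # 18 minutes old: prints False
```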

Hook D: The Undo Hook

This one is easy to overlook. tbdflow undo is the “panic button”. It checks out main, fast-forwards, and reverts a commit. That’s a destructive sequence for your working directory. If you had uncommitted changes on a feature branch when you hit the panic button, they’d be gone.

Now, tbdflow undo captures a snapshot and runs the same anti-collision check before doing anything. The panic button has its own safety catch.

The escape hatch: `tbdflow recover`

All of this snapshotting would be pointless without a clean recovery path. That’s tbdflow recover:

$ tbdflow recover --list
Available WIP snapshots:
  #     Type       Timestamp              Note                                     Hash
  ------------------------------------------------------------------------------------------
  1     intent     2026-04-18T14:15:00     trying trait-based approach              a7b8c9d0e1
  2     intent     2026-04-18T14:42:00     added error variants                     f2e3d4c5b6
  3     pre-sync   2026-04-18T15:01:00     Pre-sync safety snapshot                 e1f2a3b4c5

To restore:

$ tbdflow recover 1
Warning: This will apply the snapshot over your current working directory.
Snapshot applied successfully.
The snapshot remains available for future recovery.

Note that we use git stash apply, not git stash pop. The snapshot commit stays in the Git store, so you can apply it again if needed. The snapshot is immutable, which means you can’t accidentally destroy your safety net.

The lifecycle: what happens when you commit?

So what happens when you commit? If snapshots live in the Intent Log, and the Intent Log gets consumed by tbdflow commit, don’t the snapshots disappear?

Yes, they do, and that’s by design. The commit is the lifecycle boundary. Once your work is committed and pushed to trunk, it’s in git history. The snapshots have served their purpose. When tbdflow clears the intent log after a successful push, it tells you:

Releasing 3 WIP snapshot(s) — your work is now in git history.
Intent log consumed and cleared.

The Git objects themselves aren’t deleted; they linger in the object store until garbage collection. But the references are gone, and that’s fine. Your work is safe in a commit now.

There’s an important subtlety here: this only happens on trunk commits. If you’re committing to a feature branch, the intent log (and all its snapshots) is preserved. The safety net stays up until your work reaches main.

The stale-branch guard

One important safety detail: if you switch branches and try to recover a snapshot from a different branch context, tbdflow warns you:

Stale intent log detected: notes were captured on 'feat/auth', but you are now on 'main'.

This prevents the subtle mistake of applying a snapshot from one feature’s context onto another feature’s working directory.
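
The guard itself is a comparison between the branch recorded in the intent log and the branch you are on. A minimal Python sketch, with the hypothetical name stale_log_warning and no claim about how tbdflow stores the branch internally:

```python
from typing import Optional


def stale_log_warning(log_branch: str, current_branch: str) -> Optional[str]:
    """Return a warning when the intent log belongs to another branch."""
    if log_branch == current_branch:
        return None
    return (
        f"Stale intent log detected: notes were captured on '{log_branch}', "
        f"but you are now on '{current_branch}'."
    )


print(stale_log_warning("feat/auth", "main"))
print(stale_log_warning("main", "main"))  # prints None
```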

Why this matters for TBD adoption

The number one objection I hear when advocating for Trunk-Based Development is: “What if I lose my work?”

It’s a fair concern. In a branch-heavy workflow, your feature branch is your safety net. You can push half-finished work to a remote branch and it’s backed up. In TBD, your work-in-progress lives locally until it’s ready for trunk. That’s a trust gap.

WIP Guard closes that gap. Every breadcrumb you drop, every sync you run, every radar scan — they all leave behind an immutable snapshot of your working directory. Your work is preserved automatically, without you doing anything different from what you’d normally do.

The recovery path is one command, not a Stack Overflow deep-dive into git reflog and git fsck.

Try it

WIP Guard is available now. If you’re already using the Intent Log, you get snapshots for free — just keep using tbdflow + as you work. If something goes wrong, tbdflow recover --list shows you what’s available.

Give it a try: tbdflow on GitHub.

Throughput is a safety feature!

Capturing Intent Before the Commit

2026-04-11T00:00:00+00:00

In previous posts, I’ve talked about the Comprehension Crisis: the risk that as we move faster, especially with AI agents, we lose the “why” behind our code. A git diff tells you what changed. It says nothing about what was tried and rejected. It’s a record of the result, but it deletes the tries and struggles.

I’ve been exploring a new feature in tbdflow to bridge that gap: the Intent Log.

The goal is pretty simple (but ambitious). I want to capture the developer’s (or agent’s) reasoning while it’s fresh, without adding the documentation tax that usually kills these efforts.

Low friction breadcrumbs

For many of us, jotting things down is a vital mechanism for staying in flow. We need to leave breadcrumbs to maintain context as we navigate complex problems.

The problem I see isn’t the act of writing; it’s the friction of the “diary.” We don’t want to leave our environment to update a formal document or a separate ticket. We want to capture those internal realisations or “failed tries” directly in the stream of work.

To solve this, I’ve added two ways to log a breadcrumb during development:

tbdflow note              # Log a note
tbdflow +                 # Shorthand alias

With +, the friction is close to zero. No shell escaping, no context switching, no separate file to manage. It also works directly, without running tbdflow task start first.

The discipline

I’ve added breadcrumb instructions to both the tbdflow SKILL.md and AGENTS.md, so the same rules apply whether you’re a human or an AI agent working in the repo:

Drop a breadcrumb whenever you change approach or reject an alternative.
Before a complex commit, there should be at least one or two breadcrumbs explaining major decisions.
Do not wait until commit time — log as you go.

The point is that reasoning captured after the fact is reconstruction. Reasoning captured in the moment is evidence. The intent log only works if the habit is there.

The lifecycle of a thought

The Intent Log follows a strict lifecycle designed to keep the repository clean while enriching the history.

Capture: The first time you use a note command, tbdflow creates a local .tbdflow-intent.json. This file is automatically added to .gitignore so it never accidentally hits the trunk.

Tasks: For more structured work, you can use tbdflow task start "Refactor auth logic". This sets a high-level context that all subsequent notes are attached to.

Integrate: When you run tbdflow commit, the CLI reads your notes and injects them into the body of the commit message, positioned before any breaking change or TODO footers.

Cleanup: Once the commit is pushed to the trunk, the local JSON file is deleted. The reasoning is preserved in git history, but the workspace is reset.
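
The Integrate step can be pictured with a small Python sketch that places notes in the commit body ahead of any footers. This is not tbdflow's formatter: the function name build_commit_message, the "- " bullet style for notes, and the sample subject and footer are all assumptions made for illustration.

```python
def build_commit_message(
    subject: str, notes: list[str], footers: list[str]
) -> str:
    """Assemble subject, breadcrumb notes, then footers, separated by blank lines."""
    sections = [subject]
    if notes:
        # Assumed layout: one bullet per breadcrumb, in the order recorded.
        sections.append("\n".join(f"- {note}" for note in notes))
    if footers:
        # Footers such as breaking-change or TODO lines stay last.
        sections.append("\n".join(footers))
    return "\n\n".join(sections) + "\n"


message = build_commit_message(
    "refactor(auth): switch to traits",  # hypothetical subject
    ["factory pattern is too complex, switching to traits"],
    ["BREAKING CHANGE: AuthFactory removed"],  # hypothetical footer
)
print(message)
```

Keeping footers last matters because tools that parse commit trailers expect them at the end of the message.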

Handling branch drift

One challenge with local logging is branch switching. If you start a task on a feature branch and switch to a hotfix, your notes shouldn’t follow you blindly.

tbdflow now tracks branch ownership in the intent log. If you try to add a note on a different branch than where you started, the CLI will warn you:

Stale intent log detected: notes were captured on 'feat/auth', but you are now on 'main'.

If you explicitly start a new task on the new branch, tbdflow will rebind the existing notes to the new context. The developer is always aware of which stream their thoughts belong to.

Why this helps with TBD adoption

Context for the auditor. In a Non-blocking Review, the code is already integrated. The auditor needs to know the intent to verify whether the implementation matches the goal. Seeing that a developer tried and rejected a pattern saves the auditor from suggesting the same thing. Hopefully it turns into a knowledge-sharing moment.

Agent transparency. When an AI agent is working, it can use the + shorthand to think out loud. The human developer can see the agent’s reasoning before the final commit is pushed. It turns the agent from a black box into a transparent colleague.

Fighting cognitive atrophy. By capturing these notes, the history of the struggle isn’t lost. Anyone picking up a task or reverting a change can see the decision-making process.

The result

The Intent Log captures metadata at the point of creation. It requires zero context-switching and provides value during the post-integration audit.

It makes for a more honest git history where the “why” is just as accessible as the “what.”

If you want to see how the Intent Log formats the final commit message, tbdflow includes a --dry-run flag that prints it before execution. You can read more about why I believe dry-runs are essential for workflow tools here.

Give it a try and let me know if the workflow fits you and your team: tbdflow on GitHub.

Throughput is a safety feature!

Porting Coreutils to Koka

2026-04-06T00:00:00+00:00

I like to explore different programming languages, and I often try them out, but most of them end up as a “Hello World” in a folder I never reopen, digital fossils of a Saturday afternoon curiosity… My actual day-to-day has been settled for a while: Kotlin when I’m building for the web, Rust when I need a CLI tool. I do think about them a lot, but their syntax, semantics, and functionality don’t tickle me the same way anymore.

Then Koka came along…

Why Koka

It’s a research language from Microsoft Research. Functional, but not in the way that usually makes me bounce off after an afternoon.

The dot selection syntax was the first thing that felt right. Coming from Kotlin, I’m used to chaining: names.filter(is-hidden).sort(cmp).foreach(println). Koka lets me write like that. No inside-out nesting. Data flows left to right, which is how I think about it anyway.

But what really got me was the effect system. In most languages, a function signature tells you what goes in and what comes out. In Koka, it also tells you what the function does. A signature like fn(path) -> <fsys,exn> list<string> says: this touches the filesystem and might throw. It’s right there on the tin and not buried in documentation.

The compiler enforces it. There is no way to sneak in a side effect; type systems are great like that. It’s “honest programming” in a way.

Porting `ls`

I don’t learn languages from tutorials. I need a real problem. So I started koka-labs, where I’m porting ls and wc from GNU Coreutils. First up: ls.

ls looks simple. List files, print strings. But a proper implementation touches a lot:

Sorting: I had to write an insertion sort to be able to reverse sort and to get a feel for how Koka handles recursion.
File metadata: Hidden files, permissions, timestamps. This is where I’ll hit the boundary between pure logic and actual OS interaction, which will be fun when I get to more “advanced” listing.
Perceus: Koka’s reference counting system. This is what makes functional code run at speeds you wouldn’t expect from a language without manual memory management.
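
To make the sorting item concrete, here is a recursive insertion sort with a pluggable ordering, sketched in Python rather than Koka. It is not the code from koka-labs; the names insert and insertion_sort are hypothetical, and it compares plain strings, whereas GNU ls sorts according to the locale.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def insert(sorted_items: list[T], item: T, before: Callable[[T, T], bool]) -> list[T]:
    """Insert item into an already sorted list, recursing down the list."""
    if not sorted_items:
        return [item]
    head, tail = sorted_items[0], sorted_items[1:]
    if before(item, head):
        return [item] + sorted_items
    return [head] + insert(tail, item, before)


def insertion_sort(items: Iterable[T], before: Callable[[T, T], bool]) -> list[T]:
    """Build the sorted result by inserting one element at a time."""
    result: list[T] = []
    for item in items:
        result = insert(result, item, before)
    return result


names = ["b.txt", "a.txt", "c.txt"]
print(insertion_sort(names, lambda a, b: a < b))  # prints ['a.txt', 'b.txt', 'c.txt']
print(insertion_sort(names, lambda a, b: a > b))  # prints ['c.txt', 'b.txt', 'a.txt']
```

Reversing the order only requires swapping the comparison, which is why a hand-written sort with an explicit ordering function is a convenient starting point for a reverse-sort flag.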

Working through it piece by piece, following the GNU philosophy, has been a good way to see how Koka keeps things clean while dealing with the mess of a real operating system.

I have even written some C to implement features missing from the stdlib that ls -F needs. This was super easy: I created a C file in the same directory with a couple of small functions, then imported it and wired it up.

// Import my C code
extern import
  c file "fs-inline.c"

// C FFI
extern is-symlink(p : string) : fsys bool
  c "kk_os_is_symlink"

What stands out so far

The effect system changes how I structure code. When side effects are visible in the types, separating pure logic from IO becomes natural. The language makes that the easiest path.

Perceus is the other thing. Functional languages have a reputation for being slow or memory-hungry. Koka’s approach to reference counting gives you immutability without the usual performance cost.

No rush

There’s no timeline on this, and no requirements. Just a compiler and a rabbit hole. I’ll keep porting tools and writing about what I find.

The repo is public if you want to follow along.

Beyond the Branch: The Social Fabric of Trunk-Based Development

2026-04-03T00:00:00+00:00

When discussing Trunk-Based Development (TBD), we often get bogged down in the mechanics: the branching strategy, the CI speed, or the revert logic. But as a colleague recently pointed out to me, moving away from Pull Requests (PRs) is a “drastic change” that impacts more than just Git history. It impacts the social fabric of the team.

The common fear is that without the “gate” of a PR, we lose our collective understanding of the repository. We worry that Seniors will lose sight of what Juniors are doing, and that the codebase will suffer from a form of cognitive atrophy.

In reality, safety in TBD does not come from the gate. It comes from moving from Gatekeeping to Continuous Inspection.

The Fallacy of the Gatekeeper

The Pull Request model assumes that awareness happens at the point of the merge. We believe that because three people looked at a 400-line diff, the team now “understands” the change.

In practice, this often creates a false sense of security. PRs frequently become bottlenecks where Seniors, overwhelmed by volume, perform a “rubber-stamp” review just to unblock a teammate. I explored this dynamic in more detail in The Pull Request Trap. This is where cognitive atrophy actually starts: when the review becomes a chore rather than a conversation.

TBD takes a different approach. By removing the block, we encourage the team to find more disciplined, continuous ways to share knowledge.

An Iteration in the Life of a TBD Team

To understand how this works, we have to look past the commands and see how the team actually interacts throughout a typical iteration of work.

In a PR-heavy world, you start your day by checking a backlog of notifications. In a TBD team, you start by looking at the stream.

The Sync: A developer runs tbdflow sync. Instead of a blind pull, the tool checks the CI status of the trunk. If it is red, they wait. This simple check prevents the “Monday Morning” frustration of pulling a broken build and spending an hour debugging someone else’s mistake.

The Radar: A Senior engineer runs tbdflow radar. They notice a Junior is currently touching a sensitive auth module. Instead of waiting for a PR two days later, the Senior reaches out immediately: “I see you’re in the auth logic; let’s pair for twenty minutes on the error handling.”

Knowledge is shared before the code is even committed, not as a post-hoc correction.

The Work: Executable Standards

TBD requires a high level of discipline. We replace the “PR Template” with an executable Definition of Done (DoD).

As the team works in small, atomic batches, tbdflow commit presents an interactive checklist. “Did you add tests?” “Is the documentation updated?” This moves the “manual checks” from a document no one reads into a CLI flow no one can ignore. The machine checks the machine, ensuring that only “Done” code reaches the trunk.

The Afternoon: The Audit Loop

This is where the collective understanding is maintained. In TBD, we practise Non-blocking Reviews (NBR).

The Senior spends thirty minutes reviewing the day’s “Digest.” They see a commit that is already live and passing tests. They notice a slight architectural misalignment. Instead of blocking the developer’s momentum, they raise a concern.

The Junior isn’t stopped. They simply integrate the Senior’s feedback into their next “fix-forward” commit. The learning loop is measured in hours, not days. The Junior grows faster because they are constantly receiving small, digestible pieces of feedback rather than a massive “Request Changes” dump at the end of a task.

Continuous Inspection over Gatekeeping

The “drastic change” of TBD is moving from a world of Checkpoints to a world of Streams.

PRs ensure people have “seen” the code, but TBD ensures people are aligned with the code. By integrating in tiny batches, the “target” for review is smaller and harder for bugs to hide in. By using tools like radar and review --digest, the team maintains a constant, peripheral awareness of the whole repository.

TBD doesn’t remove the team’s responsibility to understand the codebase. It just provides a faster, more honest way to achieve it.

Using tbdflow is entirely optional. The tool is built on standard git and gh CLI commands, and you can always perform these actions manually. If you want to look under the hood to see exactly how it works, tbdflow includes a --dry-run flag that prints every underlying command before it runs. You can read more about why I believe dry-runs are essential for workflow tools in Looking under the hood with a dry-run.

Throughput is a safety feature!

The Panic Button for Trunk-Based Development

2026-03-04T00:00:00+00:00

Trunk-Based Development is designed for speed. It removes waiting and keeps integration continuous. Code moves to the trunk quickly, often within minutes.

But fast integration has a requirement: the trunk must stay green.

Anyone who has pushed a commit to main, seen CI fail, and realised the rest of the team is now blocked knows the situation.

In TBD, the rule is simple: Fix it or revert it.

Speed only works when recovery is just as fast as integration.

The friction of manual reverts

In theory, git revert is simple. In practice, doing it while the team is waiting introduces unnecessary friction.

When the trunk is broken, you typically need to:

Make sure you are on main
Pull the latest changes so you are not reverting on a stale head
Find the correct SHA
Run the revert
Resolve any metadata or message issues
Push the change

None of this is complicated. But under time pressure, small mistakes happen. A wrong SHA, a stale branch, a revert on the wrong base.

The time saved by skipping a Pull Request can quickly turn into time spent stabilising the trunk.

Introducing `tbdflow undo`

With version 0.22, I added a simple command:

tbdflow undo

It is intentionally opinionated: the command is built for reliability.

When you run it, tbdflow:

Syncs with the remote trunk to ensure your local state is current
Verifies that the provided SHA exists on the trunk
Performs a clean revert with a conventional commit message
Pushes the revert immediately
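
For readers who want to see what such a sequence might look like in plain git, here is a hedged Python sketch that only builds the command list. It is not what tbdflow runs: undo_plan is a hypothetical name, the trunk and remote defaults are assumptions, and it uses git's default revert message rather than a conventional commit message.

```python
def undo_plan(sha: str, trunk: str = "main", remote: str = "origin") -> list[list[str]]:
    """Return the git commands for a sync, verify, revert, push sequence."""
    return [
        ["git", "checkout", trunk],
        # Refuse anything but a fast-forward so the local trunk mirrors the remote.
        ["git", "pull", "--ff-only", remote, trunk],
        # Exits non-zero unless the commit is reachable from the trunk tip.
        ["git", "merge-base", "--is-ancestor", sha, "HEAD"],
        # --no-edit keeps git's default revert message (an assumption here).
        ["git", "revert", "--no-edit", sha],
        ["git", "push", remote, trunk],
    ]


# Print the plan for a hypothetical SHA instead of executing it.
for command in undo_plan("a1b2c3d"):
    print(" ".join(command))
```

Building the plan as data first makes it easy to print before executing anything, in the spirit of a dry run.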

The goal is simple: remove the friction when you need the trunk green again, fast.

Fast recovery enables fast integration

Teams are often cautious about Trunk-Based Development because the main branch feels exposed.

In practice, safety in TBD does not come from gates. It comes from fast feedback and fast correction.

If reverting is easy and predictable, the cost of a mistake drops significantly. That lowers hesitation. Smaller commits feel safer. Integration stays frequent.

tbdflow undo is a small feature, but it reinforces an important principle:

Continuous integration only works when continuous recovery is equally simple.

Version 0.22 is available now. You can read more in the GitHub documentation or explore how this fits with non-blocking reviews and post-integration audits.

Throughput is a safety feature!

The Comprehension Crisis

2026-02-22T00:00:00+00:00

When we talk about the current AI surge, the conversation almost always centres on output.

More code produced per hour. Faster delivery cycles. Higher individual throughput. The charts from major LLM providers suggest a world where the “cost of intelligence” is dropping at an extraordinary pace.

But having spent years working in and observing how systems and teams actually perform, I’m increasingly concerned that we are optimising the wrong end of the pipeline. We are so focused on output that we are neglecting comprehension: the thinking, understanding, and learning that happen before a single line is written.

When reasoning and problem-solving become cheap and instantly available, we can produce solutions faster than we can understand them. The problem is not necessarily incorrect output, but a gradual loss of understanding about why things work and where they might fail.

I’ve started thinking of this as cognitive atrophy.

The feedback loop of understanding

In DevOps terms, using AI to bypass deep thought is similar to automating a deployment pipeline without understanding the delivery system behind it. You may get a short-term increase in speed, but you weaken the feedback loops that build long-term capability.

When we consistently outsource the act of thinking, we slowly lose the ability to reason deeply about the work itself. The output may look technically correct, but our internal mental model of why it works (and where it might fail) becomes thinner over time.

AI is excellent at removing productive friction. But in engineering, friction is often where learning happens. Wrestling with a difficult bug, tracing a production incident, or working through an architectural trade-off is how intuition and system understanding are built.

If a model generates the design, the code, and even the explanation, it becomes easy to move work forward without ever really owning the logic. The system looks faster, while the capability inside the system quietly degrades.

When solutions become cheap

This is also changing what experience and seniority mean. It used to be easy to recognise seniority in the ability to implement solutions quickly and confidently. With AI support, that signal becomes weaker. The difficult part is no longer producing a solution, but deciding whether the solution makes sense in the system it will live in.

AI can generate plausible answers almost instantly, but plausibility is not the same as fit. Deciding what belongs in a particular system still requires an understanding of constraints, history, and trade-offs.

These are the kinds of decisions that teams already deal with:

deciding where to reduce batch size
choosing what to automate first
balancing short-term speed with long-term stability
understanding system constraints and unintended consequences

These capabilities don’t disappear with AI. If anything, they become the main source of advantage.

Using AI without losing comprehension

Avoiding cognitive atrophy requires being intentional about how we integrate AI into daily work. Here are a few principles that I’ve found to be useful:

Start with the problem: Before using AI, articulate the problem yourself. Define constraints, risks, and desired outcomes. This mirrors understanding your value stream before optimising it. If you cannot explain the problem clearly, the solution will not be trustworthy.

Treat AI as an accelerator: Let it draft, explore, and suggest. Keep ownership of structure, decisions, and logic. If you can’t explain the solution without the tool, you don’t truly own it.

Optimise for learning: In DevOps we know that speed without feedback creates instability. The same applies cognitively. If throughput rises while learning falls, we are accumulating technical and human debt.

Capability and comprehension

AI will unquestionably increase delivery capacity. But faster output does not automatically translate into stronger capability. A team can move quickly while its understanding of the system gradually becomes thinner.

Adopting the tool is easy. Preserving and growing the capability to understand the result is harder, and that is where the real advantage lies.

The Pull Request Trap

2026-02-19T00:00:00+00:00

Most developers recognise the “waiting room” of software development. The code is written, the tests are green, and the change is ready. Then you hit git push, open a Pull Request (PR), and the work stops moving. The wait begins. For many developers, the time spent waiting for reviews exceeds the time spent writing the change.

In many teams, PRs have become the standard way of working. They are introduced to improve quality, but they also introduce waiting and hand-offs that slow integration. Over time, the workflow itself becomes the constraint. If you are trying to pass The Claes Test, especially Question 11 (CI/CD) and Question 7 (Collaboration), the pull request process may be what is holding you back.

The Hidden Cost of the “Gatekeeper”

The PR model introduces several costs that are easy to overlook but directly conflict with a high-throughput culture.

Wait time

This is pure queue time. In a PR workflow, the gap between “code complete” and “code integrated” stretches into hours or days. In a healthy TBD system, it is measured in seconds, or at least minutes.

Context switching

Developers don’t sit idle. They start something new while waiting. When the feedback finally arrives, they must drop their current work to revisit old code. Their flow is broken repeatedly; it’s a real “flow killer.”

Batching

Because the “transaction cost” of opening a PR is high, developers tend to batch more changes into a single PR. Larger batches are harder to review, harder to test, and riskier to deploy.

A false sense of safety

Many reviews end with a quick “LGTM” under time pressure. The gate exists, but it is ceremonial and the signal is weak. With non-blocking reviews, the code is already live, which raises the bar for real understanding rather than rubber-stamping.

The CI efficiency paradox: “Do we really test every commit?”

A common objection to true TBD is the perceived cost of running full CI on every small change. In practice, small increments are far more efficient than large batches.

Human time versus compute time

A developer’s hour costs far more than a CI runner’s hour. Pausing flow to save compute is almost always the wrong trade-off.

Continuous verification

When CI runs in parallel with development, feedback arrives within minutes. Small changes are confirmed before the next task even begins.

Reduced blast radius

When a ten-line commit fails, the fix is immediate and obvious. When a thousand-line PR fails, investigation becomes slow and uncertain.

Incremental testing

Modern pipelines can scope tests to the parts of the system that changed. Small commits allow feedback to stay fast and focused.
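As a rough illustration of test scoping, here is a minimal Python sketch. The directory names, the `SUITES_BY_DIR` mapping, and the `select_suites` function are all hypothetical and not taken from any particular CI system; real pipelines usually derive this information from the build graph rather than a hand-written table.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from top-level source directories to test suites.
SUITES_BY_DIR = {
    "billing": {"tests/billing"},
    "auth": {"tests/auth"},
    "shared": {"tests/billing", "tests/auth"},  # shared code affects both
}

ALL_SUITES = set().union(*SUITES_BY_DIR.values())


def select_suites(changed_files):
    """Return the test suites to run for a list of changed file paths."""
    selected = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        top = parts[0] if parts else ""
        if top not in SUITES_BY_DIR:
            # Unknown area: be conservative and run everything.
            return set(ALL_SUITES)
        selected |= SUITES_BY_DIR[top]
    return selected


print(sorted(select_suites(["billing/invoice.py"])))
```

The fallback to the full suite is a deliberate choice in this sketch: a file the mapping does not recognise is treated as a reason to test more, not less.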

Why gates do not automatically create quality

The industry often assumes that more approval steps lead to safer software, but DORA research consistently shows the opposite.

High-performing teams favour lightweight, fast feedback and rapid integration. Heavy approval processes correlate with longer lead times and lower deployment frequency, without improvements in stability.

Metric	PR-centric flow	Non-blocking flow
Lead time	High	Low
Context switching	Frequent	Minimal
Batch size	Large	Small
Primary quality signal	Asynchronous review	Automated checks + fast correction
DORA performance	Lower	Higher

Quality comes from fast feedback and rapid correction, not from waiting.

A practical alternative: pairing and non-blocking reviews

In tbdflow, the workflow shifts away from PRs as gates and toward continuous integration by default.

Pair or mob programming

Real-time review happens as the code is written. The four-eyes principle is built into the work itself.

Non-blocking reviews

Changes integrate immediately. CI starts at once. Review happens in parallel. If an issue is found, the team fixes forward.

Atomic commits

When integration is effortless, changes stay small. Small commits are easier to understand, easier to test, and easier to correct.

Summary: Throughput is a Safety Feature

Pull Requests are often treated as a safety mechanism. The assumption is that stopping changes before integration reduces risk.

In practice, safety comes from fast feedback and the ability to correct problems quickly. Systems that integrate continuously tend to detect issues earlier and recover faster.

Removing the PR trap does not remove quality control. It shifts quality into the daily work: small changes, fast feedback, and shared responsibility for the trunk.

That shift is less about tooling and more about how teams work together.

Locality

2026-02-09T00:00:00+00:00

In The Claes Test, I ask a critical question about collaboration: Do boundaries get out of the way so teams can solve problems together? (Question 7).

In many organisations, this box is left unticked. Progress depends on hand-offs, approvals and waiting for other teams, and what is described as “collaboration” often turns into a series of blockers. Autonomy is confused with working in isolation.

To move this from a blank to a tick, we need to understand the concept of Locality.

The Standardisation Paradox

A common pushback to Locality is the fear of chaos. If every team has total Locality, won’t they all pick different databases, different auth providers, and different UI buttons? Wouldn’t that hurt the customer experience?

This is where we must distinguish between Locality and Isolation. Locality isn’t silofication; it is about having the authority, capability, and expertise to satisfy customer needs without being blocked by external dependencies.

I like to use an IKEA analogy here: if you are installing a kitchen, you don’t call a specialist to drill every single hole. You have the tools and the pre-drilled boards to do it yourself. That is Locality. The platform (IKEA) provides the “FIXA” toolset and the pre-measured units (the standards), but you maintain the authority to hang the cabinet yourself.

1. Locality of Authority vs. Locality of Toil

A common friction point is the belief that a team must build its own infrastructure from scratch to be “autonomous”. That isn’t Locality; that’s toil.

True Locality means decision-making stays with the team. A platform engineering team provides “guided paths”: automated, self-service workflows that amplify Locality by removing low-value plumbing work.

This maps directly to Question 11 (CI/CD). Locality allows changes to be deployed safely and frequently without waiting for manual hand-offs from an external operations or release team.

2. Developer Independence as the Metric

Locality exists when a team can own the full application lifecycle without relying on another team to carry out essential work.

If a team has to wait for a central function to provision a database or approve a firewall rule, Locality is broken. The boundary has become a blockade.

In a platform-enabled environment, the platform team treats developers as customers, building self-service products that keep teams in the driver’s seat. The platform should also provide Decision Support, as not every team is equally strong at engineering and architecture. This is where standardisation helps: the platform provides a standardised, secure, and performant configuration for each building block (compute, storage, database, monitoring, etc.) by default. It helps teams move faster by making “expensive mistakes” harder to commit.

3. Locality through Decoupling

You cannot achieve Locality in a “distributed monolith” where every minor change requires a coordination meeting across multiple teams. Locality is technically supported by a loosely coupled architecture.

This ensures the team closest to the service has everything they need: logs, metrics and diagnostics to identify and resolve issues independently (Question 9: Production Readiness). If you have to ask another team to access your own production data, you don’t have Locality.

Fixing the Environment

As I note in the Claes Test, behaviour is a function of both the person and their environment: \(B = f(P, E)\).

The question “boundaries don’t block progress” isn’t about collaborating harder; it’s about fixing the environment.

By building platforms that enable Locality, we remove the structural friction that prevents teams from being truly autonomous and collaborative.

Locality is about ownership. Platform engineering is what makes that ownership scalable.

Interestingly, Locality is the first of the Five Ideals of DevOps. It is not a “nice to have”; it is the structural foundation that enables flow, continuous improvement, psychological safety and real customer focus. The Claes Test is, in many ways, a practical way of measuring whether Locality actually exists inside an organisation.

If boundaries are blocking progress in your organisation, the Claes Test can help you pinpoint where Locality is breaking down.

Take The Claes Test

The Claes Test

2026-02-06T00:00:00+00:00

In a previous post, I talked about Kurt Lewin’s equation: \(B = f(P, E)\).

Behaviour is a function of the Person and their Environment.

When a team is struggling with missed “deadlines”, burnout, or shipping bugs, management usually looks at the Person. They hire “rockstars” or mandate “accountability”. They try to change the \(P\).

But changing the \(P\) is an exercise in futility. The \(E\) (the Environment) is the lever we actually control.

In 2000, Joel Spolsky gave us the Joel Test. It was a brilliant, 12-question metric for the engineering environment. But back then, the environment was mostly about tools: Do you use source control? Do you have a bug database?

Today, the tools are table stakes. Our modern bottlenecks are no longer technical; they are cultural and systemic.

I’ve spent some time thinking about what the Joel Test looks like if we apply Lewin’s Equation to 2025. I call it The Claes Test, or more formally, The Developer Culture Test.

Claes, as in /klaːs/

Beyond Intentions

Most companies “value” psychological safety. Most companies “believe” in growth. But values and beliefs aren’t tangible or directly visible. Behaviour is the real litmus test.

The Claes Test doesn’t ask what you believe. It asks what you do.

It’s divided into four environmental pillars. If you can’t answer “Yes” and provide evidence for these 18 points, your \(E\) is likely working against your \(B\).

1. The Foundation (The Safety Net)

If the environment is built on fear or financial instability, you won’t get innovation. You’ll get self-preservation.

Psychological Safety: Are post-mortems blameless? Can anyone raise a “stop the line” concern?
Market-Fair Pay: Is compensation transparently benchmarked and regularly updated?
True Flexibility: Is work measured by outcomes, or by “active” status on Slack?

2. Clarity & Alignment

Autonomy without clarity is a recipe for chaos.

The “Why”: Can every engineer explain the customer impact of their current sprint?
Visible Roadmap: Is the backlog prioritised, visible, and linked to business goals?
Direct Communication: Is there a documented process (like RFCs or ADRs) for technical dissent?
Cross-Functional Unity: Do Dev, Ops, and Product solve problems together, or “pass the ticket”?
Recognised Initiative: Is work that improves the “commons” (refactoring, tooling) rewarded?

3. Sustainable Engineering

This is where we measure the “friction” in your environment. High friction = low throughput.

Operational Readiness: Is there a mandatory “Definition of Done” for production safety?
Quality Gates: Are code reviews and automated testing non-negotiable?
Continuous Delivery: Can you deploy safely multiple times a day?
Humane On-Call: Is the rotation compensated and followed by a learning review?
InnerSource: Is the “silo” gone? Can any team contribute to any codebase?

4. Growth & Progression

If the environment doesn’t offer a path forward, the best people will find one that does.

Technical Management: Do leaders have the depth to be credible partners to the team?
Transparent Ladder: Are promotion criteria objective and public?
Parallel Tracks: Can a Principal Engineer earn/influence as much as a Head of Engineering?
Feedback Loops: Is feedback regular, peer-to-peer, and focused on growth?
Investment: Is there a dedicated budget and time for professional development?

How to Use the Test

The Joel Test was binary. You either did it or you didn’t.

The Claes Test is a mirror.

If you want to understand why your team’s behaviour isn’t meeting expectations, stop looking at the people. Run this test and be honest.

A “yes” requires evidence. “We try to do this” is a “no.”

When you find the “no”s, you’ve found the friction in your environment. Fix the environment, and the behaviour will follow.
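The evidence rule can be made concrete with a small Python sketch. The `Answer` record, the `friction_points` function, and the sample answers are hypothetical and only illustrate the rule that a claimed “yes” without evidence counts as a “no”; this is not an official scoring tool for the test.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    question: str       # e.g. "Continuous Delivery"
    says_yes: bool      # what the team claims
    evidence: str = ""  # note or link backing the claim; empty means none


def friction_points(answers):
    """Return the questions that count as "no": not claimed, or claimed without evidence."""
    return [a.question for a in answers if not (a.says_yes and a.evidence.strip())]


# Hypothetical answers for three of the 18 questions.
answers = [
    Answer("Continuous Delivery", True, "deploy log shows several releases per day"),
    Answer("Psychological Safety", True),  # "we try to do this", no evidence
    Answer("Transparent Ladder", False),
]

print(friction_points(answers))
# prints ['Psychological Safety', 'Transparent Ladder']
```

The returned list is the starting point for the conversation: each entry names a part of the environment to fix.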

If you want a practical way to do that, run the Claes Test with your team.

It’s designed to surface where your culture is helping, and where it’s holding you back.

Take The Claes Test

