Rebuilding GNU ls in Koka
06 May 2026
In a previous post, I introduced Koka and why I started porting ls to it. But I never wrote about the actual backstory.
I stumbled onto Koka in my GitHub feed and it immediately got me hooked. This sentence alone is golden – “Koka is a strongly typed functional-style language with effect types and handlers that transpiles to C11”.
Right around then, I saw a LinkedIn post where someone had built a parallel ls in modern C++ and tagged John Crickett, which led me to his coding challenges.
If you really want to learn a language, you should be hands-on and do exercises. John has listed several fun and challenging ones, but ls wasn’t on the list, so that was the one I picked 🙂
The Ambition
The goal: Rebuild GNU ls in Koka. 100% byte-for-byte compatible output.
That sounded like “a weekend project, maybe two” until I opened ls.c… The quick command I run a hundred times a day is a 5,000-line beast of C, with 83 CLI flags and decades of accumulated edge cases. It’s a masterpiece of over-engineering, and it’s rock solid.
Getting Into the Guts
I’m working against the 9.10 codebase, and the architecture is a four-stage gauntlet: Setup → Parsing → Execution → Output.
The setup alone is a 1,000-line jungle of globals and structs. decode_switches(), the monster that parses options, is nearly 600 lines long!
I have written a few notes for anyone who wants to learn more about the internals, along with tips on how to use GNU ls (and my version, of course):
- Decoded GNU ls – The architecture walkthrough.
- Column Layout – The math behind -C.
- Exit Codes – The three states of “oops.”
- Usage Guide – Practical tips for GNU ls.
- Recursive Listing – How -R traverses the tree.
The official docs and the man page are fine for users, but studying the source code is the only way to see how it really works.
The Roadmap
I drafted a plan of 6 phases, plus the infrastructure needed around them: a testing framework and proper CI using GitHub Actions.
- Phase 1 — Foundation
- Phase 2 — Which files are listed
- Phase 3 — What information is listed
- Phase 4 — Sorting the output
- Phase 5 — General output formatting
- Phase 6 — Formatting the file names
To keep myself in check, I have built a test framework (klap) that diffs my output against the original GNU binary. CI fails immediately if I’m out of line…
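To make the idea concrete, here is a minimal sketch of that kind of byte-for-byte check. It is written in Python purely for illustration and is not how klap works internally. The function names (first_difference, run, same_behaviour) are made up for this example.

```python
import subprocess
from typing import Optional, Sequence, Tuple


def first_difference(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return offset
    if len(expected) != len(actual):
        # One output is a strict prefix of the other.
        return min(len(expected), len(actual))
    return None


def run(cmd: Sequence[str]) -> Tuple[int, bytes]:
    """Run a command and capture its exit code and raw stdout bytes."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    return proc.returncode, proc.stdout


def same_behaviour(reference: Sequence[str], candidate: Sequence[str]) -> bool:
    """True when both commands agree on exit code and stdout, byte for byte."""
    ref_code, ref_out = run(reference)
    cand_code, cand_out = run(candidate)
    return ref_code == cand_code and first_difference(ref_out, cand_out) is None
```

A call could look like same_behaviour(["ls", "-C", "/tmp"], ["./my-ls", "-C", "/tmp"]), where ./my-ls is a hypothetical path to the port. Comparing raw bytes rather than decoded text matters here, because trailing spaces and newlines are exactly the kind of thing a "close enough" comparison would miss.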
Why Koka?
ls is actually a perfect stress test for Koka’s unique features:
- The Effect System: ls is a mix of pure logic (sorting/formatting) and messy side effects (disk I/O, stat() calls). Koka’s effects make that boundary visible in the types.
- Data Modelling: File types, sort modes, format styles, indicator kinds. The enums in ls.c map naturally to Koka’s algebraic types.
- Perceus: This is Koka’s secret weapon. It’s a reference-counting system that allows functional code to perform like C.
- The FFI: Koka’s standard library is still growing, so I’ve had to write C shims for things like symlink detection. The FFI (Foreign Function Interface) has been surprisingly painless and fun.
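The first two points can be sketched roughly as follows. This is Python standing in for Koka, so the pure/impure split is only a convention in the docstrings, whereas Koka would record it in the function types. FileType, classify, indicator and list_names are names invented for this illustration and come from neither ls.c nor my port. The suffix characters are the ones ls -F uses.

```python
import os
import stat
from enum import Enum
from typing import List


class FileType(Enum):
    REGULAR = "regular"
    DIRECTORY = "directory"
    SYMLINK = "symlink"
    FIFO = "fifo"
    SOCKET = "socket"
    OTHER = "other"


def classify(mode: int) -> FileType:
    """Pure: map an st_mode value to a file type, no I/O involved."""
    if stat.S_ISDIR(mode):
        return FileType.DIRECTORY
    if stat.S_ISLNK(mode):
        return FileType.SYMLINK
    if stat.S_ISFIFO(mode):
        return FileType.FIFO
    if stat.S_ISSOCK(mode):
        return FileType.SOCKET
    if stat.S_ISREG(mode):
        return FileType.REGULAR
    return FileType.OTHER


def indicator(mode: int) -> str:
    """Pure: the suffix that `ls -F` appends to a file name."""
    kind = classify(mode)
    suffixes = {
        FileType.DIRECTORY: "/",
        FileType.SYMLINK: "@",
        FileType.FIFO: "|",
        FileType.SOCKET: "=",
    }
    if kind in suffixes:
        return suffixes[kind]
    executable = mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    if kind is FileType.REGULAR and executable:
        return "*"
    return ""


def list_names(path: str) -> List[str]:
    """Impure: touches the disk, then hands the results to the pure helpers.

    Simplified on purpose: no locale-aware sorting and no dotfile filtering.
    """
    names = sorted(os.listdir(path))
    return [n + indicator(os.lstat(os.path.join(path, n)).st_mode) for n in names]
```

Only list_names touches the file system, so everything else can be tested with plain integers instead of real files.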
Learning in Public
I’ll be the first to admit: I’m a decent but not pro programmer and a total Koka novice. GNU ls is not “beginner friendly” territory. The C is dense and Koka’s documentation is still thin compared to that of mainstream languages.
I’ve spent a lot of time “rubber-ducking” with a Genie to get through the nuts and bolts, figuring out why stat() and lstat() treat symlinks differently, or how -l is supposed to silently override -C.
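For the stat() versus lstat() question, a small experiment shows the difference. The sketch below uses Python’s os.stat and os.lstat, which wrap the C calls of the same names. The helper names describe and demo are made up for this example.

```python
import os
import stat
import tempfile


def describe(path: str) -> dict:
    """Compare what lstat() and stat() report for the same path."""
    link_info = os.lstat(path)   # examines the path itself, never follows a final symlink
    target_info = os.stat(path)  # follows symlinks and examines the target
    return {
        "lstat_is_symlink": stat.S_ISLNK(link_info.st_mode),
        "stat_is_symlink": stat.S_ISLNK(target_info.st_mode),
        "same_inode": link_info.st_ino == target_info.st_ino,
    }


def demo() -> dict:
    """Create a file and a symlink to it in a temp dir, then describe the link."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link")
        with open(target, "w") as handle:
            handle.write("hello\n")
        os.symlink(target, link)
        return describe(link)
```

For the link, lstat() reports a symlink while stat() reports the regular file behind it, with a different inode. On a dangling symlink os.stat raises FileNotFoundError, while os.lstat still succeeds.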
This project doesn’t have a delivery date. It’s just me, a 40-year-old C program, and a language that really got me hooked. It’s been a blast.
I’m not going to rebuild all of Coreutils, no way. ls is more than enough. But the process has triggered a different curiosity: I want to build a small language that transpiles to Koka. 🤓
Follow along at koka-labs.