Using Claude Code to modernize a 25-year-old kernel driver

As a bit of background, one of my hobbies is helping people recover data from old tape cartridges, such as QIC-80 tapes, which were a rather popular backup medium in the 1990s among individuals, small businesses, BBS operators, and the like. I have a soft spot for tape media; there’s something about the tactile sensation of holding these tapes in my hands that makes the whole process very joyful, even though QIC tapes are notorious for their many design flaws. With some careful inspection and reconditioning, the data on these tapes is still totally recoverable, even after all these years.

Whenever I receive a QIC-80 tape for recovery, I power up one of my older PC workstations that has the appropriate tape drive attached, and boot into a very old version of Linux (namely CentOS 3.5), because that's the only way to use ftape: the kernel driver needed to communicate with this type of tape drive and dump the binary contents of the tape.

You see, the drive that reads these tapes connects to the floppy controller on the motherboard. This clever hack was done as a cost-saving measure: instead of having to purchase a separate SCSI adapter (the standard interface for higher-tier tape media), you can just connect this tape drive to your floppy controller, which was already available on most PCs. It can even work alongside your existing floppy drive, on the same ribbon cable! The tradeoff, of course, is that the data rate is limited by the speed of the floppy controller, which was something like 500 Kbps (that’s kilobits, not bytes).

The other downside is that the protocol for communicating with these tape drives through the floppy controller was very messy, nonstandard, and not very well-supported. It was a “hack” in every sense: your motherboard’s BIOS had no knowledge of the tape drive being connected, and it was entirely up to the end-user software to know exactly how to manipulate the hardware I/O ports, timings, interrupts, etc. to trick the floppy controller into sending the appropriate commands to the tape drive.

image

There were a small number of proprietary tools for MS-DOS and Windows 3.x/9x for dealing with these drives, and only one open-source implementation for Linux, namely ftape. It's certainly possible to use those original DOS/Windows tools to read the tapes, but only ftape allows us to read the "raw" binary contents of the tape, regardless of which proprietary software originally wrote it. That's why I prefer to use it to dump the tape's contents first, and worry afterwards about decoding the proprietary logical formatting and extracting the files.

The trouble is, the ftape driver hasn't been maintained since roughly the year 2000, and was eventually removed from the Linux kernel for that reason. This is why I've needed to run a painfully old version of Linux anytime I have to work with one of these drives. It would be great if ftape worked on a modern distro, with all the benefits and affordances that would provide.

***

So a couple of weeks ago, it occurred to me to make a simple request to Claude Code:

> This repository is a Linux kernel driver that communicates with legacy tape drives connected to the floppy
  controller (FDC) on the motherboard. Unfortunately, this driver hasn't been maintained for a long time, and
  can only compile under kernel version 2.4. I'd like to modernize this driver, allowing it to be built with
  the latest versions of the kernel.

● I'll help you modernize this Linux kernel driver for legacy tape drives. This is a significant task that
  will require updating the code to work with modern kernel APIs and conventions.

And after several iterations of “combobulating” and whatever else Claude claims to do, I suddenly had a kernel driver that compiled without errors. This works because Claude can take the compiler output and feed it back into itself, iterating until the build succeeds. There was a laundry list of kernel functions and structures that were understandably deprecated or replaced in the eternity that elapsed between kernel version 2.4 and version 6.8. Amazingly, Claude found all the outdated bits and replaced them with the correct modern equivalents, requiring just a few manual touch-ups of the code (a bit more on this later).
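
To give a flavor of the kind of change involved, here is one well-known example of a 2.4-to-modern API migration (an illustrative sketch, not necessarily one of the specific edits Claude made in ftape): the old kernel timer interface, which drivers of that era leaned on heavily, has been replaced outright.

  /* Kernel 2.4 drivers used init_timer() and stashed a context pointer in
     timer.data as an unsigned long; that interface no longer exists.
     Since kernel 4.15, the equivalent looks like this ("my_dev" and
     "watchdog" are hypothetical names): */
  #include <linux/timer.h>
  #include <linux/jiffies.h>

  struct my_dev {
          struct timer_list watchdog;
  };

  static void my_timeout(struct timer_list *t)
  {
          /* from_timer() recovers the enclosing structure, replacing the
             old "unsigned long data" callback argument. */
          struct my_dev *dev = from_timer(dev, t, watchdog);
          (void)dev;      /* ... handle the timeout ... */
  }

  static void my_start_watchdog(struct my_dev *dev)
  {
          timer_setup(&dev->watchdog, my_timeout, 0);
          mod_timer(&dev->watchdog, jiffies + HZ);    /* fire in ~1 second */
  }

Multiply that by every timer, interrupt-handler signature, I/O-port reservation, and module-init convention that changed in the intervening two decades, and you get a sense of the volume of mechanical-but-fiddly work involved.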

At this point, however, the kernel driver was still expected to be compiled as part of a full kernel tree, whereas I just wanted it to become a standalone loadable kernel module. OK, no problem:

> Is there a way to compile just this module in-place, instead of copying it into a kernel source tree?

● Yes! You can compile kernel modules out-of-tree without copying them into the kernel source. Let me create
  a proper standalone build system for the ftape driver.

…And it did. Without any further prompting. At the end of this stage, I had a loadable kernel module (.ko) that I could start trying to use with actual live hardware. Unsurprisingly, the module did not yet “work” — it was loading, but something was preventing it from communicating with the tape drive correctly. But I pressed on.
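
(For anyone curious about the mechanics: building a module out-of-tree boils down to a tiny "kbuild" Makefile that points back at the headers of the kernel you're running. The sketch below is the standard recipe; the "ftape.o" object name is a placeholder, not necessarily how the modernized ftape tree is actually laid out.)

  # Minimal out-of-tree module Makefile (recipe lines must be indented with tabs)
  obj-m += ftape.o

  KDIR ?= /lib/modules/$(shell uname -r)/build

  all:
  	$(MAKE) -C $(KDIR) M=$(CURDIR) modules

  clean:
  	$(MAKE) -C $(KDIR) M=$(CURDIR) clean

Running make then drops an ftape.ko in the current directory, ready to be loaded with insmod.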

From this point forward, since loading/unloading kernel modules requires sudo, I could no longer let Claude “iterate” on such sensitive operations by itself. Luckily, the ftape driver was already generously sprinkled with logging messages (printk() calls that are printed to dmesg), so I loaded the module myself, and iteratively pasted the output of dmesg into Claude manually, and let Claude compare it to a “known-good” dmesg log that I’d previously saved from a successful tape read. After a few iterations of this, Claude identified the remaining issues preventing the module from communicating with the hardware.

● Perfect! The module has parameters that can be set. The issue is that without configuration, all the base
  addresses default to -1, which gets converted to 0xffff in the config function, causing the detection to fail.
● The "No such device or address" (ENXIO) error occurs because the ftape-internal module is not configured
  with the correct I/O port base address for your floppy controller.
  The issue is in fdc-internal.c:735-737: if fdc->sra == 0xffff, it returns -ENXIO. Since no configuration was
  provided, all values default to -1, which becomes 0xffff....
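
In other words, the failure boiled down to something like the following (a rough sketch; the parameter, struct, and function names here are illustrative assumptions, not the driver's exact code):

  #include <linux/module.h>
  #include <linux/types.h>
  #include <linux/errno.h>

  static int ft_fdc_base = -1;                  /* unset by default */
  module_param(ft_fdc_base, int, 0444);
  MODULE_PARM_DESC(ft_fdc_base, "I/O base address of the floppy controller");

  struct fdc_info {
          u16 sra;        /* status register A, i.e. the controller's base port */
  };

  static int fdc_probe(struct fdc_info *fdc)
  {
          fdc->sra = (u16)ft_fdc_base;          /* -1 truncates to 0xffff here */
          if (fdc->sra == 0xffff)
                  return -ENXIO;                /* "No such device or address" */
          return 0;
  }

So the fix was simply to pass the controller's real base address as a module parameter when loading the module, instead of relying on the defaults.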

And at last, I had a kernel module that was able to a) load, b) detect the tape drive, and c) dump the contents of a test tape!

image

***

Getting the ftape driver working on a modern kernel, a feat that I thought was hopelessly out of reach, was suddenly accomplished over the course of two evenings.

As a giant caveat, I should note that I have a small bit of prior experience working with kernel modules, and a good amount of experience with C in general, so I don’t want to overstate Claude’s success in this scenario. As in, it wasn’t literally three prompts to get Claude to poop out a working kernel module, but rather several back-and-forth conversations and, yes, several manual fixups of the code. It would absolutely not be possible to perform this modernization without a baseline knowledge of the internals of a kernel module.

This led me to crystallize some thoughts on working with such coding agents in our current moment:

Open yourself up to a genuine collaboration with these tools.

Interacting with Claude Code felt like an actual collaboration with a fellow engineer. People like to compare it to working with a “junior” engineer, and I think that’s broadly accurate: it will do whatever you tell it to do, it’s eager to please, it’s overconfident, it’s quick to apologize and praise you for being “absolutely right” when you point out a mistake it made, and so on. Because of this, you (the human) are still the one who must provide the guardrails, make product decisions, enforce architectural guidelines, and spot potential problems as early as possible.

Be as specific as possible, making sure to use the domain-specific keywords for the task.

I’m not claiming to suddenly be an expert in prompt engineering, but the prompts that I’ve found to be most successful are ones that clearly lay out the verbal scaffolding for a feature, and then describe the gaps in the scaffolding that the LLM should fill in. (For some reason the image that comes to mind is one of those biological stem-cell scaffolds where an artificial human ear will grow.)

Develop an intuition for the kinds of tasks that are “well-suited” for an agent to complete.

These agents are not magical, and can’t do literally everything you ask. If you ask one to do something for which it’s not well suited, you will become frustrated and prematurely reject these tools before you allow them to shine. On this point, it’s useful to learn how LLMs actually work, so that you develop a sense of their strengths and weaknesses.

Use these tools as a massive force multiplier of your own skills.

I’m sure that if I really wanted to, I could have done this modernization effort on my own. But that would have required me to learn kernel development as it was done 25 years ago, which would probably have taken several weeks of nonstop poring over documentation, to acquire knowledge that would be largely useless today. Instead of all that, I spent a couple of days chatting with an agent and having it explain to me everything it did.

Naturally, I verified and tested the changes it made, and in the process I ended up learning a huge number of things that will actually be useful to me in the future: modern kernel conventions, some interesting details of the x86 architecture, and several command-line incantations that I’ll be keeping in my arsenal.

Use these tools for rapid onboarding onto new frameworks.

I am not a kernel developer by any stretch, but this particular experience ignited a spark that might lead to more kernel-level work, and it turns out that kernel development isn’t nearly as difficult as it might sound. In another unrelated “vibe-coding” session, I built a Flutter app without having used Flutter before. If you’re like me, and your learning style is to learn by doing, these tools can radically accelerate your pace of learning new frameworks, freeing you up to do more high-level architectural thinking.

***

In any case, circling all the way back, I am now happy to say that ftape lives on! Twenty-five years after its last official release, it is once again buildable and usable on modern Linux. I’m still in the process of making some further tweaks and new feature additions, but I have already verified that it works with the floppy-based tape drives in my collection, as well as with the parallel-port-based drives that the driver also supports.

image

The physical setup looks very similar, but the OS is now Xubuntu 24.04, instead of CentOS 3.5! 🎉
Until next time!

DiskDigger + Avalonia UI: a success story

Up until this point, DiskDigger has been built for the .NET Framework, with a user interface that uses Windows Forms. I’ve written before about my feelings on WinForms — that it’s a perfectly good, tried-and-true technology which, as ancient as it is, seems to have outlasted numerous other UI toolkits, for the simple reason that it just works.

image

However, it has bothered me for a long time that DiskDigger is not as cross-platform as I would like it to be. Sure, there was the excellent Mono project, with its independent implementation of Windows Forms that bridged the gap somewhat, and allowed DiskDigger to run on Linux, and perhaps even on macOS (albeit only on older 32-bit versions).

image

But recently, I decided to roll up my sleeves and search in earnest for a solution that would make DiskDigger truly cross-platform. I had the following rough requirements in mind when evaluating potential frameworks:

  • The framework should be built on .NET, since I’d like to reuse all of the business logic of DiskDigger, which is written in C#.
  • The toolkit of UI components should allow me to match the existing UI of DiskDigger without too much hassle.
  • The final output should ideally be a single, self-contained executable with no dependencies. There should not be any need for the user to “install” or “uninstall” anything. Installing should simply involve downloading and running the executable, and uninstalling should consist of deleting the executable when no longer needed.

The fact is, I recall going through a similar investigation of cross-platform solutions years ago, but none of the frameworks I could find at the time seemed mature enough for me to commit to, so I kept putting it off, for an unpardonably long time as it turns out. But better late than never.

Avalonia immediately jumped out as a strong contender. It is truly cross-platform, in the sense that you “ship the platform” along with your executable. This necessarily means that it will increase the size of the final executable, but I can deal with a moderate amount of bloat, as long as the end result is not like the monstrosities built with something like Electron, which need to ship the entirety of Chromium as their runtime, and make the final product into a 200MB behemoth. On this dimension, Avalonia performs relatively well: the final self-contained executable is about 60MB, and actually compresses nicely to a 30MB zip file for distribution.

.NET itself has also made its own strides in being able to bundle your app into a single executable, with a single command:

dotnet publish -c Release -r linux-x64 --sc -p:PublishSingleFile=true

…where the relevant parameters are --sc for “self-contained”, and the self-explanatory PublishSingleFile=true.

In terms of building your user interface, Avalonia seems to be a spiritual successor to WPF, and I’m embarrassed to say that I’ve never actually used WPF, either professionally or personally; it was an entire era of Windows UI development that I simply skipped over. Because of this, I was a bit worried about the learning curve I’d have to endure to jump from Windows Forms directly to Avalonia. But my fears were unfounded: it didn’t take long at all for everything to “click”, because Avalonia encourages and expects you to use good architectural patterns like view models and data bindings, which I basically already had in place.

But here was the most pleasant surprise of all:
In my day-to-day work, I switch between my main workstation that runs Windows, and my MacBook Pro, and I’m able to work on the same projects for Android and the web on both machines. So I wondered how easy it would be to keep developing DiskDigger with Avalonia on my MacBook, instead of always having to develop it on my Windows PC.

I downloaded Rider (the JetBrains IDE for working with .NET) on my MacBook, installed the Avalonia plugin, and opened my Visual Studio solution. And to my amazement, it just worked! The UI designer, the code completion, everything worked flawlessly, dare I say even better than Visual Studio itself. I was able to keep developing the Visual Studio solution, unmodified, on my Mac. At this point I was convinced that this was the right direction to go in.

image

After plenty of help from the Avalonia documentation, and a little further help from Claude, I proceeded to rebuild one screen after another, until finally I had an MVP of the whole thing; the process took around four weeks. And at long last, may I present an experimental (but fully functional and complete!) version of DiskDigger, built with Avalonia UI, which runs not only on Windows, but also on Linux and macOS!

image

Once again, this is just a Beta version so far, but it is perfectly usable on any platform, and I encourage you to try it and let me know your feedback. For now, I will continue development of this new Avalonia-based edition of DiskDigger in parallel with the existing WinForms-based version, which will still be the “recommended” version for Windows. But in the longer term, I can absolutely envision focusing solely on the Avalonia version, and letting the WinForms version ride into the sunset.

Software update roundup, February 2025

Time to mention some great updates in the newest versions of DiskDigger, as well as its cousin FileSystemAnalyzer, which is my “internal” tool for tinkering with various file systems.

ZFS

I’ve been embarking on a bit of self-study of the ZFS file system, specifically its on-disk structures, with an eye toward forensic analysis and data recovery opportunities. DiskDigger (and FileSystemAnalyzer) now supports ZFS partitions on physical disks and disk images. At the moment it can only parse the portion of a ZFS pool that resides on the current disk, i.e. it does not yet fully support pools that span multiple disks, but this will be updated soon.

FileSystemAnalyzer now lets you visualize and parse a ZFS partition in a couple of unique ways. First, you can select the uberblock from which to start parsing the file system. ZFS uses a round-robin list of uberblocks, where every new “transaction group” causes a new uberblock to be written or updated, and the uberblock with the highest transaction group number is the “active” one. This implies that “older” uberblocks could potentially point to file system structures that have since been deleted, or are forensically interesting in other ways. FileSystemAnalyzer presents the list of potentially parseable uberblocks as separate “partitions”, which you can select:

image

I have definitely observed cases where deleting files causes a new “transaction group” to be recorded, which creates a new uberblock; and then parsing from the previous uberblock allows the deleted files to be seen.
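
To make that concrete, selecting the “active” uberblock essentially means scanning the array of uberblock slots in each vdev label for the highest transaction group number. Here is a simplified sketch (the field names follow the published ZFS on-disk format, but checksum verification and endianness handling are omitted):

  #include <stdint.h>
  #include <stddef.h>

  #define UB_MAGIC 0x00bab10cULL          /* uberblock magic ("oo-ba-bloc") */

  struct uberblock {
          uint64_t ub_magic;
          uint64_t ub_version;
          uint64_t ub_txg;                /* transaction group of last sync */
          uint64_t ub_guid_sum;
          uint64_t ub_timestamp;
          /* ...followed by ub_rootbp, the block pointer to the object tree */
  };

  /* Simplified selection: pick the newest valid uberblock. Any older slot
     may still describe a consistent (but stale) view of the file system,
     which is exactly what makes it forensically interesting. */
  static const struct uberblock *
  pick_active_uberblock(const struct uberblock *slots, size_t nslots)
  {
          const struct uberblock *best = NULL;
          for (size_t i = 0; i < nslots; i++) {
                  if (slots[i].ub_magic != UB_MAGIC)
                          continue;       /* empty or opposite-endian slot */
                  if (best == NULL || slots[i].ub_txg > best->ub_txg)
                          best = &slots[i];
          }
          return best;
  }

FileSystemAnalyzer effectively exposes every slot that passes this kind of validity check, rather than only the winner.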

Then when viewing the actual files in the ZFS partition, FileSystemAnalyzer lets you browse the raw “object list”, which is a flat list of objects that represent all the files and directories organized in the file system tree, but might also include items that are not present in the tree.

image

As an aside, although the design of ZFS is mostly very sound, I’m a bit taken aback by the complexity of certain portions of it. For example, ZFS uses at least four different ways of storing key-value pairs, each for a different purpose:

  • nvlist, for storing header metadata at the beginning of the file system.
  • SA (system attributes), for storing attributes for files and directories. Because ZFS is designed to be maximally versatile and cross-platform, the types of attributes associated with files and folders can be defined dynamically in a system-attribute registry stored in the metadata of the filesystem.
  • ZAP: the main storage mechanism by which the actual file system (files, directories, symlinks, etc.) is structured, achieving a B-tree-like layout. However (!) ZAP itself comes in two different forms:
    • Microzap: used when there are few enough entries; the structure becomes completely different and more compact.
    • Fatzap: The actual, full-fledged mechanism for storing the file system tree.

UFS / UFS2 / Minix / Ultrix / Xenix

…Really, all Unix-like file systems. Or, I should say, all inode-based file systems. DiskDigger and FileSystemAnalyzer now have expanded support for more obscure and legacy Unix-like file systems, like Ultrix and Xenix, some of which have different versions that are mutually incompatible. If you have disk images of these types of very old operating systems, send them to me! I’m always looking for obscure stuff to test with.
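
As a concrete illustration of what “inode-based” means, here is the classic Minix (v1) on-disk inode, essentially as declared in the Linux kernel’s minix_fs.h. Most of these legacy Unix-like file systems are variations on the same idea of a fixed-size record holding the mode, size, and a short table of block pointers; they differ mainly in field sizes, ordering, and endianness, which is what makes them mutually incompatible yet tractable to support:

  #include <stdint.h>

  /* Classic Minix v1 on-disk inode: 32 bytes per file. */
  struct minix_inode {
          uint16_t i_mode;        /* file type and permission bits */
          uint16_t i_uid;
          uint32_t i_size;        /* file size in bytes */
          uint32_t i_time;        /* single timestamp */
          uint8_t  i_gid;
          uint8_t  i_nlinks;
          uint16_t i_zone[9];     /* 7 direct blocks, 1 indirect, 1 double-indirect */
  };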

Pick R83

The Pick Operating System, created by a guy whose actual name was Dick Pick, was a super interesting blip in computing history. Everything about this operating system is database-driven, including the file system, if you can call it that. The “account” that a user signs into is really a database, a “file” is really a table in the database, and then each file contains “attributes” and “records” that correspond to columns and rows. The Pick OS found its way into some niche markets, but obviously didn’t last against its larger competitors. It was, however, a bit ahead of its time with its database-centric design, and these ideas are arguably “coming back” in the form of NoSQL.

I’d like to add more meaningful support for Pick partitions in the future, but for now, you can in fact browse a Pick file system in a minimal way, and also see a list of raw on-disk “frames” that make up the Pick database:

image

Brain dump, October 2024

Macintosh PowerBook 100

Restored another vintage laptop! This time, the patient is a Macintosh PowerBook 100, which came from a dear friend of mine who allowed me to restore it after recovering the data from its hard drive. The PowerBook 100 was clearly intended to be a “lower-end” model (even though it still had a price tag of $2,500 when it was launched in 1991), with a very minimal design, cheaper-feeling plastic, small monochrome 640×400 LCD display, and no built-in floppy drive. The upshot is that the inexpensive no-nonsense construction allowed for a fairly easy restoration!

image

When I attempted to power up the laptop as-is, it just made a few crackling noises through the speaker, and not much else. Otherwise it appeared dead. Time to open it up!

The top and bottom halves of the plastic casing are held together by three screws (!), and once these are removed, the entire thing pops open effortlessly.

Looking closely at the humble motherboard, I see the potential culprit right away: failed capacitors that have leaked and corroded. Hopefully the extent of the corrosion is minimal and didn’t affect any of the chips or other components besides the capacitors themselves. I’m hoping this could be as easy as re-capping the board, i.e. replacing the capacitors.

image

I proceeded to remove all the capacitors that had the slightest indication of corrosion, that is, any capacitor whose solder joints didn’t look totally pristine and shiny. And after removing each one, I cleaned the surface of any residue with alcohol, and then installed a new capacitor with the same value. In all, I replaced 10 bad caps, all of which were a small surface-mount variety, and were either 10µF/16V or 1µF/50V. My new caps are a bit longer than the old ones, so I oriented them horizontally on the board:

image

And, after reconnecting the display and keyboard back onto the motherboard, let’s try applying power again:

image

Hey presto, it’s alive!
A bit more cleaning of dust under the keyboard, removing the gunk from inside the trackball mechanism, and a general wipe-down of the exterior, and we’re ready to reassemble!

image

And there we have it, a lovingly restored PowerBook 100, with 2 MB of RAM, running System 7.0.1. The only unusual thing about it is the hard drive, which is a whopping 1 GB! This was clearly an upgrade from whatever hard drive it had originally (probably something like 40 MB), which must have been installed many years after it was purchased. This implies that the user of this laptop got quite a lot of mileage out of it, probably well into the late 1990s or even 2000s, which makes me happy.

Finally, to round out this restoration, let’s remove this hard drive and replace it with a CompactFlash card, preloaded with tons of vintage games and apps for the Macintosh.

image

The hard drive interface on the laptop is technically SCSI, so the original hard drive must have been a SCSI drive. The newer 1GB drive is an IDE drive, and came with an adapter board that fits underneath the drive, which translates between SCSI and IDE. This is quite convenient, since we can now plug in a cheap IDE-to-CF adapter, with a generously large CF card that will become our new hard drive. At last, let the retro gaming commence.

image

MC-3020-Extra tapes!

Recovered data from several MC-3020-Extra and QIC-3020 tapes. These “Extra” tapes have the same front-facing “interface” as their smaller QIC-3020 cousin, except these are larger length-wise, allowing for larger spools inside the cartridge, and therefore a higher data capacity. Of course these cartridges have the same fatal flaw as all other QIC tapes, which is the flimsy tension belt inside the cartridge that drives the motion of the spools. This belt is virtually guaranteed to fail over a long enough time, and since these “Extra” cartridges have even more moving parts inside, they are even more prone to failure.

image

image

This batch of tapes was in particularly rough shape: the tension belt in each tape was broken, and had also adhered to the surface of the tape medium. This was likely because the tapes were stored under excessive heat or humidity, which causes the belt to break down and react with the tape itself. This required pretty extensive cleaning of all the gunk and pieces of belt that were stuck onto the tape.

Fortunately the tapes had been rewound properly, and the damaged portions of the tape were at a spot that was “beyond” the data area of the tape. After throwing on a fresh tension belt, and using one of my trusty Iomega Ditto drives (compatible with a wide range of this family of cartridges), I was able to dump and decode 100% of the data from them.

(As always, get in touch if you have any kind of vintage tapes or other media that you’d like recovered.)

Brain dump, September 2024

Custom IDE adapter for CP-4021 drive

As part of a special data recovery project, I needed to read the contents of the hard drive from an ancient Compaq LTE 8086 laptop. This hard disk, which is a Conner CP-4021, is quite an oddball. It has an unusual form factor: it’s a 3.5″ drive, but it’s slimmer and shorter than a regular drive, as if it was specifically made to fit into the LTE laptop.

image

Perhaps most annoyingly, the IDE connector on the drive is a very nonstandard half-pitch connector, where the pins have a horizontal pitch of 1.27mm, and a vertical pitch of 2.54mm. Inside the LTE laptop, this connects to a twin ribbon cable that goes directly to the motherboard, which basically means I have no way of connecting this drive to a “standard” modern IDE controller. (And it’s definitely an IDE drive, since the connector has 44 pins, just like any other laptop hard drive, and it has the same missing “key” pin as a standard IDE connector.)

Time to build a custom connector! After a whole lot of searching, I found a blank 50-pin header on AliExpress that should match the pitch of the drive’s connector. The 50-pin header will overshoot the 44-pin connector by a few pins, but it should still fit without issues. In an ultimate test of my fine motor skills, I soldered a spare 44-pin ribbon cable onto this 50-pin header, checking painstakingly that each pin on the female end matches the corresponding pin on the male end. To secure the delicate soldering onto the header, I covered it with clear epoxy, and let it harden overnight. And just like that, I have an adapter for connecting this ancient drive to a modern PC:

image

I then connected the drive to my workstation PC, and was encouraged when the drive spun up, and the PC detected the drive successfully! However, the PC didn’t seem to be able to read any actual data from the drive. After a good bit of head-scratching, and double-checking the continuity of all the pins of my adapter, and trying to connect the drive to a few other PCs, I had a last-ditch idea to rule out stuck heads or a locked spindle, which were rather common problems with older drives. I removed the top cover of the drive, exposing the heads and platter, and as the drive was spinning up, I gently turned the spindle manually away from its resting position. And wouldn’t you know it — this caused the spindle to come alive, and the drive became fully functional!

I was able to acquire an image of the drive in Linux with minimal effort (not a single bad sector!), and I’ll be keeping my fancy custom adapter in case I come across another drive like this in the future. I did a few more random Frankenstein experiments with the drive, including booting another vintage laptop from it:

image

…and connecting that laptop to another laptop via INTERLNK.EXE, which allows the C: drive on the “server” laptop to map as the D: drive on the “client” laptop, with the goal of transferring files from one to the other, or even dumping the entire partition, which I did with the SAVEPART tool.

image

image

The connection between the two laptops is a serial cable that I cobbled together from whatever I had on hand, which turned out to be an annoyingly short cable and, thankfully, a null-modem adapter, necessary for communication between the two serial ports. This resulted in a rather slow connection between the laptops; a parallel connection would be significantly faster, but I don’t have the appropriate cable. The INTERLNK tool (bundled with MS-DOS) worked just fine, and automatically detected the connection over the COM port.

T48 chip programmer, finally

I splurged on a proper chip programmer, the T48 by XGecu. Even though there is already a newer model of their programmer (the T56), the few additional chips and features it supports didn’t justify the extra cost, at least for me.

Along with the T48, I purchased a batch of random EEPROM chips for some initial testing and verifying of the T48 itself, and possibly for actual use in future projects. These are Winbond W27C512 chips, which have a rather unusual erase voltage of 14V and programming voltage of 12V, which will be a good exercise for the chip programmer.

image

The software for the T48 is Windows-only; there is an open-source alternative that supports an earlier model of this programmer (the TL866II), but its support for the T48 is still very limited. Because of this, and especially because the software requires a special driver that can only be installed with elevated privileges, I prefer to run it in a virtual machine, for reasons that I hope are obvious. Fortunately this is very easy in VirtualBox, which supports USB passthrough effortlessly. Here it is, running in a Windows 7 (32-bit) VM, and communicating with the T48:

image

Since my batch of Winbond chips was suspiciously cheap, I assumed they were not “new”, but rather pulled from existing boards. And I was not mistaken: reading the chips, which worked absolutely fine, revealed that they already had contents in them:

image

And then, erasing and reprogramming the chips also turned out to be a breeze, which gives the T48 programmer a thumbs-up from me. As a special bonus, I removed the firmware chip from the Conner CP-4021 drive (mentioned above!) and was able to read the firmware, using one of the myriad adapters that were included with the T48 programmer.

image

The above chip is a Microchip 27C256 (32KB), in a PLCC-32 socket. Again, the T48 was able to read this 35-year-old chip without any issues, which makes me look forward to reading and programming many vintage and newer chips in the future.