More huge updates to DiskDigger and FileSystemAnalyzer

I’ve finished some major updates to DiskDigger, as well as its companion tool FileSystemAnalyzer, to support a few more filesystems, some more obscure than others!

ReFS support

One of these filesystems represents a serious and substantial update: DiskDigger now has expanded support for ReFS, the Resilient File System introduced in recent versions of Windows Server and Windows Enterprise editions. ReFS remains totally proprietary and undocumented, so it required quite a bit of reverse-engineering to nail down the structures that it uses. I’m happy to report that DiskDigger now supports versions of ReFS starting from 3.0 (introduced in Windows Server 2016) through the very latest version 3.12 (in the latest insider build of Windows 11 Enterprise).


To be clear, DiskDigger had already been able to recover data from ReFS partitions by performing a heuristic (carving) search, which is independent of the actual filesystem on the disk. But now that it understands the data structures of ReFS, it can apply filesystem-specific techniques to recover files from such partitions more accurately.


And on a lighter, more whimsical note, DiskDigger and FileSystemAnalyzer now support two other filesystem types that you’ll likely never encounter in everyday life:

RedSea filesystem

The RedSea filesystem was created by the late Terry Davis as part of his TempleOS operating system. If you’re not familiar with TempleOS, it’s an interesting rabbit hole to delve into. Literally an entire operating system built by a single person over the course of many years, TempleOS was intended to be “god’s third temple” in the form of an operating system, built according to guiding principles that Davis believed he was receiving from god. These principles are largely based around simplicity and purity, which is something that even the most hardened atheist like myself can appreciate. There is an expansive collection of videos in which Davis provides tutorials and explains the various features and design choices of TempleOS.

Terry was a troubled soul: he was living with uncontrolled schizophrenia which led to his eventual demise, and his videos occasionally contain some bizarre and horribly racist commentary, all of which make him more pitiable than admirable as a person. However, he was an undeniable savant at building an operating system, and I will defend the idea that we can learn something from his kernel, his compiler, and his insistence on simplicity. As a tribute to his work, I’m including support for the RedSea filesystem in DiskDigger and FileSystemAnalyzer.

The RedSea filesystem is, in many ways, the simplest filesystem possible:

  • All files are contiguous! There’s no concept of fragmentation.
  • There are no B-trees, no journaling, no symbolic links, no encryption, etc.
  • There’s no concept of clusters; block sizes are the same as sector sizes, i.e. 512 bytes.
  • Directory blocks are just a sequential list of directory entries.
  • For determining where to write new files, there is simply an allocation bitmap, where each bit represents whether the corresponding block is allocated (see the sketch below).
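
Reading data out of such a volume is correspondingly trivial. Here’s a minimal sketch in C of the two core operations, assuming the allocation bitmap has already been read into memory; the names are just for illustration, and the LSB-first bit order is an assumption on my part:

    #include <stdint.h>
    #include <stdio.h>

    #define REDSEA_BLOCK_SIZE 512  /* block size == sector size */

    /* Is block number 'blk' marked as allocated in the allocation bitmap?
       (One bit per block; LSB-first bit order within each byte is assumed.) */
    static int redsea_block_allocated(const uint8_t *bitmap, uint64_t blk)
    {
        return (bitmap[blk / 8] >> (blk % 8)) & 1;
    }

    /* Because every file is contiguous, reading one is just a single seek and
       a single read, given the starting block and size from its directory
       entry. (A real tool would use a 64-bit seek here.) */
    static size_t redsea_read_file(FILE *disk, uint64_t start_blk,
                                   uint64_t size, uint8_t *out)
    {
        fseek(disk, (long)(start_blk * REDSEA_BLOCK_SIZE), SEEK_SET);
        return fread(out, 1, (size_t)size, disk);
    }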


One other interesting feature of the RedSea filesystem is that it performs a sort of semi-automatic compression of files, using a form of LZW. If you give a file a name that ends with a “.Z” extension, it will be compressed when it’s written to the disk, and transparently decompressed the next time it’s read. This compression is also supported in DiskDigger, i.e. files recovered from a RedSea partition will be automatically decompressed.

Commodore 64 disk images

As another fun diversion, I also added support for Commodore 64 disk images (D64 files)! The file system on these disks is thoroughly documented, and is also very simple: files are represented as a linked list of blocks (a primitive “block chain”, as it were). If you have these disk images lying around, you can now peruse their contents!
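
Following a file’s chain of blocks in a D64 image is simple: each 256-byte sector begins with a two-byte link, giving the track of the next sector (0 meaning the chain ends here) and its sector number (or, in the final sector, the index of the last used byte). Here’s a rough sketch in C of walking the chain, assuming a standard 35-track image; the function names are just for illustration:

    #include <stdint.h>
    #include <stdio.h>

    #define D64_SECTOR_SIZE 256

    /* Sectors per track on a standard 35-track 1541 disk: tracks 1-17 have
       21 sectors, 18-24 have 19, 25-30 have 18, and 31-35 have 17. */
    static int d64_sectors_in_track(int track)
    {
        if (track <= 17) return 21;
        if (track <= 24) return 19;
        if (track <= 30) return 18;
        return 17;
    }

    /* Byte offset of a given track/sector within the D64 image. */
    static long d64_offset(int track, int sector)
    {
        long blocks = 0;
        for (int t = 1; t < track; t++)
            blocks += d64_sectors_in_track(t);
        return (blocks + sector) * D64_SECTOR_SIZE;
    }

    /* Follow a file's chain of sectors, starting at (track, sector), and
       write the file's data bytes to 'out'. Returns the byte count. */
    static size_t d64_read_chain(FILE *img, int track, int sector, FILE *out)
    {
        uint8_t buf[D64_SECTOR_SIZE];
        size_t total = 0;

        while (track != 0) {
            fseek(img, d64_offset(track, sector), SEEK_SET);
            if (fread(buf, 1, sizeof buf, img) != sizeof buf)
                break;
            track  = buf[0];  /* track of next sector; 0 means this was the last */
            sector = buf[1];  /* next sector number, or last-used-byte index at the end */
            size_t used = (track != 0) ? D64_SECTOR_SIZE - 2
                                       : (sector > 1 ? (size_t)(sector - 1) : 0);
            fwrite(buf + 2, 1, used, out);
            total += used;
        }
        return total;
    }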



As with all filesystems supported by FileSystemAnalyzer and DiskDigger, these additions are read-only: the tools are intended for forensic analysis, not for two-way interoperability.

Updates to DiskDigger and FileSystemAnalyzer, October 2023

Usually I post updates about DiskDigger on its own website, but my most recent round of updates merits a slight technical digression.

Previous versions of DiskDigger and FileSystemAnalyzer already had basic support for 4K-native disk drives, i.e. drives that have 4 KiB sectors instead of the usual 512 bytes. However, only recently have I been able to test this support more thoroughly, fixing a few bugs along the way. 4K-native drives have been around for a while, and in fact most modern drives already use 4K sectors natively under the hood, but simply emulate 512-byte sectors to the outside world. Increasingly, though, we’re seeing drives that no longer emulate 512-byte sectors (exposing the native 4K sectors to the operating system), as well as users who opt to reconfigure their drive’s firmware to use 4K sectors instead of 512-byte emulation. DiskDigger and FileSystemAnalyzer can now handle all of these cases when mounting and searching file systems that might be present on such disks (FAT, NTFS, ext4, etc.).

I did most of my testing and experimenting with a real 4Kn drive, but some of it with emulated disk images. Here is how you can configure qemu to treat a disk image as a 4Kn drive:

qemu-system-x86_64.exe -machine q35 -m 8G -boot d -cdrom "linux.iso" -drive file=mydisk.vdi,if=none,format=vdi,id=D24 -device nvme,drive=D24,serial=1234,logical_block_size=4096,physical_block_size=4096

The above example boots qemu from an ISO file (which can be a Linux live DVD) and attaches the disk image as an NVMe device, which lets us configure its logical and physical block sizes, both of which we set to 4096. Linux should detect the NVMe device automatically, and you can then create partitions and file systems on it for experimentation.
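
Once the guest is up, it’s worth confirming that the OS really sees 4 KiB sectors: a 512e drive reports a logical sector size of 512 with a physical size of 4096, whereas a true 4Kn drive reports 4096 for both. You can check this with lsblk or blockdev --getss / --getpbsz, or query it directly; here’s a minimal sketch in C (the device path is just an example, and reading a block device typically requires root):

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>   /* BLKSSZGET, BLKPBSZGET */

    int main(int argc, char **argv)
    {
        const char *dev = (argc > 1) ? argv[1] : "/dev/nvme0n1";  /* example device */
        int fd = open(dev, O_RDONLY);
        if (fd < 0) { perror(dev); return 1; }

        int logical = 0;            /* logical sector size, as exposed to the OS */
        unsigned int physical = 0;  /* physical sector size of the underlying media */
        ioctl(fd, BLKSSZGET, &logical);
        ioctl(fd, BLKPBSZGET, &physical);

        printf("%s: logical %d bytes, physical %u bytes\n", dev, logical, physical);
        close(fd);
        return 0;
    }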


The other interesting update has to do with ancient retro file systems that are supported by FileSystemAnalyzer (and by extension DiskDigger). By coincidence, I was contacted by multiple people within a short span of time about recovering data from Xenix file systems that they’d saved as binary disk images. One image is from an Intel System 320 Multibus system owned by Herb Johnson of retrotechnology.com, and another is from an owner of an Altos 586 system in New Zealand.


Each of these images used a slightly different version of the Xenix file system, each with a different superblock structure (and both differ from the Xenix/SysV support that’s built into the current version of the Linux kernel). This took a bit of effort to reverse-engineer, but ultimately wasn’t too difficult to crack and integrate into FileSystemAnalyzer. The nice thing about dealing with very old data formats is that they’re usually very simple, not to say primitive. Best of all, these Xenix images contain C header files that actually describe their own filesystem structure (can I call them eigenheaders?), which I was able to use for refining and solidifying support for these file systems.


I even learned something new along the way: in addition to little-endian and big-endian byte orders, there’s also something called “middle-endian” or “PDP-11-endian” order, where 16-bit values are stored in the native little-endian order, but 32-bit long integers are composed of two 16-bit words in big-endian order (while each 16-bit half is still little-endian). This was the encoding used by the PDP-11, and apparently also by the Altos 586 system that was running this version of Xenix. All of these variations are now supported in FileSystemAnalyzer.
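
Handling this in code is just a small twist on the usual byte shuffling: read each 16-bit half as little-endian, then treat the first half as the high word. For example:

    #include <stdint.h>

    /* Decode a 32-bit value stored in PDP-11 ("middle-endian") order:
       the high 16-bit word comes first, and each word is little-endian.
       Example: 0x0A0B0C0D is stored on disk as the bytes 0B 0A 0D 0C. */
    static uint32_t read_u32_pdp11(const uint8_t *p)
    {
        uint16_t hi = (uint16_t)(p[0] | (p[1] << 8));  /* first word, little-endian */
        uint16_t lo = (uint16_t)(p[2] | (p[3] << 8));  /* second word, little-endian */
        return ((uint32_t)hi << 16) | lo;
    }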

Brain dump, September 2023

I finally did something I’ve been meaning to do for a long time: get the final version of the ftape driver to work on a Linux distro that I can use in my data recovery workstations. This is for the purpose of using Linux to dump the contents of QIC-80 and similar tapes, using “floppy tape” drives, i.e. tape drives that connect to the floppy disk controller on the motherboard.

Up until this point, I had been using an old version of Ubuntu that has ftape pre-packaged into the kernel. The problem is that this is not the latest version of ftape: development of ftape seemed to continue independently of the version included with the kernel, and the “last” available version (4.04a, from around July 2000) contains many enhancements over the in-kernel version (which seems to be 3.04), notably compatibility with parallel port tape drives such as the Iomega Ditto 2GB.

This meant that I needed to compile the driver from source. Sounds simple enough; the driver is just a couple of loadable kernel modules. However, I would need to compile it against a kernel version that also boots nicely on my workstation. Browsing the source code of the driver, it appears to be intended for kernel version 2.4.x. As an amateur kernel hacker in a previous job, I knew that even patch-level changes (the third version number) in the kernel can break compilation of custom kernel modules. So I tried to find a Linux distro that uses the earliest possible patch version of the 2.4 kernel and still runs well on my workstation.


CentOS 3.5 to the rescue! I found ISO installation media and installed CentOS 3.5 flawlessly onto my recovery workstation. It uses kernel version 2.4.21, which still turned out to be “too new” for compiling ftape successfully. I got a number of compilation errors, but thankfully they were all comprehensible and easy for an amateur to remedy. After just a few hacky modifications, I got the driver to compile into a loadable module!

And would you look at that – it’s able to communicate successfully with all of my floppy tape drives, as well as my parallel port Ditto 2GB drive!


Here’s my repository on GitHub that has the source code for the ftape driver, with my modifications for getting it to build in CentOS 3.5.


In other news, I found and restored an old ThinkPad X131e, which came to me as a Chromebook, i.e. with ChromeOS installed. In order to remove ChromeOS and install a regular Linux distro, I had to overwrite its firmware with custom firmware that allows installing other operating systems. And in order to overwrite the firmware, I had to disassemble the laptop and flip a physical write-protect switch that allows the firmware to be written. Why do they do this?! Anyway, with the latest version of the lightweight Xubuntu installed, this tiny thing works beautifully, and can now have a second life.


Brain dump, February 2023

As a software archaeologist, I often find myself trying out old software that I never used in my own career. I think this can be very instructive, since old software often has good ideas built into it, ideas that might have been forgotten, but from which we can still draw when building today’s software.

Recently I played around with Microsoft QuickC for Windows 3.1, a C development environment (IDE) targeted at individual developers, with a rather modest set of features compared to enterprise-caliber IDEs of the era. Nevertheless, my existing knowledge of Windows programming, coming from Windows 9x development and onward, transferred fairly easily to QuickC, and I was able to develop a sample app fairly quickly:


It’s a Mandelbrot viewer/explorer app, which is one of my favorite “sample” apps to build in a new environment. It runs in any version of Windows 3.x, has no dependencies, and weighs in at 20KB. Here is the source code, if you like!
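
Stripped of all the Windows boilerplate, the heart of any such viewer is just the classic escape-time iteration (this is a generic sketch, not the app’s actual code):

    /* Classic escape-time iteration: return how many steps it takes for
       z -> z^2 + c to escape the radius-2 circle, up to max_iter. */
    static int mandelbrot_iterations(double cr, double ci, int max_iter)
    {
        double zr = 0.0, zi = 0.0;
        int i;
        for (i = 0; i < max_iter && zr * zr + zi * zi <= 4.0; i++) {
            double tmp = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = tmp;
        }
        return i;  /* map this count to a color for the pixel at (cr, ci) */
    }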

What struck me about using QuickC was its simplicity and efficiency. It still has the familiar issues of native Windows programming (many screenfuls of boilerplate code, and having to handle message loops and drawing subroutines manually), but once that was out of the way, it was smooth sailing.

Today I make Android apps for a living, and I can’t help but compare the experience of building an Android app (using Android Studio) with the experience of building old-school Windows apps, specifically in terms of efficiency. The compilation time of my QuickC app was no more than a few seconds (in an emulator simulating a 50 MHz PC). Compare this with building a similar Android app, where kicking off a clean Gradle build is a cue to take a coffee break, even on the most modern hardware. Of course, over the years the Gradle build process has gotten faster, and the Android folks at Google are quick to award themselves a medal for improving build speeds by a few seconds. Still, it’s only very recently that Gradle has gotten fast enough to finish building a Hello World app in under a minute. I won’t even get into the sizes, now measured in gigabytes, that modern IDEs require to make themselves at home on our workstations, whereas the entirety of QuickC fit on three floppy disks.

Is this level of efficiency and streamlining squarely in the distant past of software tools, or can we, in the present day, take steps to get back to that spirit?

ECC RAM should be a human right

I am now a staunch advocate for ECC RAM, after the events of last week. You see, over the last several weeks my main desktop workstation had been misbehaving, with occasional freezing and crashing. After some diagnostics I began to suspect a faulty RAM module, and sure enough, a quick run of memtest86 lit up the screen with a multitude of bit-flip errors at numerous memory locations, indicating that something was seriously wrong with the RAM.

Within a day or two I scrambled to replace the RAM modules with new ones, and once this was done the problems resolved themselves and everything was stable again. However, there was another, more sinister side effect that I discovered shortly afterwards: some of my data was corrupted! That’s right, it was the worst-case scenario for RAM failure: bit-flip errors that get written back to the disk. I discovered that several video files I had been editing had corrupted bits and were no longer usable. Fortunately I still have the original source materials for the videos, which I can use to recreate them. It’s an unfortunate waste of time, but it could have been a lot worse if I’d let the RAM failure go on even longer. There doesn’t seem to be any further corruption in the rest of my personal data, and just to be on the safe side I performed a clean install of Windows, to ensure that no system or program files were corrupted.

The point of the story is that the data corruption was completely preventable, if only my RAM had had ECC built into it. But because it didn’t, these kinds of bit-flip events go completely undetected, and proceed to wreak havoc on the integrity of our data, right under our noses.

Memory manufacturers assure us that desktop RAM is so reliable that it doesn’t need ECC, and that the probability of bit-flip events is so low that it’s not worth the extra “cost” of ECC. Chip manufacturers (notably Intel) produce consumer CPUs that don’t even support ECC memory. Users are expected to upgrade to server-grade components just to get access to it.

Let’s quickly review the reasons why server-class machines are deemed to be “deserving” of ECC memory, while desktop machines are not:

On a personal desktop computer, your data is stored permanently on a disk, whether that’s a spinning hard drive, an SSD, a memory card, and so on. When you want to do something with your data (e.g. write a document, edit a photo, etc.), the data is loaded into RAM, and when you’re finished modifying it, it’s written back to the disk.

On a server machine, however, the situation is different: since disk access is much slower than RAM access, the server must keep as much data as possible in RAM, so that the data is instantly available to clients who request it. This means that the data ends up sitting in RAM for extended periods of time. If the RAM were to experience bit-flip errors that went undetected, the server would serve incorrect data, or worse, would end up writing incorrect data back to the disk. Therefore, the server’s RAM has ECC, so that it will correct itself in case of an occasional bit flip.

This is oversimplifying a bit, but the difference between a server and a desktop, for this exercise, is simply the amount of time that data is made to sit in RAM. So then, are we supposed to accept that if our data doesn’t remain in RAM for very long, it doesn’t need ECC at all?!

By the way, you’d better believe that your disk(s) have all kinds of error correction schemes built into them, which work automatically and transparently. It’s completely normal for data written to a physical medium to be imperfect, and those imperfections will be corrected by the firmware of the disk.

Well, guess what? RAM is also a physical medium, and yet we’re simply asked to take the manufacturers’ word that it’s reliable enough to never need ECC for the use cases of a desktop workstation. I’m here to say that these practices are reckless, and represent a ticking time bomb for anyone who uses non-ECC memory for anything nontrivial. And it seems I’m not the only one.

(discussion on HackerNews)