Brain dump, July 2022

(Reminder: I perform a service of recovering data from old media such as magnetic tapes, floppies, weird flash memory formats, etc. Get in touch with anything you’d like to recover.)

I recently worked on a very interesting data recovery case:
I received a number of QIC-150 tapes, and one QIC-525 tape, from a client in Tasmania (!) who made backups onto these tapes from an IBM AS/400 system in 1991. This presented a couple of challenges:

  • The tapes were written with variable-length blocks, so the tape driver had to be switched to variable block mode by running mt -f /dev/nst0 setblk 0.
  • Then, when dumping the tape contents with dd, the block size had to be at least as large as the largest block on the tape, which turned out to be 32760 bytes, so the command was dd … bs=32k (32768 bytes).
  • After the data was dumped successfully, I determined that the data was saved using the SAVLIB command on the AS/400, which serializes the data onto the tape in a proprietary format. Unfortunately I couldn’t find a lot of public documentation on this format, so I had to do a bit of reverse-engineering. My client specifically wanted to locate some COBOL source files in these archives, which turned out to be easy to pinpoint and extract. Most of the COBOL code was uncompressed, but some of the files were compressed using SCB (string control byte) encoding, which is basically a form of simple run-length compression.
  • Finally, since the data came from an AS/400 system, the text contents of the files were EBCDIC-encoded, which required a conversion from EBCDIC to ASCII.
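
As a rough sketch of that last conversion step: Python ships with EBCDIC codecs, so decoding is a one-liner once the right code page is known. The code below assumes code page 037 (cp037, the US/Canada/Australia variant), which is only a guess for any particular system, and it assumes the extracted text is stored as fixed-length records with no newline characters. The function names and the record length are made up for illustration and are not details recovered from the actual tapes.

```python
def ebcdic_to_text(data: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes into a Python string.

    cp037 is an assumption; other systems may use cp500, cp1140, etc.
    """
    return data.decode(codec)


def split_records(text: str, record_length: int) -> list[str]:
    """Cut text into fixed-length records and strip trailing padding.

    record_length is hypothetical here; the real value depends on how
    the source file was defined on the original system.
    """
    return [
        text[i:i + record_length].rstrip()
        for i in range(0, len(text), record_length)
    ]


if __name__ == "__main__":
    # "HELLO WORLD" encoded in EBCDIC (code page 037)
    sample = bytes.fromhex("c8c5d3d3d640e6d6d9d3c4")
    print(ebcdic_to_text(sample))
```

If the decoded output shows odd characters where brackets or currency symbols should be, that is usually a sign that a different EBCDIC code page is needed.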

One of my first professional programming contracts was done in 1997, while I was still in high school. I built an MS-DOS application that communicated over a serial port with an industrial iron powder machine (i.e. a large apparatus that melts scrap iron and turns it into powder, to be recycled for other uses). This program would display real-time telemetry from the machine, and then output it to a daily log file. It was a very simple and rudimentary program, but it worked well enough that the company continued to use it for years. It was also enough to land me a full-time job at the same company, but that’s a subject for a longer post.

Although I still have the original executable file that I provided, the unfortunate thing is that I’ve lost the source code for this program. The only thing about it that I remember is that it was written in C, and compiled with Borland C++ v3. Therefore I’ve had a mini-quest in the back of my mind to either find the lost source code, or decompile the executable file and reconstruct the source code.

Recently I’ve come closer than ever to reconstructing the code, all thanks to the excellent Reko decompiler. I opened the executable in Reko; it detected the Borland C++ v3 runtime effortlessly; it showed a list of function calls (which I recognized! finally!), and I was up and running, navigating the disassembled functions:

The Reko decompiler working with my MS-DOS app.

The hurdle now will be to rewrite the source code based on the disassembly, so that it compiles as closely as possible to the original executable. There’s still a good amount of work left to do, but it’s now much more manageable because of Reko.

In case anyone’s interested, here is the program actually running in DOSBox, albeit stuck in an “error” state because it’s not receiving any data from the aforementioned iron powder machine:

My DOS app from 1997 running in DOSBox.