The lost art of system-level thinking

Our house has steam heat – you know, those old cast iron radiators that are heated by a boiler that sits in the basement. When we first moved in, I was skeptical of how well steam heat actually works, and I began to dread the idea (and the cost!) of having to replace the whole thing with modern forced-air heat. And when we powered on the heat for the first time, my suspicions were confirmed: some of the radiators were making strange noises, and sometimes the pipes themselves would make loud bangs that reverberated through the whole house.

But now, after just a few months, nearly all of these issues are solved, and I am a believer in steam heat and its many benefits. This is all thanks to a book called The Lost Art of Steam Heating by Dan Holohan. The book allowed me to solve our heating issues after roughly 20 pages (the pressuretrol was set way too high, if you’re reading, Dan), but I couldn’t put it down and kept reading until the end, for an entirely different reason: the book does a great job of emphasizing system-level thinking, as opposed to narrow immediate problem solving. Even though the book is intended more for HVAC contractors than individual homeowners, I believe that any serious engineer would find the stories and advice relatable.

With every heating “war story” that Holohan recounts, the solution invariably relies on widening one’s perspective beyond the single component that seems to be broken. By the end, I had forgotten that I was even reading about steam heat, but was simply enthralled by Dan’s ability to re-frame the problem from the highest level down. I wish that more engineers in my own field would do the same.      

Problems with steam heat turn out to be easy to diagnose. It’s simple physics, if at times slightly counterintuitive. The harder lesson is to always think about the big picture. Don’t just replace a faulty component with a new one; think about how the new component affects the rest of the system. Don’t remove a component that you think is unnecessary without understanding why it was there to begin with. And so on.

The lost art, it would seem, is not really steam heat; the lost art is big-picture thinking itself.

“Think about the system.” Well said, Dan.

How to recover data from QIC tapes

Simple: ask me to do it for you!

But if you insist on trying it yourself, here is a rough guide on the steps required to recover data safely and effectively from QIC-150 and QIC-80 cartridges.

QIC-80

First there is the matter of hardware. You’ll need to obtain a QIC-80 tape drive, such as the Colorado 250MB drive, which was very common at the time. These are drives that usually connect to the floppy controller on the PC’s motherboard. There are a few types of drives that connect to the parallel port, but these are not recommended, since they are much less compatible with various software.

Now you have a few choices. You may choose to take a binary image of the tape, which is literally a dump of all the data on it. This can be done in Linux using the ftape driver. Or you can attempt to use the original software that was used to write the tape. This would require you to stage the specific operating system and backup software, boot into it, and use it to restore the data from the tape.

Getting a binary image

This option is more straightforward, and also faster and more reliable, but the disadvantage is that you’ll need to manually decode the data and extract the files from it. Fortunately the data written to QIC-80 tapes mostly adheres to a single specification, and there are ready-made tools to decode this format.

To get a binary dump, you’ll need to boot into Linux. However, because the ftape driver has long been abandoned, it’s only available in very old distributions of Linux. The last version of Ubuntu that included ftape in the kernel was 6.06. Fortunately this version is readily available for download and can be used as a bootable live CD. Once it’s booted, you can load the ftape module by executing:

$ sudo modprobe zftape

This should create the appropriate logical devices that will let you access the tape drive. The device you’ll usually need is /dev/nqft0.

And to start reading data immediately, just execute dd as usual:

$ sudo dd if=/dev/nqft0 of=data.bin conv=sync,noerror &

Don’t forget the ampersand at the end, so that dd runs in the background and gives you back control of the console. The conv=sync,noerror parameter makes dd continue if it encounters errors, padding the output with zeros wherever there’s a bad block. That said, the skipping of errors hasn’t seemed to work very reliably with QIC-80 drives: if the drive goes into a loop of shoe-shining the tape for more than a minute, you should probably give up on that volume of the tape. Speaking of volumes:

The tape may consist of multiple volumes, which basically means that multiple backups were written to it in succession. When your first dd call is complete, it will stop at the end of the first volume on the tape. But there may be additional volumes: you can call dd again right afterwards, which will proceed to read the next volume, and so on. You can also use the vtblc tool to see an actual list of the volumes on the tape.
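
For example, reading a tape that contains two volumes might look like this (the output file names are just for illustration):

$ sudo dd if=/dev/nqft0 of=volume1.bin conv=sync,noerror
$ sudo dd if=/dev/nqft0 of=volume2.bin conv=sync,noerror

The “n” in /dev/nqft0 stands for “non-rewinding”, which is what allows each successive dd call to pick up at the start of the next volume instead of going back to the beginning of the tape.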

You may also want to skip directly to another volume on the tape. This is useful if you encounter errors while reading one volume, and want to jump ahead to a different one. I’ve found that the best bet is to perform a fresh boot, then skip to the desired volume, and start reading. To skip to a volume, use the mt fsf command:

$ sudo mt -f /dev/nqft0 fsf x

…where x is the number of volumes to skip. So for example if you want to read the third volume on the tape, execute fsf 2 and start reading.

Note that the drive might not actually fast-forward as soon as you make the mt fsf call; it will usually do so once you make the dd call that starts reading data.
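
Putting it all together: to read the third volume from a freshly booted system, you would do something like this (volume3.bin being an arbitrary output name):

$ sudo mt -f /dev/nqft0 fsf 2
$ sudo dd if=/dev/nqft0 of=volume3.bin conv=sync,noerror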

Using original backup software

If you want to go the route of using the original backup software that was used to write the tape, you’re now in the Wild West of compatibility, trial and error, and general frustration. Most of the frustration comes from the old software’s incompatibility with modern CPUs (too fast) and modern RAM (too much).

Since the majority of these tapes were written during the DOS era, you’ll need to get a solid DOS environment going, which is surprisingly simple with today’s hardware. If your motherboard supports booting from a USB drive, it will probably be able to boot into DOS. This is because DOS uses the BIOS for disk access, and the motherboard provides access to the USB disk through the BIOS, so that DOS will consider the USB disk to be the C: drive.

There are a lot of different tape backup tools for DOS, but one that I’ve found to be very reliable is HP Backup 7.0. This software has recognized and recovered the vast majority of DOS backups that I’ve seen. If this tool fails to recognize the tape format, try one of these other tools:

Central Point Backup

This is bundled with PC Tools 9.0, and is another DOS-based backup tool, but one that may have written the backup in a slightly different format. However, there are very specific steps for getting this software to work. It does not work on modern (fast) CPUs, because it relies on timing logic that causes an integer overflow. This can manifest as an “overflow” error or a “divide by zero” error.

To run Central Point Backup on a modern processor, you will first need to run the SlowDown utility from Bret Johnson. I’ve found that these parameters work:

C:\> SLOWDOWN /m:25 /Int70

Note that this will cause the keyboard to become sluggish, and you might have some trouble typing, but it’s the only way.

NTBackup

Windows NT came with its own backup utility that could be used to write to floppy tapes. The trouble, however, is getting Windows NT to boot on a modern system. The goal is to get a boot disk that runs Windows NT with Service Pack 6, which does in fact work well with modern hardware. If you want to do this from scratch, you can try the following:

  • Connect a spare SATA hard drive (a real one) to your computer.
  • Boot into Linux and make sure to have qemu installed.
  • Run qemu, booting from the Windows NT install ISO image, with the real hard drive as the emulated machine’s disk (see the example invocation after this list). For the initial installation, give qemu more modest parameters, including less memory (-m 256) and a lesser CPU (-cpu pentium).
  • After Windows NT is installed, power down the emulated machine, and copy the Service Pack 6 update executable onto the disk.
  • Power the emulated machine back up and install SP6.
  • You can then power it down, and you now have a hard drive loaded with Windows NT SP6, ready to be booted on real modern hardware.
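
To sketch out the qemu step above: assuming the spare drive shows up as /dev/sdb (double-check this before running, since qemu will write to the raw device!) and your NT install media is an ISO named winnt.iso, the initial installation might be launched with:

$ sudo qemu-system-i386 -m 256 -cpu pentium \
      -drive file=/dev/sdb,format=raw \
      -cdrom winnt.iso -boot d

For the subsequent boots (copying over and installing SP6), drop the -cdrom and -boot d options so that the emulated machine boots from the hard drive itself.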

Microsoft Backup (Windows 95)

Windows 95 is extremely tricky to get working on modern hardware, to the point where I would not even recommend attempting it. It may be possible to apply the AMD-K6 update patch, which supposedly allows it to run correctly on fast processors, and then apply the PATCHMEM update that allows it to support large amounts of RAM, but I have not had success with either of these. For me, Windows 95 is forever relegated to running in an emulator only. And fortunately I haven’t seen very many floppy tapes that were written using the backup utility from Windows 95.

QIC-150 and other SCSI tape drives

Reading data from QIC-150 tapes, or most other types of tapes from that time period, is slightly different from reading QIC-80 tapes, mostly because the majority of these types of tape drives connect to the SCSI interface of your PC. This means you’ll need a SCSI adapter that plugs into your motherboard. I’ve had a lot of success with Adaptec UltraWide cards, which are PCI cards, meaning that you’ll need a motherboard that still has older-style PCI slots.

And of course you’ll need a QIC-150 tape drive, such as the Archive Viper 2150, or the Tandberg TDC3660. Newer models of tape drives might be backwards-compatible with older types of tapes, but make sure to check the compatibility list for your drive before attempting to use it to read a tape.

Extracting the data from a tape is extremely simple using Linux. The most recent Linux distributions should work fine (as of 2020). If your tape drive is connected correctly to your SCSI adapter (and terminated properly using a terminating resistor), it will be detected automatically by Linux and should appear as a tape device, such as /dev/nst0.
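
Before reading anything, it’s worth confirming that the kernel actually sees the drive; something like the following should do it (the mt utility is typically provided by the mt-st package):

$ dmesg | grep -i "tape"
$ sudo mt -f /dev/nst0 status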

To start reading data from the tape, execute the following:

$ sudo dd if=/dev/nst0 of=foo.bin conv=noerror,sync

See the previous section on QIC-80 tapes for further usage of dd, and for how to read multiple volumes of data from the tape.

In my travels I have also seen tapes that have a nonstandard block size (i.e. greater than 512 bytes). This may manifest as an error given by dd such as “Cannot allocate memory.” In these cases, you can try setting the block size to a generous amount when invoking dd:

$ sudo dd if=/dev/nst0 of=foo.bin conv=noerror,sync bs=64k

A large enough buffer size should fix the allocation error, but if you plan to use it together with the “sync” option, then you must know the exact size of the buffer (i.e. the exact block size used by the tape). Otherwise the blocks will be written to the output file with padding that fills up any unused space in each block buffer. A common block size I’ve seen is 16k, especially on 8mm tapes.
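
For example, if you determine that the tape was written with 16-kilobyte blocks, read it with exactly that block size, so that the sync option doesn’t pad the output with filler:

$ sudo dd if=/dev/nst0 of=foo.bin conv=noerror,sync bs=16k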

Using original backup software

Of course it is also possible to use the original backup software that was used to write the tape. However, it’s much safer to obtain a binary dump of the tape in Linux first, before attempting to read the tape again using other tools. This way you’ll have a pristine image of the tape in case the tape becomes damaged or worn out during subsequent reads.

In many cases there are software tools that will extract the archived file collection directly from a binary image. But if these tools do not recognize the format of your tape image, you will indeed have to use the original software that was used to write it, assuming you can remember what it was. This can be quite difficult: setting up SCSI support in DOS can be a pain; the tape might not have been written using DOS at all, but with something like an Amiga; and so on. Regardless, the major hurdle is getting the data from the tape to the PC. Decoding the contents of the data is usually a minor detail.

…Or, if you don’t feel like it

I offer first-rate tape recovery services, at a fraction of the cost of other companies. Get in touch anytime and let me know how I can help!

Discovering little worlds

Like so many other people during the COVID lockdown, I’ve been looking for additional hobbies that could be done from home, which would occupy my time and help keep my mind off the collapse of civilization as we know it, and maybe even ground my thoughts and keep them away from hyperbole and catastrophizing.

While cleaning and organizing my basement I came across this small USB device. It’s barely larger than a flash drive, but it’s no flash drive at all – it’s a software-defined radio (SDR), namely an RTL-SDR V3 dongle.

I don’t even recall how this device ended up in my possession; I think it was probably from one of my previous jobs: they were throwing away a bunch of equipment that was no longer useful, and allowed me to keep some of the items. Regardless, I had never actually used the SDR, and hadn’t really thought about what it would be useful for. I had a vague understanding that the SDR could let me tune in to any arbitrary radio frequency, but how interesting can that be? Well, it turns out that playing around with this device led me into a rabbit hole of epic proportions.

Once I found the right software to work with the SDR device (SDRSharp for Windows, and CubicSDR for Mac), I was up and running.  The first rather trivial thing to do was to tune in to the local FM radio stations. Here they are, as viewed through a spectrogram:

FM radio

But that’s kind of boring. I wonder what lies outside of the FM radio band? Well, the next obvious destination is the local police frequencies, which are around 460 MHz to 490 MHz in my area. These are narrow-band FM (NFM) stations, so we adjust our software settings accordingly. In mere moments, I’m listening to police dispatchers communicating with units and telling them about robberies, car accidents, and the like:

And of course there are a few local HAM radio repeaters nearby, which tells me that the HAM community is very much alive and well. Since I can’t transmit anything using my tiny SDR device, I can only listen in on the HAM conversations, but that’s okay, since the conversations weren’t particularly scintillating anyway, and I’m not sure that getting into the little world of HAM radio is really my goal here. As much as I salute the enthusiasts who keep HAM radio going, they can party on without me.

Mind you, all of this was using the cheap tiny antenna that came with the SDR itself.  But then I discovered that the SDR can be used for something else entirely: receiving signals from satellites!

Arguably the easiest satellites to pick up signals from are the NOAA 15/18/19 satellites, which are weather satellites that transmit images of cloud cover over the ground. By “easiest” I mean requiring the least amount of additional equipment: it takes only a rabbit-ear (V-dipole) antenna connected to your SDR, and a cloud-free day to get a good signal. Here is the signal at 137.9 MHz, and the resulting image, which is produced by special software that demodulates the “audio” data that was recorded:

The downside is that these satellites are in a sun-synchronous orbit, and will only pass by your location for ~15-minute intervals at the most, and can only be caught at very early or late hours of the day. The other downside is that the NOAA satellites are aging, and will probably be decommissioned in the coming years. And anyway, the images they transmit are not the highest quality. Time to step it up to the next level, namely the GOES satellites!

The GOES-16 satellite is a newer weather satellite that is geostationary, and is positioned permanently above the Americas. In fact its longitude is almost exactly over the East coast, which is perfect for my purposes, and its elevation from my location is about 45 degrees (because it orbits above the equator, as all geostationary satellites must do).

But because it’s geostationary it’s also much farther away, and therefore its signal is much weaker, and requires additional equipment:

The setup consists of an old WiFi grid antenna, which feeds into a SAW filter and amplifier, which then feeds into the SDR that’s now connected to a Raspberry Pi (the total cost was about $100).  The Raspberry Pi is running a package called goestools which demodulates the signal from the satellite in real time, and translates the signal into images. The satellite transmits images of Earth in many different spectral bands, ranging from visible light to deep infrared.
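
For the curious, goestools splits the work between two programs: goesrecv, which demodulates the raw signal coming from the SDR, and goesproc, which assembles the decoded packets into image files. The invocations look roughly like this (the config file names follow the samples in the goestools documentation; your paths may differ):

$ goesrecv -v -i 1 -c goesrecv.conf
$ goesproc -c goesproc-goesr.conf -m packet --subscribe tcp://127.0.0.1:5004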

And so, the final little world I discovered on this adventure is this one:

The full resolution of these images is 10K, which is mind-blowing, and my next step is to create animations from these images, which are sent by the satellite every 15 minutes.

I think I’ll leave this antenna setup as a permanent installation in my house, so that I can grab these signals anytime I like. Even though this imagery is available on the web if you know where to look, there is something profoundly awesome in knowing that you personally can receive selfies of our world, from 35,000 kilometers away, using about $100 worth of equipment. It’s been a very satisfying few weeks, in spite of everything else that’s happening this year.

Thoughts on the new Star Trek

I watched Star Trek: Picard recently, and… I am sad.

Gene Roddenberry had a vision: a future which is truly post-racial, post-war, post-poverty, etc. It’s a world for us to strive towards, to admire, to want to live in.  But the world of Star Trek: Picard seems to have all the same problems we have in the 21st century.  The Federation is a systemically racist organization that refuses to help an enemy in a time of desperate need. There is deep wealth inequality between different classes of people on Earth. People treat sentient androids like property. And all of “space” is a hostile battleground where one dares not venture without being armed to the teeth.

That’s not a world I look forward to living in, and it would be depressing to me if this is how humanity “turns out” in three hundred years.

Aside from pontificating about today’s political issues, the show’s plot is completely incoherent, and the writing is so lazy and unfocused. Remember Picard’s caretakers at his château who were former Tal Shiar? Those were some interesting characters, but will we ever see them again? Was there any point in having the Borg involved in the story at all? Does the Romulan Samurai kid (bet you can’t think of his name!) have any purpose other than to chop people’s heads off? Will Agnes’s murder of Bruce Maddox be swept under the rug? Will the fact that Picard is now a synthetic golem ever be mentioned again?

One perfect example of lazy writing is embodied in the hand-held purple repair device that the Androids conveniently give to Captain Rios. This device basically grants wishes: you can wish it to repair your broken warp core! Or you can wish it to create a mirage of a hundred starships, complete with warp signatures that can fool Romulan sensors! What luck!

And think about how Old Trek and New Trek are different in terms of fandom. There are fans who transform their basement to look like the bridge of the Enterprise. There are fans who program their desktop computer to look like an LCARS interface. And of course there are countless fans who attend conventions dressed up like characters from the Original Series and Next Generation. But will there be any fans who’ll want to recreate the bridge of Captain Rios’s ship (bet you don’t know what it’s called)? Will there be any fans who will admire or want to emulate any of these new characters?

Is it possible anymore to have a show where the whole universe isn’t about to blow up all the time? Can we just have a show where the Enterprise goes to a planet, and Picard negotiates a peace accord, while Data and Geordi get into a wacky holodeck adventure?  I ask for so precious little!

The problem of recovering data from SSD drives

One frequent question I receive from users of DiskDigger is: Why does it seem to be unable to recover data from my internal SSD drive? And since SSDs have become nearly ubiquitous in laptops and desktops, this question is becoming more and more common.

The short answer: It is generally not possible to recover deleted data from internal SSD drives, because they are very likely using the TRIM function.

How do I know if TRIM is enabled?

It probably is. If you have an SSD drive that is internal to your computer (an NVMe drive, SATA drive, etc.), and you’re using a modern operating system (Windows 7 and newer, macOS, etc.), then it’s likely that TRIM is enabled by default, because it’s highly beneficial to the performance of your SSD drive.

Why?

SSD (flash memory) drives work fundamentally differently from older magnetic (spinning disk) hard drives.

With both types of drives, when data is deleted, the physical blocks that were occupied by the data are marked as “available”, and become ready to be overwritten by new data.

With a magnetic spinning hard drive, an available block can be overwritten regardless of what data was in that block previously; the old data gets overwritten directly. However, the same is not true for flash memory: a flash memory block must be erased explicitly before new data is written to it. And this erase operation is relatively expensive (i.e. slow). If an SSD drive were to erase memory blocks “on demand”, i.e. only when a new file is being written, it would slow down the write performance of the drive significantly.

Therefore, an SSD drive will erase unused memory blocks preemptively, so that the memory will be pre-erased when a new file needs to be written to it. Since the drive has no knowledge of what filesystem exists on it, the drive relies on the operating system to inform it about which memory blocks are no longer used. This is done using the TRIM command: When the operating system deletes a file, in addition to updating the necessary filesystem structures, it also sends a TRIM command to the drive, indicating that the memory blocks occupied by the deleted file can now be considered “stale”, and queued up for erasing.

The SSD drive erases TRIMmed blocks in the background while the drive is idle, transparently to other operations. In effect this means that for any file that’s deleted from an SSD drive, once the drive purges those stale blocks, the actual contents of the file will be wiped permanently from the drive, and will no longer be recoverable.

The above is a slight simplification, since SSD drives also perform wear-leveling, which uses rather complex logic involving copying and remapping logical addresses to different physical memory pages, but the general point stands.

Exceptions

There are a few cases when deleted data may be recoverable from an SSD drive:

  • If TRIM happens to be disabled for some reason. As mentioned above, the TRIM feature is something that is enabled at the level of the operating system. It is usually enabled by default for performance reasons. Nevertheless, most operating systems will let you check whether or not TRIM is enabled, and optionally disable it. For example, in Windows you can run the command fsutil behavior query disabledeletenotify to see if TRIM is currently enabled (see the examples after this list).
  • If you’re using an external SSD drive connected over USB. Support for issuing the TRIM command over a USB connection is relatively new, and is not yet supported by all USB controllers and operating systems. If you deleted files from an external SSD drive that’s connected to a USB port, there’s a fair chance that the data might be recoverable.
  • If you attempt to recover the files immediately after they’re deleted, and the drive provides the contents of stale blocks (which is rare). As mentioned above, the TRIM command puts the deleted memory blocks in a queue of stale blocks, so it’s possible that the SSD drive won’t actually erase them for a short while. The timing of when exactly the TRIMmed blocks are erased is entirely up to the drive itself, and differs by manufacturer. If you search the drive for deleted data sufficiently soon after it’s deleted, and the drive doesn’t return null data for stale blocks, it may still be possible to recover it.
  • Due to the way that SSD drives perform wear-leveling, it may be possible for stale blocks to get reallocated and copied to different physical positions in the drive, leaving behind the original data in their old locations. Unfortunately this kind of data is generally not accessible using any software tools, including DiskDigger, and can be accessed only by disassembling the drive and reading the physical flash memory chip directly, which is a very expensive procedure done by enterprise-level data recovery labs.
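
To expand on the first bullet: here is how you might check the TRIM status on a couple of different systems (the Windows command is the one mentioned above; the Linux commands are my own suggestions, assuming a modern distribution):

C:\> fsutil behavior query disabledeletenotify

$ lsblk --discard
$ sudo systemctl status fstrim.timer

In the Windows case, a result of 0 means that delete notifications (i.e. TRIM) are enabled. In the lsblk output, nonzero values in the DISC-GRAN and DISC-MAX columns mean that the device accepts TRIM requests, and fstrim.timer is the periodic job that many Linux distributions use to TRIM mounted filesystems in the background.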

Summary

Despite the above challenges, there’s no harm in trying to use DiskDigger to recover files from your SSD drive, and in certain cases it will be successful. However, if you’ve deleted files from an internal SSD drive, the overall prognosis for recovering them is unfortunately not good.