
Reverse engineering a 25-year-old Visual Basic app

Following up from last week’s misadventures with the Avant Stellar keyboard (trying and failing to extract macro information from the keyboard’s internal memory), there was another glimmer of hope:  my friend found a backup file that possibly contains all the macros that were saved to the keyboard.  If I could just reverse-engineer this backup, we could extract the macros directly from the file.  It is a 2 KB file with a .KBD extension, unrecognizable as any binary format I’ve seen to date. Here is a partial hex dump of the file:

It’s pretty clear that the file contains a key mapping, as evidenced by the list of incrementing 32-bit numbers at the beginning, up to offset 0x210.  There are roughly 120 increasing numbers, which is roughly the number of keys on the keyboard, so we can safely assume that this is the key mapping.  After the key mapping, I presume, comes the macro information, and this is where things get tricky, since there’s virtually no way to tell how the macros are encoded in the file. The data simply looks too general to make sense of.
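As a rough sketch of that first observation, the leading table can be pulled out with Python's struct module. The little-endian byte order, the table starting at offset 0, and the function names are my assumptions here; only the 0x210 boundary comes from the hex dump (at most 132 four-byte values fit before it).

```python
import struct

def read_key_table(data, end_offset=0x210):
    """Decode the leading run of 32-bit integers from a .KBD backup.

    Assumes little-endian unsigned values starting at offset 0.
    """
    end = min(end_offset, len(data))
    end -= end % 4  # drop any trailing partial word
    count = end // 4
    return list(struct.unpack_from(f"<{count}I", data, 0))

def is_increasing(values):
    """True if every value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))
```

Running is_increasing(read_key_table(data)) on the raw bytes of the backup is a quick way to check the "incrementing numbers" hunch before digging into whatever follows the table.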

An obvious possibility would be to “load” the backup file into the Avant software tool that came with the keyboard, and visually inspect the macro(s) assigned to each key.  But no matter what I tried, the software would not load the file.  Or rather, it loaded the key mapping, but not the macros.  Time to think about the nuclear option: disassemble the Avant software and see how it’s actually processing the backup file.

Looking at the folder contents of the Avant software tool, I immediately notice a dead giveaway: VBRUN300.DLL, which means this tool was written in Visual Basic 3.0.  This makes our job much easier, because there are actually ready-made tools for decompiling Visual Basic executables. (If you recall, Visual Basic compiles executables into p-code instead of native machine code, which makes them much more straightforward to decompile.)  All of this took me quite a while to remember, because I hadn’t used these tools since my early, early hacking days, and it took a little while longer to find them in my archives!  The go-to utility for performing this task was literally called VB3 Decompiler, and the way to find this tool on the web today is… outside the scope of this post.

The decompilation basically results in several Visual Basic source files, in which the original function names are intact, but the local and global variables are changed to generic identifiers, since those names are not stored in the compiled code. It takes a little bit of further massaging to get these files to actually build within Visual Basic, but after that, it’s almost as if you have the original source code of the program at your fingertips.

There was one other minor hurdle because the Avant software uses custom UI components (.VBX files) that don’t allow themselves to be used in Design mode (as part of a copy-protection or licensing mechanism), but this is bypassable using another utility in the decompiler suite that “fools” Visual Basic into loading the components anyway.

With the source code buildable and debuggable, we can now easily run the program and load the .KBD backup file, and trace through where it processes the data in the file:

Even though the variable names aren’t very descriptive in the above screenshot, it’s easy enough to spot the loop that deserializes the keyboard macros, and how each macro is composed.  Not only that, but we can determine what was preventing it from displaying the macros in the first place — it turned out that it expects the keyboard to be physically connected while running, and while I’m pretty sure that we tried loading the backup with the keyboard attached, it wasn’t working anyway, probably because the keyboard is malfunctioning and no longer able to communicate properly.  But at last, with this requirement bypassed, the macros that were loaded from the backup file finally reveal themselves:

Confirmation dialogs

Recently a friend of mine contacted me with an interesting issue.  He got ahold of a keyboard from an old PC workstation used with some legacy accounting software. But this was no regular keyboard — it was an Avant Stellar keyboard, in which all of the keys were remappable, and any key could be programmed with custom macros.

The original owner of the keyboard was no longer at the accounting firm, but my friend was very interested in determining what macros were assigned to each key, so that the accounting firm could use the old software more effectively, and hopefully transition away from it more easily.

I helped by managing to dig up the original software that shipped with these keyboards, which worked with MS-DOS and older versions of Windows. Here is what the software looks like:

Clearly this software is where the user gets to create their own macros and remap all of the key bindings. No less clearly, the software allows us to “Upload” and “Download” the mappings.  So, naturally, my friend thought the most sensible action would be to “Download” the current state of the keyboard and view all the macros in the UI of this software tool.  And so, he clicked the Download button, and… nothing seemed to happen. After a brief progress message, the interface stayed the same.

Now here’s the question: What do “upload” and “download” mean?  In 2020, download generally means “fetch something from an external source and save it onto the computer,” and upload means “send something from the computer to an external source.”  And you might think, in the context of this keyboard, download means “retrieve the current state of the keyboard onto the computer”…

But sadly, twenty years ago, the programmers of this software had the opposite definition of “download” in their minds.  Downloading meant loading the current mappings from the software onto the keyboard!

And even more sadly, the programmers didn’t include a prominent confirmation dialog that says, “CAUTION: this will load the new mapping onto the keyboard and overwrite any previous settings!”  And with a single click, the keyboard was overwritten without any warning or backup.  The only thing the programmers did was include a tooltip that appears when hovering over the Download button:

…but the tooltip appeared only after it was already too late.

Recovering data from QIC-80 tapes: another case study

In another of my recent data recovery cases, the patient was a QIC-80 (DC 2000 mini cartridge) tape that was in pretty bad shape. From the outside I could already see that it would need physical repairs, so I opened it up and found a harrowing sight:

The problem with QIC cartridges in general is that they use a tension band to drive the tape spools. If you look at a QIC cartridge, it’s completely enclosed, except for a plastic wheel that sticks out and makes contact with the capstan mechanism when it’s inserted into the tape drive.  The flexible tension band is tightened in such a way that it hugs the tape spools and drives them using physical friction.

This tension band is a major point of weakness for these types of tapes, because the lifespan of the band is very much finite.  When the tape sits unused for many years, the band can stiffen or lose its friction against the tape spools, which can result in one of two scenarios:

  • The tension band can break, which makes the cartridge unusable until it is opened and the band replaced. This is actually not the worst possible outcome, because replacing the band, if done properly, isn’t too disruptive to the tape medium itself, and usually doesn’t result in any data loss.
  • A much worse scenario is when the tension band weakens over time and no longer grips the spools properly. The next time you attempt to read the cartridge, the spools will spin inconsistently (or one of them will stall entirely), causing the tape to bunch up between the spools, or bend and crease, creating a sort of “tape salad” inside the cartridge, all of which can be catastrophic for the data on the tape. In this case, the cartridge needs to be disassembled and the tape manually rewound onto the spools, taking extreme care to undo any folds or creases (and of course the band must be replaced with a new one, perhaps from a donor tape). This will almost certainly result in some loss of data, though how much depends greatly on the degree to which the tape was unwound and deformed.

Note that the tape drive that is reading the tape is relatively “dumb” with respect to the physical state of the tape. It has no way of knowing if the tension band is broken, or if the tape isn’t wound or tensioned properly, or if what it’s doing is damaging the tape even further. Great care must be taken to examine the integrity of the tape before attempting to read it.

With this cartridge, it’s clear that the tension band has failed (but didn’t break). The tape has obviously bunched up very badly on both spools.  Less obviously, the white plastic wheels at the bottom show evidence that the tension band has degraded, with bits of residue from the black band being stuck on the wheels. The fix for this cartridge was to remove the bad tension band, clean the white plastic wheels, respool and tighten the tape, and install a new band from a donor tape. After the procedure was complete, more than 99% of the data was recovered. The tape header was readable, as were the volume tables. Only a few KB of the file contents were lost.

Therefore, when recovering data from very old QIC tapes, it’s probably a good idea to replace the tension band preemptively with a known-good band, to minimize the chance of breakage and damage to the tape. This is why I keep a small stockpile of new(er) tapes from which I can harvest the tension band when needed. At the very least, it’s a good idea to open up the cartridge and examine the band before making any attempts to read data from it.

Where did we go wrong?

I started programming seriously in the late 1990s, when the concept of “visual” IDEs was really starting to take shape. In one of my first jobs I was fortunate enough to work with Borland Delphi, as well as Borland C++Builder, creating desktop applications for Windows 95.  At that time I did not yet appreciate how ahead of their time these tools really were, but boy oh boy, it’s a striking contrast with the IDEs that we use today.

Take a look: I double-click the icon to launch Delphi, and it launches in a fraction of a second:

But it also does something else: it automatically starts a new project, and takes me directly to the workflow of designing my window (or “Form” in Borland terms), and writing my code that will handle events that come from the components in the window. At any time I can click the “Run” button, which will compile and run my program (again, in a fraction of a second).

Think about this for a bit. The entire workflow, from zero to building a working Windows application, is literally less than a second, and literally two clicks away.  In today’s world, in the year 2020, this is unheard of.  Show me a development environment today that can boast this level of friendliness and efficiency!

The world of software seems to be regressing: our hardware has been getting faster and faster, and our storage capacity larger and larger, and yet our software has been getting… slower. Think about it another way: if we suppose that our hardware has gotten faster by two orders of magnitude over the last 20 years, and we observe that our software is noticeably slower than it was 20 years ago, then our software has effectively gotten slower by more than two orders of magnitude! Is this… acceptable? What on earth is going on?

Laziness

Engineers like to reuse and build upon existing solutions, and I totally understand the impulse to take an existing tool and repurpose it in a clever way, making it do something for which it wasn’t originally intended. But what we often fail to take into account is the cost of repurposing existing tools, and all the baggage, in terms of performance and size, that they bring along and force us to inherit.

Case in point: suppose that the only language you know is JavaScript, and suppose that you wanted to start building desktop applications, but didn’t want to learn the languages and tools normally associated with desktop development, e.g. C++, C#, etc. What can you do? Well, one option would be to build a compiler from scratch, which would actually compile JavaScript into native machine code. But that would be hard. How about a simpler solution: take a full-blown web browser, and literally bundle it as the engine that will run your desktop app, with the logic of your app being in JavaScript, and the “window” of your app becoming a web page that is run by the bundled browser! This is, of course, the idea behind Electron, an alarmingly popular framework for building desktop apps today.

But what about the cost of using Electron? What is the cost of bundling all of Chromium just to make your crappy desktop app appear on the screen? Just to take an example, let’s look at an app called Etcher, which is a tool for writing disk images onto a USB drive. (Etcher is actually recommended by the Raspberry Pi documentation for copying the operating system onto an SD card.)

We know how large these types of tools are “supposed” to be (i.e. tools that write disk images to USB drives), because there are other tools that do the same thing, namely Rufus and Universal USB Installer, both of which are less than 2 MB in size, and ship as a single executable with no dependencies. And how large is Etcher by comparison? Well, the downloadable installer is 130 MB, and the final install folder weighs in at… 250 MB. There’s your two-orders-of-magnitude regression! Looking inside the install folder of Etcher is just gut-wrenching:

Why is there a DLL for both OpenGL and DirectX in there? Apparently we need a GPU to render a simple window for our app. The “balenaEtcher” executable is nearly 100 MB itself. But do you see that “resources” folder? That’s another 110 MB! And do you see the “locales” folder? You might think that those are different language translations of the text used in the app. Nope — it’s different language translations of Chromium. None of it is used by the app itself. And it’s another 5 MB. And of course when Etcher is running it uses 250+ MB of RAM, and a nonzero amount of CPU time while idle. What is it doing?!
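To put numbers on the comparison between Etcher and the sub-2 MB tools, here is a quick back-of-the-envelope check. The sizes are the approximate figures quoted above; the variable and function names are just for illustration.

```python
import math

def orders_of_magnitude(larger, smaller):
    """How many powers of ten separate two sizes."""
    return math.log10(larger / smaller)

# Sizes in MB, taken from the figures quoted above.
etcher_installed = 250
small_tool_upper_bound = 2  # Rufus and Universal USB Installer: "less than 2 MB"

ratio = etcher_installed / small_tool_upper_bound
magnitude = orders_of_magnitude(etcher_installed, small_tool_upper_bound)
print(f"{ratio:.0f}x larger, {magnitude:.1f} orders of magnitude")

# Executable + "resources" folder + "locales" folder, all approximate.
accounted = 100 + 110 + 5
print(f"roughly {accounted} MB of {etcher_installed} MB is in those three items")
```

Since the smaller tools are under 2 MB, the real ratio is somewhat higher than 125x.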

As engineers, this is the kind of thing that should make our skin crawl. So why are we letting this happen? Why are we letting software get bloated beyond all limits, and rationalizing it by assuming that our hardware will make up for the deficiencies of our software?

The web

The bloat that has been permeating the modern web is another story entirely. At the time of this writing, the New York Times website loads nearly 10 MB of data on a fresh load, spread over 110 requests. This is quite typical of today’s news websites, to the point where we don’t really bat an eye at these numbers, when in fact we should be appalled. If you look at the “source” of these web pages, it’s tiny bits of actual content buried in a sea of <script> tags that are doing… something? Fuck if I know.

The bloat seen on the web, by the way, is being driven by more nefarious forces than sheer laziness. In addition to building a website using your favorite unnecessary “framework” that you can choose willy-nilly (which varies with every web developer you ask, and then has to be hosted on a separate CDN because your web server can’t handle the load), you also have to integrate analytics packages into your website, as requested by your marketing department, and another analytics package requested by your user research team, and another analytics package requested by your design team, etc. If one of the analytics tools goes out of fashion, leave the old code in! Who knows, we might need to switch back to it someday. It doesn’t seem to be impacting load speeds… much… on my latest MacBook Pro. The users won’t even notice.

And of course, ads. Ads everywhere. Ads that are basically free to load whatever arbitrary code they like, and are totally out of the control of the developer. Oh, you say the users are starting to use ad blockers? Let’s add more code that detects ad blockers and forces users to disable them!

The web, in other words, has become a dumpster fire. It’s a dumpster fire of epic proportions, and it’s not getting better.

What to do?

What we need is for more engineers to start looking at the bigger picture, start thinking about the long term, and not be blinded by the novelty of the latest contraption without understanding its costs. Hear me out for a second:

  • Not everything needs to be a “framework” or “library.” Not everything needs to be abstracted for all possible use cases you can dream of. If you need code to do something specific, sometimes it’s OK to borrow and paste just the code you need from another source, or god forbid, write the code yourself, rather than depending on a new framework. Yes, you can technically use a car compactor to crack a walnut, but a traditional nutcracker will do just fine.
  • Something that is clever isn’t necessarily scalable or sustainable. I already gave the example of Electron above, but another good example is node.js, whose package management system is a minor dumpster fire of its own, and whose dependency cache is the butt of actual jokes.
  • Sometimes software needs to be built from scratch, instead of built on top of libraries and frameworks that are already bloated and rotting. Building something from the ground up shouldn’t be intimidating to you, because you’re an engineer, capable of great deeds.
  • Of course, something that is new and shiny isn’t necessarily better, either. In fact, “new” things are often created by fresh and eager engineers who might not have the experience of developing a product that stands the test of time. Treat such things with a healthy bit of skepticism, and hold them to the same high standards as we hold mature products.
  • Start calling out software that is bad, and don’t use it until it’s better. As an engineer you can tell when your fellow engineers can do a better job, so why not encourage them to do better?
  • Learn to say No! When the newest JavaScript framework starts making its rounds, or when the latest “cross-platform” app development framework is unveiled, or when everyone starts talking about microservices, it’s OK to say “No!” “No, thank you!” “Not until we understand how this will be beneficial to us five years from now.” “Not until we understand the costs, in terms of space, performance, and sanity, of adopting this new thing.”

I suppose that with this rant I’m adding my voice to a growing number of voices that have similarly identified the problem and laid it out in even greater detail and eloquence than I have. I wish that more developers would write rants like this. I wish that this was required training at universities. I have a sinking feeling, however, that these rants are falling on deaf ears, which is why I’ll add one more suggestion that we, as engineers, can do to raise awareness of the issue:

Educate regular users about how great software can be. Tell your parents, your friends, your classmates, that a web page shouldn’t actually need ten seconds to load fully. Or that they shouldn’t need to purchase a new generation of laptop every two years, just to keep up with how huge and slow the software is becoming. Or that the software they install could be one tenth of its size, freeing up that much more space for their photos or documents, for example. That way, regular users can be as fed up as we are about the current state of software, and finally start demanding that we do better.

Home security with Raspberry Pi

The versatility of the Raspberry Pi seems to know no bounds. For a while I’ve been wanting to set up a DIY home security system in my house, and it turns out that the Raspberry Pi is the perfect choice for this task, and more. (The desire for a security system isn’t because we live in a particularly unsafe neighborhood or anything like that, but just because it’s an interesting technical challenge that provides a little extra peace of mind in the end.)

Camera integration

I began with a couple of IP cameras, namely the Anpviz Bullet 5MP cameras, which I mounted on the outside of the house, next to the front door and side door.  The cameras use PoE (power over Ethernet), so I only needed to route an Ethernet cable from the cameras to my PoE-capable switch sitting in a closet in the basement.

At first I assumed that I would need to configure my Raspberry Pi (3) to subscribe to the video streams from the two cameras, do the motion detection on each one, re-encode the video onto disk, and then upload the video to cloud storage.  And in fact this is how the first iteration of my setup worked, using the free MotionEye software.  However, the whole thing was very sluggish, since the RPi doesn’t quite have the horsepower to be doing decoding, encoding, and motion detection of multiple streams at once (and I didn’t want to compromise by decreasing the video quality coming from the cameras), so my final output video was less than 1 frame per second, with my RPi running at full load and getting quite warm. Definitely not a sustainable solution.

But then I realized that a much simpler solution is possible. The Anpviz cameras are actually pretty versatile themselves, and can perform their own motion detection. Furthermore, they can write the video stream directly onto a shared NFS folder!  Therefore, all I need to do is set up the RPi to be an NFS server, and direct the cameras to write to the NFS share whenever motion is detected.

And that’s exactly what I did, with a little twist:  I attached two 16 GB USB flash drives to the RPi, with each USB drive becoming an NFS share for each respective camera. That way I’ll get the maximum throughput of data from the cameras directly to USB storage. With this completed setup, the Raspberry Pi barely reaches 1% CPU load, and stays completely cool.

I wrote a Python script that runs continuously in the background and checks for any new video files being written onto the USB drives. If it detects a new file, it automatically uploads it to my Google Drive account, using the Google Drive API which turned out to be fairly easy to work with, once I got the hang of it. The script automatically creates subfolders in Google Drive corresponding to the current day of the week, and which camera the video is from. It also automatically purges videos that are more than a week old.
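The core of such a script might look roughly like the sketch below. This is not the actual script: the .mp4 extension, the settle time, the polling interval and all function names are assumptions, and the upload step is left as a callable supplied by the caller rather than guessing at Google Drive API calls.

```python
import time
from datetime import datetime
from pathlib import Path

WEEK_SECONDS = 7 * 24 * 60 * 60

def find_new_files(folder, seen, now=None, settle_seconds=30):
    """Return video files not yet handled that have stopped changing.

    A file counts as finished once its modification time is at least
    settle_seconds old, so a clip still being written is skipped.
    """
    now = time.time() if now is None else now
    ready = []
    for path in sorted(Path(folder).rglob("*.mp4")):  # extension is an assumption
        if path in seen:
            continue
        if now - path.stat().st_mtime >= settle_seconds:
            ready.append(path)
    return ready

def remote_folder_for(camera, when):
    """Build a subfolder name from the weekday and the camera name."""
    return f"{when.strftime('%A')}/{camera}"

def is_expired(timestamp, now=None):
    """True if a video with this timestamp is more than a week old."""
    now = time.time() if now is None else now
    return now - timestamp > WEEK_SECONDS

def watch(folders, upload, poll_seconds=10):
    """Poll each camera folder forever; `upload` is supplied by the caller."""
    seen = set()
    while True:
        for camera, folder in folders.items():
            for path in find_new_files(folder, seen):
                upload(path, remote_folder_for(camera, datetime.now()))
                seen.add(path)
        time.sleep(poll_seconds)
```

Polling keeps the sketch dependency-free. Waiting for a file to "settle" before uploading is one simple way to avoid grabbing a clip that the camera is still writing over NFS.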

I have to heap some more praise onto the cameras for supporting H.265 encoding, which compresses the video files very nicely. All in all, with the amount of motion that is typical on a given day, I’m averaging about 1 GB per day of video being recorded (at 1080p resolution!), which makes 7 GB in a rolling week’s worth of video, which is small enough to fit comfortably in my free Google Drive account, without needing to upgrade to a paid tier of storage.

Water sensor

Since my Raspberry Pi had nearly all of its processing power left over, I decided to give it some more responsibility.

About a month ago the sewer drain in the house became clogged, which caused it to back up and spill out into the basement.  Fortunately I was in the basement while this was happening and caught it before it could do much more damage. An emergency plumber was called, and the drain was snaked successfully (turned out to be old tree roots).  However, from now on I wanted to be warned immediately in case this kind of thing happens again.

So I built a very simple water sensor and connected it to the Raspberry Pi.  In fact “very simple” is an understatement: the sensor is literally two wires, close together, which will short out if they come into contact with water.  I used some very cheap speaker wire, and routed it from the RPi to the drain where the water could potentially spill out.

On the Raspberry Pi, one wire is connected to ground, and the other is connected to a GPIO pin with a pull-up resistor enabled. This means that if the wires are shorted out, the GPIO input will go from HIGH to LOW, and this will be an indication that water is present. The sensor is being monitored by the same Python script that monitors and uploads the camera footage, and will automatically send me an email when the sensor is triggered.
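A minimal sketch of that HIGH-to-LOW check, using the RPi.GPIO library, could look like this. The pin number is a hypothetical choice, and requiring several consecutive wet readings is an extra precaution I'm adding for illustration, not something described above.

```python
import time

SENSOR_PIN = 17  # BCM numbering; hypothetical choice of pin

def make_gpio_reader(pin=SENSOR_PIN):
    """Configure the pin as an input with the internal pull-up enabled and
    return a function that reports True while the wires are shorted."""
    import RPi.GPIO as GPIO  # only available on the Raspberry Pi itself
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(pin, GPIO.IN, pull_up_down=GPIO.PUD_UP)
    # With the pull-up, the input idles HIGH and reads LOW when water
    # bridges the sensor wire to ground.
    return lambda: GPIO.input(pin) == GPIO.LOW

def water_detected(is_wet, samples=5, interval=0.2, sleep=time.sleep):
    """Require several consecutive wet readings before raising an alarm,
    so a single noisy sample does not trigger an email."""
    for i in range(samples):
        if not is_wet():
            return False
        if i < samples - 1:
            sleep(interval)
    return True
```

On the Pi this would be used as water_detected(make_gpio_reader()). Keeping the GPIO access behind a plain function means the detection logic can be exercised on any machine with a fake reader.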

For good measure, I installed a second water sensor next to our hot water tank, since these have also been known to fail and leak at the most inconvenient times.

And that’s all for now. The Raspberry Pi still has plenty of GPIO pins left over, so I’ll be able to expand it with additional sensors and other devices in the future.

Notes

Here are just a few random notes related to getting this kind of system up and running:

Enable shared NFS folder(s)

Install the necessary NFS components:
$ sudo apt-get install nfs-kernel-server portmap nfs-common
Add one or more lines to the file /etc/exports:
/folder/path_to_share *(rw,all_squash,insecure,async,no_subtree_check,anonuid=1000,anongid=1000)
And then run the following:
$ sudo exportfs -ra
For good measure, restart the NFS service:
$ sudo /etc/init.d/nfs-kernel-server restart

Run script(s) on startup

  • Add line(s) to /etc/rc.local
  • If it’s a long-running script, or continuously-running, then make sure to put an ampersand at the end of the line, so that the boot process can continue.

Automatically mount USB drive(s) on boot

When the Raspberry Pi is configured to boot into the desktop GUI, it will auto-mount USB drives, mounting them into the /media/pi directory, with the mount points named after the volume label of the drive. However, if the Pi is configured to boot into the console only (not desktop), then it will not auto-mount USB drives, and they will need to be added to /etc/fstab:
/dev/sda1 /media/mount_path vfat defaults,auto,users,rw,nofail,umask=000 0 0

(The umask=000 option clears the permission mask on the FAT file system, so every user gets full read and write access to the drive.)

Set the network interface to a static IP

Edit the file /etc/dhcpcd.conf. The file contains commented-out example lines for setting a static IP, gateway, DNS server, etc.

And lastly, here are a couple of Gists for sending an email from within Python, and uploading files to a specific folder on Google Drive.
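For reference, sending an alert email with only the Python standard library can be sketched as follows. This is not the Gist mentioned above; the function names, subject line and message text are illustrative, and the SMTP host, port and credentials are placeholders the caller has to supply.

```python
import smtplib
from email.message import EmailMessage

def build_alert(sender, recipient, sensor_name):
    """Compose a plain-text alert for a triggered sensor."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Water detected: {sensor_name}"
    msg.set_content(f"The {sensor_name} water sensor was triggered.")
    return msg

def send_alert(msg, host, port, username, password):
    """Send the message over an SSL connection to the given SMTP server."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(username, password)
        server.send_message(msg)
```

Splitting the composing from the sending makes it easy to check the message contents without touching the network.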