My BASIC beginnings

Edsger Dijkstra was absolutely right when he said, “Programming in BASIC causes brain damage.” (Lacking a source for that quote, I found an even better quote that has a source: “It is practically impossible to teach good programming to students that have had a prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration.”)

When I reflect on my (not-too-distant) programming infancy, I often think about what I might have done differently, like what technologies I could have learned, which ones I should have avoided, or what algorithms I could have used in old software I had written, and so on.

But there’s one thing that really stands out more than anything else: starting out with BASIC was the worst thing I could have done.

I’m not sure how useful it is to talk about this now, since BASIC has pretty much gone extinct, and rightly so, but it feels good to get it off my chest anyway.

My parents and I immigrated to the U.S. in 1991, when I was 10 years old, and I had never laid eyes on a personal computer before that time. During my family’s first few months in the States, we acquired an 80286 IBM PC, which was probably donated to us by an acquaintance (since the 80386 architecture was already prevalent at that time, and the 80486 was the cutting edge).

I also happened to come across a book called BASIC and the Personal Computer by Dwyer and Critchfield. I was instantly fascinated by the prospect of programming the computer, and the boundless possibilities that computer software could provide.

However, I made a critical error that would hinder my programming development for at least a year: I reached the erroneous conclusion that BASIC was the only language there was!

I had no idea that BASIC was an interpreted language, or indeed what the difference was between an interpreted and a compiled language. I thought that all software (including the games I played, Windows 3.0, WordPerfect, etc.) was written in BASIC! This unfortunately led me down an ill-fated path of self-study, which took an even stronger effort to undo.

I learned all there was to know about BASIC programming in a few months (starting with GW-BASIC, then moving to QuickBASIC), and then I started to notice certain things about the software I was trying to write.

No matter how I tried, I couldn’t make my programs as fast as the other software I used, and I couldn’t understand why. Also, the graphics routines in BASIC were virtually nonexistent, so I was baffled by how anyone could write games with elaborate graphics, scrolling, and responsive controls. I was eager to start developing games that would rival my favorite games at the time, like Prince of Persia, Crystal Caves, and Commander Keen. But the graphics and responsiveness of those games were orders of magnitude beyond what I could achieve with my BASIC programs.

With all this frustration on my mind, I was determined to find the reason why my programs were so limited. I soon found a solution, but once again it was the wrong one! I stumbled upon some example BASIC code that used assembly language subroutines (encoded as DATA lines in the BASIC program), as well as INTERRUPT routines that took advantage of the underlying DOS and BIOS services.

This led me down the path of learning Intel 286 assembly language (another few months of studying), and encoding it into my BASIC programs! This solved the issue of responsiveness, but there was still the issue of graphics, or lack thereof. Fortunately, I found a book at the local public library about VGA graphics programming. Even more fortunately, the book contained sample source code, using a language they called C….

And my eyes were open!

It hit me like a freight train. I almost burst out laughing right there at the library. I realized that I had been learning the wrong things all along! (Of course learning assembly language was sort of right, but my application of it was still misguided.)

Learning C and C++ from that point forward wasn’t particularly difficult, but I still feel like it would have been a lot easier if my mind hadn’t been polluted by the programming style and structure that I learned from BASIC. It makes me wonder how things might have been different, had I accidentally picked up a book on C++ instead of a book on BASIC during my earliest exploits with computers.

In all fairness, I’m sure I learned some rudimentary programming principles from BASIC, but I’m not sure that this redeems BASIC as a learning tool. There were just too many moments where, while learning C++, I thought, “So that’s the way it really works!” And I’m sure it’s also my fault for trying to learn everything on my own, instead of seeking guidance from someone else who might have told me, “You’re doing it wrong.”

All of this makes me wonder what programming language would be appropriate for teaching today’s generation of young programmers. Based on my comically tragic experience with BASIC, my gut instinct is to advise aspiring developers to stay away from interpreted languages (such as Python), or at the very least understand that the interpreted language they’re learning is useless for developing actual software. I don’t think there’s any harm in diving right into a compiled language (such as C++), and learning how it hugs the underlying hardware in a way that no interpreted language ever could.

That being said, I don’t wish any of this to reflect negatively on Dwyer and Critchfield’s BASIC and the Personal Computer. It’s a solid book, and I still own the original copy. There’s no denying that it was one of the first books that got me interested in programming, and for that I’m thankful. However, sometimes I regret that I didn’t find Stroustrup’s The C++ Programming Language at the same garage sale where I found BASIC and the Personal Computer. Alternatively, perhaps Dwyer and Critchfield could have included the following disclaimer in large bold letters: This is not the way actual software is written! But perhaps it’s time to let it go. I didn’t turn out so bad, right?

DiskDigger now available for Android!

I’m happy to announce that DiskDigger is now available for Android devices (phones and tablets running rooted Android 2.2 and above)! You can get the app by searching for it on the Google Play Store from your Android device. Please note that the app only works on rooted devices.

At the moment, the app is in an early beta stage, meaning that it’s not yet as powerful or complete as the original DiskDigger for Windows, and is still in active development. Nevertheless, it uses the same powerful carving techniques to recover .JPG and .PNG images (the only file types supported so far; more will follow) from your device’s memory card or internal memory.

So, if you’ve taken a photo with your phone or tablet and then deleted it, or even reformatted your memory card, DiskDigger can recover it!

I’ve written a quick guide that has more information and a brief introduction to using the app! If you have questions, comments, or suggestions about the app, don’t hesitate to share them!

Update: thanks to Lifehacker for writing a nice article!

Correctly naming your photos

My strategy for organizing my digital photo collection is decidedly minimal. I have a single folder on my computer called “Pictures,” with subfolders that correspond to every year (2011, 2010, …) since the year I was born. Some of the years contain subfolders that correspond to noteworthy trips that I’ve taken.

This method makes it extremely easy to back up my entire photo collection by dragging the “Pictures” folder to a different drive. It also makes it easy to reference and review the photos in rough chronological order. This is why I’ve never understood the purpose of third-party “photo management” software, since most such software inevitably reorganizes the underlying directories in its own crazy way, or builds a proprietary index of photos that takes the user away from the actual directory structure. If you’re aware of the organization of your photos on your disk, then any additional management software becomes superfluous.

At any rate, there is one slight issue with this style of organizing photos: all of the various sources of photos (different cameras, scanners, cell phones, etc.) give different file names to the photos! So, when all the photos are combined into a single directory, they often conflict with each other, or at the very least become a disjointed mess. For example, the file names can be in the form DSC_xxxx, IMG_xxxx, or something similar, which isn’t very meaningful. Photos taken with cell phones are a little better; their names are usually composed of the date and time the photo was taken, but the naming format is still not uniform across cell phone manufacturers.

Thus, the optimal naming scheme for photos would be based on the date/time, in a way that is common to all sources of photos. This would organize the photos in natural chronological order. The vast majority of cameras and cell phones encode the date and time into the EXIF block of each photo. If only there were a utility that would read each photo and rename it based on the date/time stored within it. Well, now there is:

Download it now! (Or browse the source code on GitHub)

This is a very minimal utility that takes a folder full of photos and renames each one based on its date/time EXIF tag. As long as you set the time on your camera(s) correctly, this will ensure that all your photos will be named in a natural and uniform way.

The tool lets you select the “pattern” of the date and time that you’d like to apply as the file name. The default pattern will give you file names similar to “20111028201345.jpg” (for a photo taken on Oct 28 2011, 20:13:45), which means that you’ll be able to sort the photos chronologically just by sorting them by name!
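
The core of such a renaming scheme is simple to sketch. Here’s a rough Python illustration (my own, not the utility’s actual code): EXIF stores timestamps as “YYYY:MM:DD HH:MM:SS”, so the renamer only needs to reparse that string through the chosen file-name pattern. Extracting the timestamp from the photo itself would require an EXIF-aware library (for example, reading the DateTimeOriginal tag), which is assumed here and left to a comment.

```python
from datetime import datetime

def exif_to_filename(exif_timestamp, pattern="%Y%m%d%H%M%S", ext=".jpg"):
    """Turn an EXIF timestamp string into a chronologically sortable name.

    EXIF stores timestamps as "YYYY:MM:DD HH:MM:SS" (note the colons
    in the date part), so it must be parsed with an explicit format.
    """
    dt = datetime.strptime(exif_timestamp, "%Y:%m:%d %H:%M:%S")
    return dt.strftime(pattern) + ext

# A real tool would first read the timestamp out of each photo with an
# EXIF library (e.g. the DateTimeOriginal tag, 0x9003), then rename the
# file to exif_to_filename(timestamp).
```

With the default pattern, `exif_to_filename("2011:10:28 20:13:45")` returns `"20111028201345.jpg"`, matching the example above; sorting by name then sorts chronologically.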

The FujiFilm .MPO 3D photo format

A few weeks ago my dad, in his love for electronic gadgetry, purchased a FujiFilm FinePix REAL 3D camera. The concept is pretty simple: it’s basically two cameras in one, with the two sensors spaced as far apart as an average pair of human eyes. The coolest thing about the camera is its LCD display, which achieves autostereoscopy by using a lenticular lens (kind of like those novelty postcards that change from one picture to another when you look at them from different angles), so if it’s held at the right angle and distance from the eyes, the picture on the LCD display actually appears 3-dimensional without special glasses!

Anyway, I immediately started wondering about the file format that the camera uses to record its images (as well as movies, which it also records in 3D). In the case of videos, the camera actually uses the well-known AVI container format, with two synchronized streams of video (one for each eye). In the case of still photos, however, the camera saves files with a .MPO extension, which stands for Multiple Picture Object.

I was expecting a complex new image specification to reverse-engineer, but it turned out to be much simpler than that. A .MPO file is basically two JPG files, one after another, separated only by a few padding zeros (presumably to align the next image on a boundary of 256 bytes?). Technically, if you “open” one of these files in an image editing application, you would actually see the “first” image, because the MPO file looks identical to a regular JPG file at the beginning.
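
That layout is easy to demonstrate. Here’s a naive Python sketch (my own illustration, not the viewer’s actual code) that splits an MPO into its two JPEGs: find the first end-of-image marker (FF D9), skip any zero padding, and expect a second start-of-image marker (FF D8). One caveat: FF D9 can also occur inside the entropy-coded data or an embedded thumbnail, so a robust parser would walk the JPEG marker segments rather than scan for bytes.

```python
def split_mpo(data):
    """Naively split an .MPO byte string into its two JPEG images.

    Assumes the first FF D9 (EOI marker) really ends the first image;
    a robust parser would walk the JPEG marker segments instead, since
    FF D9 can also appear inside compressed data or a thumbnail.
    """
    eoi = data.find(b"\xff\xd9")
    if eoi < 0:
        raise ValueError("no EOI marker found")
    first = data[: eoi + 2]
    pos = eoi + 2
    while pos < len(data) and data[pos] == 0:  # skip zero padding
        pos += 1
    if data[pos : pos + 2] != b"\xff\xd8":     # SOI of second image
        raise ValueError("no second JPEG found after padding")
    return first, data[pos:]

# Synthetic demonstration (marker bytes only, not a real photo):
fake = b"\xff\xd8AAAA\xff\xd9" + b"\x00" * 6 + b"\xff\xd8BBBB\xff\xd9"
left, right = split_mpo(fake)
```

This also explains why a normal image viewer shows only the first frame: everything up to the first EOI is a perfectly valid JPEG on its own.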

I proceeded to whip up a quick application in C# to view these files (that is, view both of the images in each file). This quick program also has the following features:

  • It has a “stereo” mode where it displays both images side by side. Using this feature you can achieve a 3D effect by looking at both images as either a cross-eyed stereogram (cross your eyes until the two images converge, and combine into one) or a relaxed-eye stereogram. You might have to strain your eyes a bit to focus on the combined image, but the effect truly appears 3-dimensional.
  • In “single” mode, the program allows you to automatically “cycle” between the two images (a wiggle-gram, if you will), which creates a cheap jittery pseudo-3D effect (see screen shots below).
  • Also in “single” mode, the program lets you save each of the frames as an individual JPEG file by right-clicking on the picture.
So, if you want a quick and not-so-dirty way of viewing your MPO files, download the program and let me know what you think! (Or browse the source code on GitHub)
Here’s a screenshot of the program in “stereo” mode:

And a screenshot of the program in “cycle” mode:

If you like, you can download the original .MPO file shown in the screenshots above.

Now for a bit of a more technical discussion…. Clearly it would be a great benefit to add support for the .MPO format to DiskDigger, the best file carving application in town.

However, from the perspective of a file carver, how would one differentiate between a .MPO file and a standard .JPG file, since they both have the same header? As it stands, DiskDigger will recover only the first frame of a .MPO file, since it believes it has found a .JPG file.

After the standard JPG header, the MPO file continues with a collection of TIFF/EXIF tags that contain meta-information about the image, but none of these tags seem to give a clue that this is one of two images in a stereoscopic picture (at least not the tags within the first sector’s worth of data in the file, which is what we’re really interested in).

One of the EXIF tags gives the model name of the camera, which identifies it as “FinePix REAL 3D W3.” Perhaps the presence of “3D” in the model name could be used to assume that this must be a .MPO file. But I’d rather not rely on the model name, for obvious reasons, even though the FinePix is currently the only model that actually uses this format (to my knowledge).

The other option would be to change the algorithm for JPG carving, so that every time we find a JPG file, we would seek to the end of the JPG image, and check if there’s another JPG image immediately following this one. But then, what if the second JPG image is actually a separate JPG file, and not part of a MPO collection?

For the time being, DiskDigger will in fact use the model name of the camera to decide if it’s a .MPO file or just a regular .JPG file. The caveats of doing this would be:

  • It won’t identify .MPO files created by different manufacturers.
  • It might give false positive results for .JPG images shot with the camera in 2D mode.

As always, you can download DiskDigger for all your data recovery needs. And if anyone has any better ideas of how to identify .MPO files solely based on TIFF/EXIF tags, I’d love to hear them!

Update: DiskDigger now fully supports recovering .MPO files, based on deep processing of MP tags encoded in the file!

Thumbnail cache in Windows 7 / Vista – a rumination

(Note: read my newer post on this subject!)

Today I was thinking about the security implications of the thumbnail caching systems on most PCs out there today. What I mean is this: whenever you use Windows Explorer to browse a directory that contains photos or other images, with the “thumbnail view” feature enabled, you see a thumbnail of each of the images. By default, Windows caches these thumbnails, so that it doesn’t have to regenerate them the next time you browse the same folder.

This has several implications in terms of privacy and security, since it means that a copy of each image (albeit at a lower resolution) is made elsewhere on the computer, basically without the user’s knowledge. This is good news from a forensic examiner’s point of view, since the thumbnail cache can contain thumbnails of images that have long been deleted. However, from the user’s point of view, it can present a privacy/security issue, especially if the images in question are confidential or sensitive.

Windows XP caches thumbnails in the same folder as the original images. It creates a hidden file called “Thumbs.db” and stores all the thumbnails for the current folder in that file. So, even if the original images were deleted from the folder, the Thumbs.db file will still contain thumbnails that can be viewed at a later time.

However, in Windows 7 and Windows Vista, this is no longer the case. The thumbnails are now stored in a single centralized cache under the user’s profile directory: C:\Users\[username]\AppData\Local\Microsoft\Windows\Explorer\thumbcache*.db

The above directory contains multiple thumbnail cache files, each of which corresponds to a certain resolution of thumbnails: thumbcache_32.db, thumbcache_96.db, thumbcache_256.db, and thumbcache_1024.db.
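
Locating these caches programmatically is straightforward. A minimal Python sketch (my own illustration, assuming the standard profile layout described above; the function name is mine, not part of any tool):

```python
import glob
import os

def thumbcache_files(profile_dir):
    """List Explorer thumbnail cache files under a user profile.

    profile_dir is a path like C:\\Users\\[username]; the caches live
    in AppData\\Local\\Microsoft\\Windows\\Explorer\\thumbcache_*.db.
    Returns a sorted list of matching paths (empty if none exist).
    """
    pattern = os.path.join(
        profile_dir, "AppData", "Local", "Microsoft",
        "Windows", "Explorer", "thumbcache_*.db",
    )
    return sorted(glob.glob(pattern))
```

On your own machine you could pass `os.path.expanduser("~")` as the profile directory; a forensic examiner would instead point it at each profile folder on a mounted image.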

So then, wouldn’t you like to find out what thumbnails your computer has cached in these files? Well, now you can! I’ve whipped up a small utility for the sole purpose of viewing the contents of these thumbnail caches:

This is probably not the first utility that does this, but it’s definitely the simplest. It automatically detects the thumbnail caches present on your computer, and lets you view all the thumbnail images in each cache.

If you want to disable the thumbnail cache in Windows 7 or Vista, you can find instructions here.