2015-10-21

The Mother of All Bugs

Summary: This is a story about the most elusive and sinister software bug I ever came across in my decades-long career as a programmer.

Early in my career I worked for a company that was developing a hand-held computer for the Home Health Care market. It was called InfoTouch™. The job involved daily interaction with the guys in the hardware department, which was actually quite a joy, despite the incessant "It's a software problem!" -- "No, it's a hardware problem!" arguments. Those arguments were made by well-meaning engineers from both camps, who were all in search of the truth, without egos, vested interests, or illusions of infallibility. That is, in true engineering tradition.

During the development of the InfoTouch, for more than a year, possibly two, the device would randomly die for no apparent reason. Sometimes it would die once a day, other times weeks would pass without a problem. On some rare occasions it would die while someone was using it, but more often it would die while sleeping, or while charging. So, the problem seemed to be completely random, and no matter how hard we tried we could not find a sequence of steps that would reproduce it.