Debugging Wrap Up

Two more trustworthy rules for debugging are to keep an audit trail and to only declare a bug fixed if you fixed it. The first rule is self explanatory. Just make sure you use a consistent format for all your logging. That will help later when you try to filter all the audit data.

If you think you have a fix, remove it and see if the problem comes back. If not, then you probably have not fixed the problem. You have to be sure of yourself here.

Here are some last ideas. Try and get a diagram of the whole system you are debugging. Feel free to ask for help. When you do get help, you should request that assistants just report the symptoms. You don't need theories. Good luck with your debugging. I have made a career out of this activity myself.

Details are King

Here are more tips to debug a problem. Keep testing older versions of the software until the problem goes away. Now you have narrowed down the build when the problem was first introduced. It should be easier to determine what changed at that time.

There are many instances where the smallest detail is the key to the problem. Let me tell you a story. My customer's acceptance team keep making the application crash. They said it happened at random. That never happened to me or any internal testers on my project. So I enlisted the help of the people experiencing the trouble.

I asked the testers to notice everything that happened before and after the problem. What was the weather like outside? Were the lights on? Did you sit close to the desk? Was there anybody sitting next to you? What time did it happen? You know. I needed to know all the facts, even if they seemed insignificant. Guess what? This helped us crack the case.

One tester found that when she swiveled her chair to the right, the application sometimes broke. Also when she dragged her spiral notebook from left to right the problem always happened. This keep attention to details helped her identify the cause of the problem. Each time she was doing these movements, she was pressing the space key on the keyboard. Sure enough that was the cause of the application aborting. Amazing.

One Thing at a Time

Here is a guideline for debugging. Only change one variable at a time. I am not talking about programming variables. I am talking about one factor in the state of the system.

This can be taken a bit further as well. Make a change and check the result. If the bug was not fixed, undo the change and make exactly one more change. Repeat. The converse is to fix just one thing at a time as well. Then you can go on to other bugs.

Now let's talk a little about tools. Don't throw them away even if they seem like one time tools. You should actually build the tools into the initial design of the system. Make it easy to inspect the internal state of the system during runs.

Next time I will talk about audit trails and logging in general.

Say No To Guessing

Here is a pseudo rule for debugging: Don't make guesses. What you should do instead is to compare runs that exhibit the problem with those that do not. That way you can narrow down the exact effects which contribute to the problem. This leads us to the Divide and Conquer rule I wrote about previously.

Theories about what the problem might be are not worth much. In fact, the can be counter productive. You will most likely waste time tyring to check whether your theory is correct or not. You end up right where you started, except it is much later in the day.

Now this is not to say that you should never harbor a guess at the source of a bug. You should just limit the amount of guesses you make. Once you get good at debugging, you can use few guesses to get to the source of the problem.

Stay tuned for a commentary on the use of tools for debugging. I will leave you with another debugging gem. Change but one variable each time you run a test. This will help focus your cause and effect research.

Hints for Debugging

Here are some more hints and tips to debug problems in systems. Let's start with some more general rules for debugging:
  • Duplicate the problem
  • Do not think too much
  • Narrow down and focus
Ok let's now get back to the tips and tricks. You got to really know the tools you use. That includes knowing the limitations of the tool set.

Don't guess. Act like a detective and look at clues. Once you think you have a solution, run the system with the same conditions that exhibited the problem and demonstrate that the specific problem was indeed fixed.

Record each step required to make the problem occur. Do not fake the problem. Actually make it happen like it does in real use. You might need to employ automation to speed up the process of creating the bug. That is fine.

I will continue next time with some more general rules for debugging. You know I will come back to tools. And I will examine cause and effect.

Understanding the System

You must understand the system before you can begin debugging it. Here is a novel thought. Read the instructions. This means you should go over the specification for the system. You can also read source code comments. Review the design documentation to gain understand.

Now you must understand one other thing which reading the documentation. You need to take what you read with a grain of salt. Don't trust everything you read.

On the other hand, you really need to know the contents of the user manual. Read it many times until you know it well. You must have a firm grasp on the fundamentals of the system. When I come back, we will get to the second rules of debugging - Make It Fail.

Debugging

Halfway though reading a book on debugging, I realized that I had read it before. It was a good book. So I completed it for a second time. There were a lot of good stories and examples.

The book was written in 2002. However the rules it presented were timeless. The author was actually a hardware/electronics dude. But he nailed the software debugging process down well.

The goal of the advice in the book is to help you find out what is wrong with a system quickly. If you fix your commercial software first, you get a competitive advantage. That translates to more money for your company. So you had better listen up.

When you figure problems out fast, you get to go home quicker too. I know this all too well. If I do not solve problems, I am there to midnight, and I have to return on Saturday morning.

It is difficult to find debuggers who follow the rules in this book. That means if you do learn and use these rules, you will be valuable. Let's first define some terms. Troubleshooting is trying to find out what is broken. Debugging is finding out why. This book is about debugging.

Let's get on with it. The first rule is "understand the system". There are eight more rules. I will be sharing some of these in future posts. Until them, happy hunting.