Title: Bug Hunting
Author: Martin Moene
Date: 09 November 2015 08:54:08 +00:00
Summary: Pete Goodliffe looks for software faults.
Body:
If debugging is the process of removing software bugs, then programming must be the process of putting them in.
~ Edsger Dijkstra
It’s open season; a season that lasts all year round. There are no permits required, no restrictions levied. Grab yourself a shotgun and head out into the open software fields to root out those pesky varmints, the elusive bugs, and squash them, dead.
OK, reality is not as saccharine as that. But sometimes you end up working on code in which you swear the bugs are multiplying and ganging up on you. A shotgun is the only response.
The story is an old one, and it goes like this: Programmers write code. Programmers aren’t perfect. The programmer’s code isn’t perfect. It therefore doesn’t work perfectly the first time. So we have bugs.
If we bred better programmers, we’d clearly breed better bugs.
Some bugs are simple mistakes that are obvious to spot and easy to fix. When we encounter these, we are lucky.
The majority of bugs – the ones we invest hours of effort tracking down, losing our follicles and/or hair pigment in the search – are the nasty, subtle issues. These are the odd, surprising interactions; the unexpected consequences of our algorithms; the seemingly non-deterministic behaviour of software that looked so very simple. It can only have been infected by gremlins.
This isn’t a problem limited to newbie programmers who don’t know any better. Experts are just as prone. The pioneers of our craft suffered; the eminent computer scientist Maurice Wilkes [1] wrote:
I well remember [...] on one of my journeys between the EDSAC room and the punching equipment that ‘hesitating at the angles of stairs’ the realisation came over me with full force that a good part of the remainder of my life was going to be spent in finding errors in my own programs.
So face it. You’ll be doing a lot of debugging. You’d better get used to it. And you’d better get good at it. (At least you can console yourself that you’ll have plenty of chances to practise.)
An economic concern
How much time do you think is spent debugging? Add up the effort of all of the programmers in every country around the world. Go on, guess.
A staggering $312 billion per year is spent on the wage bills for programmers debugging their software. To put that in perspective, that’s two times all Eurozone bailouts since 2008! This huge, but realistic, figure comes from research carried out by Cambridge University’s Judge Business School. [2]
You have a responsibility to fix bugs faster: to save the global economy. The state of the world is in your hands.
It’s not just the wage bill, though. Consider all the other implications of buggy software: shipping delays, cancelled projects, the reputation damage from unreliable software, and the cost of bugs fixed in shipping software.
An ounce of prevention
It would be remiss of an article about debugging to not stress how much better it is to actively prevent bugs manifesting in the first place, rather than attempt a post-bug fix. An ounce of prevention is worth a pound of cure. If the cost of debugging is astronomical, we should primarily aim to mitigate this by not creating bugs in the first place.
This, in a classic editorial sleight of hand, is material for a different article, and so we won’t investigate the theme exhaustively here. Do remember how important it is to expect the unexpected and to always work with your brain fully engaged!
Suffice to say, we should always employ sound engineering techniques that minimise the likelihood of unpleasant surprises. Thoughtful design, code review, pair programming, and a considered test strategy (including TDD practices and fully automated unit test suites) are all of the utmost importance. Techniques like assertions, defensive programming, and code coverage tools will all help minimise the likelihood of errors sneaking past.
We all know these mantras. Don’t we? But how diligent are we in employing such tactics?
Avoid injecting bugs into your code by employing sound engineering practices. Don’t expect quickly hacked-out code to be of high quality.
The best bug-avoidance advice is to not write incredibly ‘clever’ (which often equates to complex) code. Brian Kernighan states: “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” Martin Fowler reminds us: “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”
Bug hunting
Being realistic, no matter how sound your code-writing regimen, some of those pernicious bugs will always manage to squeeze through the defences. Donald Knuth once wrote: “Beware of bugs in the above code; I have only proved it correct, not tried it.”
The programmer will always be required to don their hunting cap and an anti-bug shotgun.
How should we go about finding and eliminating bugs? This can be a Herculean task, akin to finding a needle in a haystack. Or, more accurately, a needle in a needle stack.
Finding and fixing a bug is like solving a logic puzzle. Generally the problem isn’t too hard when approached methodically; the majority of bugs are easily found and fixed in minutes. However, some are nasty and take longer. Those hard bugs are few in number, but given their nature, that’s where we will spend most of our time.
Two factors usually determine how hard a bug is to fix:
- How reproducible it is.
- The time between the cause of the bug entering the code (the ‘software fault’ itself: the bad line of code, or faulty integration assumption) and you actually noticing it.
When a bug is both hard to reproduce and slow to surface, it’s almost impossible to track down without sharp tools and a keen intellect. There are a number of practical techniques and strategies we can employ to solve the puzzle and locate the fault.
The first, and most important, thing is to methodically investigate and characterise the bug. Give yourself the best raw material to work with:
- Reduce it to the simplest set of reproduction steps possible. This is vital. Sift out all the extraneous fluff that isn’t contributing to the problem, and only serves to distract.
- Ensure that you are focusing on a single problem. It can be very easy to get into a tangle when you don’t realise you’re conflating two separate – but related – faults into one.
- Determine how repeatable the problem is. How frequently do your repro steps demonstrate the problem? Is it reliant on a simple series of actions? Does it depend on software configuration or the type of machine you’re running on? Do peripheral devices attached make any difference? These are all crucial data points in the investigation work that is to come.
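One way to put a number on "how repeatable" is to replay the reproduction steps many times and count the failures. The sketch below is only an illustration: failure_rate and make_flaky are hypothetical names, and make_flaky is a stand-in for your real repro steps that fails on a fixed schedule so the result is predictable.

```python
def failure_rate(repro, runs=200):
    """Run the reproduction steps `runs` times; return the fraction that fail."""
    failures = 0
    for _ in range(runs):
        try:
            repro()
        except Exception:
            # Any exception counts as "the problem was demonstrated".
            failures += 1
    return failures / runs


def make_flaky(period):
    """Hypothetical stand-in for real repro steps: fails on every `period`-th call."""
    calls = 0

    def repro():
        nonlocal calls
        calls += 1
        if calls % period == 0:
            raise RuntimeError("fault observed")

    return repro
```

Calling failure_rate(make_flaky(4)) reports 0.25: the fault shows up in a quarter of the runs. Repeating the measurement on a different machine or configuration gives you the data points described above.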
In reality, when you’ve constructed a single set of reproduction steps, you really have won most of the battle.
So let’s look at some of the most useful debugging strategies...
Lay traps
You have errant behaviour. You know a point when the system seems correct; maybe it’s at start-up, but hopefully a lot later through the reproduction steps. You can get it to a point where its state is invalid. Find places in the code path between these two points, and set traps to catch the fault.
Add assertions or tests to verify the system invariants – the facts that must hold for the state to be correct.
Add diagnostic printouts to see the state of the code so you can work out what’s going on.
As you do this, you’ll gain a greater understanding of the code, reasoning more about its structure, and will likely add many more assertions to the mix to prove your assumptions hold. Some of these will be genuine assertions about invariant conditions in the code, others will be assertions relevant to this particular run. Both are valid tools to help you pinpoint the bug. Eventually a trap will snap, and you’ll have the bug cornered.
Assertions and logging (even the humble printf) are potent debugging tools. Use them often.
Diagnostic logs and assertions may be valid to leave in the code after you’ve found and fixed the problem. But be careful you don’t litter the code with useless logging that hides what’s really going on, making unnecessary debug noise.
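As a minimal sketch of a trap, assume a hypothetical Account class whose invariant is that the balance never goes negative. Everything here (the class, the invariant, the repro steps) is invented for illustration; the point is the shape: log the state at each checkpoint, and assert the invariant so that the first invalid state stops the run.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("trap")


class Account:
    """Hypothetical example; the invariant is that balance never goes negative."""

    def __init__(self, balance):
        self.balance = balance
        self._check_invariant("after __init__")

    def _check_invariant(self, where):
        # Diagnostic printout: record the state at each checkpoint.
        log.debug("%s: balance=%r", where, self.balance)
        # The trap: fail loudly at the first point the state is invalid.
        assert self.balance >= 0, f"invariant broken {where}: balance={self.balance}"

    def withdraw(self, amount):
        self.balance -= amount  # the fault: no check for insufficient funds
        self._check_invariant("after withdraw")


def run_repro():
    """Replay the reproduction steps and report where the trap snapped."""
    account = Account(100)
    try:
        account.withdraw(30)
        account.withdraw(80)  # takes the balance to -10
    except AssertionError as trap:
        return str(trap)
    return None
```

Here the trap snaps on the second withdrawal, and the assertion message names the checkpoint. Bear in mind that Python drops assert statements when run with the -O option, so traps like this belong in debug runs and tests.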
Learn to binary chop
Aim for a binary chop strategy, to focus in on bugs as quickly as possible.
Rather than single-stepping through code paths, work out the start of a chain of events, and the end. Then partition the problem space into two, and work out if the middle point is good or bad. Based on this information, you’ve narrowed the problem space to something half the size. Repeat this a few times, and you’ll soon have homed in on the problem.
This is a very powerful approach – allowing you to get to a solution in O(log n) time, rather than O(n). For a thousand candidate points, that is around ten checks instead of, potentially, a thousand.
Binary chop problem spaces to get results faster.
Employ this technique with trap laying. Or with the other techniques described next.
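The chop itself can be sketched in a few lines. This is an illustration under two assumptions: the points between the known-good start and the known-bad end can be numbered, and once the state goes bad it stays bad. The names binary_chop and is_good are hypothetical; is_good stands for whatever check (a trap, a test, a manual inspection) tells you whether a given point is still healthy.

```python
def binary_chop(good, bad, is_good):
    """Find the first bad point in the range (good, bad].

    `good` is a point known to be fine; `bad` is a later point known to be broken.
    `is_good(i)` inspects point i. Assumes that once things go bad they stay bad.
    Returns (first_bad_point, number_of_probes).
    """
    probes = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        probes += 1
        if is_good(middle):
            good = middle  # the fault lies in the later half
        else:
            bad = middle   # the fault lies in the earlier half
    return bad, probes


# Example: 1000 points, with the fault (unknown to the searcher) introduced at 617.
culprit, probes = binary_chop(0, 1000, lambda point: point < 617)
```

In this example culprit comes out as 617 after no more than ten probes, where a linear walk from the start would have needed hundreds.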
Employ software archaeology
Software archaeology describes the art of mining through the historical records in your version control system. This can provide an excellent route into the problem; it’s often a surprisingly simple way to hunt a bug.
Determine a point in the near past of the codebase when this bug didn’t exist. Armed with your reproducible test case, step forward in time to determine which code changeset caused the breakage. Again, a binary chop strategy is the best bet here. (The git bisect tool automates this binary chop for you, and is worth keeping in your toolbox if you’re a Git user.)
Once you find the breaking code change, the cause of the fault is usually obvious, and the fix is self-evident. (This is another compelling reason to make a series of small, frequent, atomic check-ins, rather than massive commits covering a range of things at once.)
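If your reproduction case can be scripted, git bisect run will drive the whole search for you. It runs your script at each revision it checks out and reads the exit status: 0 means good, 125 means "cannot test this revision, skip it", and any other code from 1 to 127 means bad. The sketch below shows one possible shape for such a script; verdict and repro are hypothetical names, and the body of repro is a placeholder for a check against your own code.

```python
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def verdict(check):
    """Map a reproduction check onto the exit codes `git bisect run` expects."""
    try:
        ok = check()
    except ImportError:
        return SKIP  # this revision cannot even be loaded, so it cannot be judged
    except Exception:
        return BAD   # design choice: a crash in the repro counts as the bug
    return GOOD if ok else BAD


def repro():
    """Placeholder check; in real use, import and exercise the project code here."""
    return sorted([3, 1, 2]) == [1, 2, 3]


def main():
    return verdict(repro)


# When saved as a bisect script, finish the file with: sys.exit(main())
```

With the script saved as, say, repro.py (a name chosen here for illustration), a session would run git bisect start, git bisect bad, git bisect good followed by a known-good revision, and then git bisect run python repro.py.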
Test, test, test
As you develop your software, invest time to write a suite of unit tests. This will not only help shape how you develop and verify the code you’ve initially written; it also acts as a great early warning device for changes you make later. Much like the miner’s canary, the test fails long before the problem becomes complex to find and expensive to fix.
These tests can also act as great points from which to begin debugging sessions. A simple, reproducible unit test case is a far simpler scaffold to debug than a fully running program that has to spin up and have a series of manual actions run to reproduce the fault. For this reason, it’s advisable to write a unit test to demonstrate a bug, rather than start to hunt it from a running ‘full system’.
Once you have a suite of tests, consider employing a code coverage tool to inspect how much of your code is actually covered by the tests. You may be surprised. A simple rule of thumb is: if your test suite does not exercise it, then you can’t believe it works. Even if it looks like it’s OK now, without a test harness it is very likely to get broken later.
Untested code is a breeding ground for bugs. Tests are your bleach.
When you finally determine the cause of a bug, consider writing a simple test that clearly illustrates the problem, and add it to the test suite before you really fix the code. This takes genuine discipline, as once you find the code culprit, you’ll naturally want to fix it ASAP and publish the fix. Instead, first write a test harness to demonstrate the problem, and use this harness to prove that you’ve fixed it. The test will serve to prevent the bug coming back in the future.
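As a sketch of that discipline, suppose the bug was a hypothetical average function that crashed with a division by zero on an empty list. The function, its expected result for empty input, and the test names are all assumptions made for this example. The test for the empty case is written first and seen to fail; only then does the fix go in, and the test stays in the suite afterwards.

```python
import unittest


def average(values):
    """Hypothetical function that had the bug: it used to divide by zero on an empty list."""
    if not values:
        return 0.0  # the fix, added only after the test below had been seen to fail
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_does_not_crash(self):
        # Written first to demonstrate the bug; kept to stop it coming back.
        self.assertEqual(average([]), 0.0)

    def test_typical_input_still_works(self):
        self.assertEqual(average([2, 4, 6]), 4.0)


def run_suite():
    """Run the regression tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite().wasSuccessful() tells you whether the fix holds; if someone later removes the empty-list guard, the first test fails again at once.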
Next time
These are by no means the only debugging strategies. In the next article we’ll cover some other useful ones. Until then, may all your bugs be easy to find...
Questions
- Assess how much of your time you think you spend debugging. Consider every activity that isn’t writing a fresh line of code in a system.
- Do you spend more time debugging new lines of code you have written, or on adjustments to existing code?
- Does the existence of a suite of unit tests for existent code change the amount of time you spend debugging, or the way you debug?
- What other bug-hunting strategies do you find valuable?
- Is it realistic to aim for bug-free software? Is this achievable? When is it appropriate to genuinely aim for bug-free software? What determines the optimal amount of ‘bugginess’ in a product?
References
[1] Maurice Wilkes, Memoirs of a Computer Pioneer (Cambridge, MA: The MIT Press, 1985)
[2] ‘Cambridge University Study States Software Bugs Cost Economy $312 Billion per Year’ – http://undo-software.com/company/press/press-release-8