
Why I Don’t Use Debuggers

Other notable authors have already written good posts on this topic, but I was recently encouraged to write up my own take as well.

First, the title isn’t entirely true. I’ll break out gdb when I need to get the backtrace of a native application crash. I’ll do the equivalent for any runtime that doesn’t provide information about the method or function that produced the exception. However, I otherwise avoid them.

Debuggers make it possible to make sense of larger bodies of code than you otherwise could. That’s helpful, but it can lull you into believing you can handle more complexity than you actually can. A debugger is a crutch that gets you past some of your limitations, but when the debugger’s own limits are reached, you may find yourself in a briar patch.

Complications

Threading and multiple processes

Stepping through multiple threads can be a bear, and threading and multi-processing in general can be dangerous to your health. I prefer concurrency models that isolate concurrency primitives to an extreme degree. I haven’t tried it, but I’ve heard good things about Threading Building Blocks.

Runtime Data Models

Investigating data in a debugger may require familiarity with the runtime representations of your data structures, rather than letting you interact with them through their familiar interfaces.

Additionally, those runtime representations typically aren’t standardized, so they can change from version to version without warning. In fact, depending on the goals of the implementation, the underlying form of the data could change from run to run, or even within a single run. I prefer to rely on the documented interfaces.

On the other hand, debuggers provide a good way to get familiar with runtime data structures.

Complex Failure Conditions

When I used debuggers in my more youthful days, I found that complicated issues were easier to track down if I modified the code to trigger breakpoints under odd conditions. That seemed antithetical to the purpose of a debugger. Maybe I was doing it wrong…
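In Python terms, the trick looked something like the sketch below; the “odd condition” here is contrived, but the shape is the point:

```python
def running_total(values):
    total = 0
    for v in values:
        total += v
        # The hand-rolled "conditional breakpoint": stop only when the
        # suspicious state actually occurs, not on every iteration.
        if total < 0:  # hypothetical odd condition worth inspecting
            breakpoint()  # drops into pdb (Python 3.7+)
    return total

running_total([5, 3, -10, 4])
```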

Not Recorded

I’ve never seen anyone record a debugging session so that they could return to a particular debugger state. I’m sure it could be done, but I don’t know how useful such a recording would remain once small modifications were made to the original software and the run retried.

Debugger Unavailability

In some rare circumstances, debuggers aren’t available for the environment you’re using. If you haven’t developed any other techniques for finding problems, you may be stuck.

Preferred Techniques

Unit Testing

I cannot stress this enough: unit testing, done well, forces you to break your code into smaller, functional units. Small functional units are key to solid design. Think about the standard libraries in the language(s) you use: the standalone functionality, the complete independence from the operation of your code, the ease with which you could verify their correct functioning. That’s how you should be designing.
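To make that concrete, here’s a minimal sketch in Python; the function is hypothetical, chosen only because it’s small and standalone:

```python
import unittest

def parse_version(text):
    """Parse a dotted version string like '1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

class ParseVersionTest(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(parse_version("1.2.3"), (1, 2, 3))

    def test_surrounding_whitespace(self):
        self.assertEqual(parse_version(" 2.0\n"), (2, 0))

    def test_garbage_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_version("not-a-version")

if __name__ == "__main__":
    unittest.main()
```

Because the unit depends on nothing else in the program, a failing test points at exactly one place.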

Note that once you know the method/function in which the fault happened, you’ve narrowed the problem down significantly. If you did a good job at functional decomposition, you’re typically only a few steps from the source of the problem. If not, judicious application of the other techniques will tease it out.

If you find that the behavior of your module is too complicated to easily write unit tests for, that may be a sign that your module is too big. On rare occasion the input domain is so rich with varying behaviors that directly testing the interesting input combinations is impractical; those cases can be managed with more advanced testing techniques such as property-based testing (see QuickCheck, and the sketch below).
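QuickCheck itself is a Haskell library; to keep the examples in Python, here’s roughly the same idea sketched with Hypothesis, a property-based testing library. Instead of enumerating interesting inputs by hand, you state properties that must hold for any input and let the tool hunt for counterexamples:

```python
from hypothesis import given, strategies as st

# Properties that must hold for *any* list of integers.
@given(st.lists(st.integers()))
def test_sorted_output_is_ordered_and_stable(xs):
    result = sorted(xs)
    assert len(result) == len(xs)                           # nothing lost
    assert all(a <= b for a, b in zip(result, result[1:]))  # ordered
    assert sorted(result) == result                         # idempotent
```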

Invariant Checks

When separating out code proves difficult because the parts you’re changing lack regression tests, invariant checks can serve as a temporary shim. These are checks that run inline with the code of interest, verifying conditions, usually before and after a method/function call or a block of code.
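Here’s a minimal sketch in Python; the sorted-list invariant is just a stand-in for whatever property your real code must preserve:

```python
import bisect

def check_sorted_unique(items):
    # The invariant: strictly ascending, hence no duplicates.
    assert all(a < b for a, b in zip(items, items[1:])), items

def insert(items, value):
    check_sorted_unique(items)   # invariant holds going in...
    i = bisect.bisect_left(items, value)
    if i == len(items) or items[i] != value:
        items.insert(i, value)
    check_sorted_unique(items)   # ...and still holds on the way out
    return items

print(insert([1, 3, 7], 5))   # [1, 3, 5, 7]
```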

The invariant code can form some of the foundational functions when you do eventually get around to creating the unit tests for your new module.

Print/Log Debugging

The dreaded printf() debugging! It isn’t ideal, but it can give you quick information about a problem without much fuss. If you find yourself adding hundreds of printf() calls to track something down, I’d suggest you’re underusing the other techniques. If that’s the position you’re in, a debugger might actually be a benefit.
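One step up from raw printf() is a logging framework that can be left in place and silenced by level. A quick Python sketch, with a hypothetical function under investigation:

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger(__name__)

def reconcile(expected, actual):
    log.debug("reconcile: expected=%r actual=%r", expected, actual)
    missing = sorted(set(expected) - set(actual))
    if missing:
        log.warning("missing %d item(s): %r", len(missing), missing)
    return missing

reconcile([1, 2, 3], [1, 3])
```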

Note that, again, all code written to try to weed out your problem may be valuable for one or more tests.

I’ve also used ring buffers embedded in devices that store the last 1000 or so interesting events, so that when an obscure failure happens there’s some record of what the software was doing at the time.
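The same idea is easy to sketch in Python with a bounded deque; the capacity and event format here are placeholders:

```python
from collections import deque

class EventTrace:
    """Ring buffer of recent events; the oldest fall off as new ones arrive."""

    def __init__(self, capacity=1000):
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def dump(self):
        # Called from a failure handler to recover the lead-up to the fault.
        return list(self._events)

trace = EventTrace(capacity=3)
for i in range(5):
    trace.record(f"event {i}")
print(trace.dump())   # ['event 2', 'event 3', 'event 4']
```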

Where Debuggers Shine

Compiler debugging. Tracking down a bug in a compiler is an absolute pain without a debugger, and I wouldn’t recommend attempting it that way.

Heisenbugs: those bugs that disappear with any change to the code. A debugger is about the only way to get a look at one, and such bugs usually warrant pulling out all the stops. Fortunately, many of the newer languages have eliminated whole classes of these bugs. Good riddance.

Other Tools

I do appreciate other tools like code checkers and profilers. They usually work without much input and communicate their results in terms of the language they’re checking. I’m a fan of this model.

REPLs, a seemingly close relative of debuggers, also look promising. I’ve never used them, but they appear to operate almost entirely in terms of the language being debugged. I’m a fan of that model too.

Summary

I prefer debugging techniques that produce additional useful work and provide support for refactoring if the situation warrants it. Every bug potentially reveals a need to re-implement or re-design. Debuggers feel more like patchwork.

Basics: Revision Control

Engineering disciplines have been using revision control techniques to manage changes to their documents for decades if not centuries.

With computers we get new, automated, and more comprehensive techniques for doing this.

Advantages

You have a history of changes. Where did that file go? Look through the history. What changed between this revision and the last? Some revision control tools will show you exactly what changed.

They can coincide with and support your backups; some are completely passive. Depending on how frequently you take backups, you may already have an effective form of revision control, though it might be difficult to get at some of the related features.

For engineers and other professions, it’s sometimes incredibly important to know what you were seeing at a particular point in time. Some revision control tools give you exactly that.

Disadvantages

Space. Recording revisions usually requires extra disk space. Disk space is relatively cheap, though, and most revision control systems are far more efficient than keeping multiple full copies of the same file(s).

Complexity. Controlling revisions isn’t as simple as just writing contiguous bytes to disk. Unfortunately, filesystems aren’t generally as simple as just writing contiguous bytes to disk either. Computing capabilities are solid enough that this is usually a minor concern.

Classification

Revision control tools can be divided into two broad categories: passive and active. Each has its own advantages and disadvantages.

Passive

As the heading suggests, these revision control systems require minimal interaction to make them do their job.

Some passive revision control systems include:

  1. Dropbox – Look at the web UI. You can easily find older revisions of files that you’ve stored.
  2. ownCloud – This is an open-source, self-hosted web tool similar to Dropbox. It’s supported on all major operating systems and has apps for every major mobile OS. I use this at home.
  3. Apple Time Machine – Apple provides a way to periodically back up to a secondary drive and browse the revision history of those files. There are similar tools for Windows.
  4. Copy-on-Write Filesystems (CoW) – Several filesystems offer revisioning as a core capability based on their underlying data model.

Most of these will not record every change; instead they take snapshots at some roughly defined interval, so changes made in quick succession may be collapsed into a single revision. Revision recording throughput is affected by the number of files that have changed and the total amount of extra data that would need to be stored. And because these snapshots are taken without user intervention, there’s no chance to annotate the changes with additional information or to group related changes in meaningful ways.

However, with just a little setup — pointing at a server or drive, entering account credentials, choosing a directory to sync — you can rest assured that changes made to your files will be recorded and available in the event of emergency or curiosity.

I believe every computer shipped should come with some form of passive revision controlling backup system out-of-the-box.

Active

Active revision control systems offer much more capability at the expense of a steeper learning curve. Even so, there’s simply no better way of working on digital (is there any other?) projects.

Some active revision control systems include:

  1. Subversion – an old favorite, but slowly deferring to
  2. git – distributed tool that’s consuming the world
  3. CVS – “ancient” predecessor to Subversion

This list is far from exhaustive. I can think of at least five or six others off the top of my head, but I don’t think any others have nearly the significance today.

The feature sets of the various active revision control tools vary widely, but they all provide a core set of functionality that distinguishes them from the passive revision control tools.

  • Explicit commits. Every bit of work is committed explicitly. Specific changes can usually be grouped, and comments and other metadata can be attached to a commit to give the change extra context.
  • Change diffs. Every modification can be compared against the previous version, and the changes between any two versions can be viewed.
  • View history. Every commit ever made, along with its metadata, can be listed.
  • Checkout previous revisions. It can often be helpful to look back in time to find out why a problem didn’t seem to exist in the past or to determine when it was introduced. In rarer circumstances, you might want to know why a problem seemed to disappear.
  • Revert commits or roll back to a previous revision. Sometimes committed changes turn out to be detrimental and should be removed.
  • Multi-user commits. Virtually all active revision control systems accept work from multiple users and provide techniques for merging changes that can’t be trivially combined.

As with passive revision control systems, not every active revision control system is also a backup. In most cases you’d need to take extra steps to back up the repository itself. Pairing an active revision control system with a passive one is one way to do this.

Few, if any, active revision control systems handle binary data well. Binary files can usually be stored, but storage efficiency may suffer and the diff capability is usually absent. This might be their single largest weakness.

No significant (or insignificant) project should be started without one of these revision control tools, and project tools should be structured in a way that allows independent, verifiable revision control.

Visualization

Most of these tools don’t seem to provide much at-a-glance functionality, and I think it’s really useful to have. The things I’m most interested in seeing:

  1. Have any files been modified (active)?
  2. Are there files that could be lost (active)?
  3. Are there any upstream changes that aren’t synced (active, especially useful for multi-user projects)?
  4. Are there any local files that haven’t been recorded (passive)?

For the active revision control questions, tools like GitHub Desktop for Mac and Windows, TortoiseGit and TortoiseSVN for Windows, and RabbitVCS integration for Nautilus on Linux might do the trick. Some active revision control systems provide these features out-of-the-box, but those tend to be pricey.

On (4), I’ve not seen a passive system that provides this information. It seems like it might be useful to know if all local files have synced before shutting down for a while. I’ll keep my eye out for this.

For those with a bash habit, I have a version of ls that provides (1) and (2) above. I plan to make this available shortly.
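In the meantime, the underlying check is simple enough to sketch. Assuming git, something like the following (an illustration, not that ls wrapper) answers (1) and (2):

```python
import subprocess

def worktree_state(path="."):
    """Return (modified tracked files, untracked files) for a git worktree."""
    out = subprocess.run(
        ["git", "-C", path, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    lines = out.splitlines()
    modified = [line[3:] for line in lines if not line.startswith("??")]
    untracked = [line[3:] for line in lines if line.startswith("??")]
    return modified, untracked

modified, untracked = worktree_state()
print(f"{len(modified)} modified, {len(untracked)} untracked")
```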