Bug Pattern In Java - The Double Descent - Online Article

Overview

When your program recursively descends a composite data structure so that occasionally more than one step down is taken via a single recursive call, you've come upon the Double Descent pattern. We discuss ClassCastExceptions and the conceptual errors behind this bug.

Although usually easier to debug than other, more insidious erroneous behaviors, ClassCastExceptions are often symptoms of conceptual errors in recursively descending a composite data structure. We'll discuss where programmers should look to find this pattern of bug, how to recognize the pattern, and what to do to minimize its occurrence.

Unlike the dreaded NullPointerException (which says nothing about what was expected to occur instead of the null pointer), a ClassCastException is relatively easy to debug. It often occurs in a program that is performing a recursive descent over a data structure. I call this the Double Descent bug pattern.

Here's the breakdown of this bug pattern:

  • Pattern: Double Descent.

  • Symptoms: A program that throws a ClassCastException while performing a recursive descent over a data structure.
  • Cause: Some part of the code is descending two levels per method call without dispatching appropriately on the second descent.
  • Cures and Preventions: Factor the casting code out into separate methods for each class. Alternatively, examine the invariants to ensure that the casts will succeed.

About This Bug Pattern

The Double Descent bug pattern manifests itself as a ClassCastException. It is caused by recursively descending a composite data structure in such a way that sometimes more than one step in the descent is taken in a single recursive call.

Descending in this way often necessitates the addition of casts to get the code to compile. But, in such a descent, it is easy to forget to check that appropriate invariants are satisfied to guarantee that these casts will succeed.

Consider the following class hierarchy for binary trees of ints. We want to allow for empty trees, so we will not put a value field into the Leaf class. Because this decision makes all Leafs identical, we can make use of the Singleton design pattern for class Leaf. This design pattern stores a single public instance of a class as a static field (to be used whenever an instance is needed) and restricts outside invocations of the class constructor.

The advantages of doing it this way?

  • It saves storage space. Why create multiple identical instances of the same class?

  • It allows us to use == in place of instanceof or equals() to check for class and object identity.

Here's the resulting code for the Tree class hierarchy:

A Class Hierarchy for Binary Trees of ints That Allows for Empty Trees.

abstract class Tree {  }
class Leaf extends Tree
{
  public static final Leaf ONLY = new Leaf();
  private Leaf()
  { }
}
class Branch extends Tree
{
  public int value;
  public Tree left;
  public Tree right;
  public Branch(int _value, Tree _left, Tree _right)
  {
    this.value = _value;
    this.left = _left;
    this.right = _right;
  }
}
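As a brief illustration of how this hierarchy might be used, the sketch below builds a small tree and tests for the empty tree with ==. The wrapper class SingletonLeafDemo and the helper isEmpty() are hypothetical names introduced only for this example; the Tree classes repeat the listing above, nested so that the example compiles as a single file.

```java
// Hypothetical demo wrapper; the nested classes mirror the listing above.
class SingletonLeafDemo {
    static abstract class Tree { }

    static class Leaf extends Tree {
        public static final Leaf ONLY = new Leaf();
        private Leaf() { }
    }

    static class Branch extends Tree {
        public int value;
        public Tree left;
        public Tree right;

        public Branch(int value, Tree left, Tree right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Hypothetical helper: identity comparison is enough because
    // Leaf.ONLY is the only Leaf instance that is ever created.
    static boolean isEmpty(Tree t) {
        return t == Leaf.ONLY;
    }

    public static void main(String[] args) {
        Branch root = new Branch(1, new Branch(0, Leaf.ONLY, Leaf.ONLY), Leaf.ONLY);
        System.out.println(isEmpty(root));       // false
        System.out.println(isEmpty(root.right)); // true
    }
}
```

Every empty subtree here refers to the same Leaf.ONLY object, which is what makes the == test safe.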

Now, suppose we want to add a method on Trees that determines whether any two adjacent nodes (such as a branch and one of its children) both contain 0 as their value. We might add the following methods (notice that the last method will not compile in its current form):

Adding Methods to Ferret Out 0 Values in Adjacent Nodes.

// in class Tree:
public abstract boolean hasAdjacentZeros();
// in class Leaf:
public boolean hasAdjacentZeros()
{
  return false;
}
// in class Branch:
public boolean hasAdjacentZeros()
{
  return this.value == 0 && (this.left.value == 0 || this.right.value == 0)
  || this.left.hasAdjacentZeros()
  || this.right.hasAdjacentZeros();
}

The method in class Branch will not compile because this.left and this.right are not guaranteed to have value fields.

The fact that we cannot compile strongly suggests that there is a logical problem with our manipulation of these data structures. But suppose we ignore this warning sign and simply cast this.left and this.right to Branches in the return expression, as follows:

Casting to Branches To Fix Inability to Compile.

// in class Branch:
public boolean hasAdjacentZeros()
{
  return this.value == 0 && (((Branch)this.left).value == 0
      || ((Branch)this.right).value == 0)
  || this.left.hasAdjacentZeros()
  || this.right.hasAdjacentZeros();
}

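Now the code will compile. In fact, it will succeed on many test cases. It fails, however, whenever a Branch whose value is 0 has a Leaf as a child that the expression goes on to cast: the cast to Branch then throws a ClassCastException.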
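To see which inputs still break this version, the sketch below runs it on three small trees. The wrapper class CastFailureDemo and the helper outcome() are hypothetical names added for illustration; the nested Tree classes follow the listings above.

```java
// Hypothetical demo wrapper; the nested classes mirror the article's listings.
class CastFailureDemo {
    static abstract class Tree {
        abstract boolean hasAdjacentZeros();
    }

    static class Leaf extends Tree {
        static final Leaf ONLY = new Leaf();
        private Leaf() { }
        boolean hasAdjacentZeros() { return false; }
    }

    static class Branch extends Tree {
        int value;
        Tree left;
        Tree right;

        Branch(int value, Tree left, Tree right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }

        // The cast-based version: two levels of descent in a single call.
        boolean hasAdjacentZeros() {
            return this.value == 0
                && (((Branch) this.left).value == 0 || ((Branch) this.right).value == 0)
                || this.left.hasAdjacentZeros()
                || this.right.hasAdjacentZeros();
        }
    }

    // Hypothetical helper: reports the outcome instead of crashing the demo.
    static String outcome(Tree t) {
        try {
            return String.valueOf(t.hasAdjacentZeros());
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        Tree nonZeroRoot = new Branch(1, Leaf.ONLY, Leaf.ONLY);
        Tree adjacentZeros = new Branch(0, new Branch(0, Leaf.ONLY, Leaf.ONLY), Leaf.ONLY);
        Tree zeroAboveLeaf = new Branch(0, Leaf.ONLY, Leaf.ONLY);

        System.out.println(outcome(nonZeroRoot));   // false
        System.out.println(outcome(adjacentZeros)); // true
        System.out.println(outcome(zeroAboveLeaf)); // ClassCastException
    }
}
```

The first two trees never reach a cast on a Leaf, because the && and || operators short-circuit first. Tests built only from such trees would pass and leave the bug hidden.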

A Quick but Incomplete Fix

The quick but incorrect way to remedy the problem would be to eliminate the Leaf class and represent Leaf nodes simply by putting null values in the left and right fields of a Branch. This approach would eliminate the need to cast in the code listings above, but it would not fix the bug.

The error signaled at runtime would simply be a NullPointerException instead of a ClassCastException. Because NullPointerExceptions are more difficult to diagnose, this "fix" would actually decrease the quality of the code. (In fact, the resultant class hierarchy would be a Dangling Composite.)
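A minimal sketch of that null-based variant follows. It assumes that the recursive calls are guarded with null checks (an addition made here so that the only remaining failure is the double descent itself); the names NullLeafDemo and outcome() are hypothetical.

```java
// Hypothetical demo of the null-based "fix": no Leaf class, no casts.
class NullLeafDemo {
    static class Branch {
        int value;
        Branch left;   // null stands in for an empty subtree
        Branch right;  // null stands in for an empty subtree

        Branch(int value, Branch left, Branch right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }

        boolean hasAdjacentZeros() {
            // The double descent is still here: left.value and right.value
            // are read without checking that the children exist.
            return this.value == 0 && (this.left.value == 0 || this.right.value == 0)
                // Assumption for this sketch: the recursive calls are null-guarded.
                || (this.left != null && this.left.hasAdjacentZeros())
                || (this.right != null && this.right.hasAdjacentZeros());
        }
    }

    // Hypothetical helper: reports the outcome instead of crashing the demo.
    static String outcome(Branch b) {
        try {
            return String.valueOf(b.hasAdjacentZeros());
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(new Branch(1, null, null))); // false
        System.out.println(outcome(new Branch(0, null, null))); // NullPointerException
    }
}
```

The failing input is the same shape as before, a zero-valued node above an empty subtree. Only the exception type has changed.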

So, what's the best way (or ways) to fix this bug?

Note

Think of a cast as a kind of assertion; think of the invariants as arguments for why the assertion is true.

The Real Fix

One way would be to wrap each cast in an instanceof check.

Wrapping Each Cast in an instanceof Check

// in class Branch:
public boolean hasAdjacentZeros()
{
  boolean foundOnLeft = false;
  boolean foundOnRight = false;
  if (! (this.left instanceof Leaf))
  {
    // this.left instanceof Branch
    foundOnLeft = ((Branch)this.left).value == 0;
  }
  if (! (this.right instanceof Leaf))
  {
    // this.right instanceof Branch
    foundOnRight = ((Branch)this.right).value == 0;
  }
  return this.value == 0 && (foundOnLeft || foundOnRight)
  || this.left.hasAdjacentZeros()
  || this.right.hasAdjacentZeros();
}

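Note the comments inside the if blocks, which state the invariant that justifies each cast. Commenting invariants like this is an especially helpful practice for else clauses. Because there is rarely an explicit check on the invariants that are expected to hold in an else clause, it's a good idea to make those invariants clear in your comments.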

You can think of a cast as a kind of assertion and the invariants as arguments for why the assertion is true.

The disadvantage of using instanceof checks in this fashion is that, if we were to add another subclass of Tree (such as a LeafWithValue class), we would have to revise the instanceof checks. For this reason, I try to avoid instanceof checks whenever possible.
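The sketch below illustrates that fragility. The article only names a LeafWithValue class, so its definition here (a childless subclass of Tree with a value field) is an assumption, as are the names InstanceofFragilityDemo and outcome().

```java
// Hypothetical demo: the instanceof-guarded method after a new subclass appears.
class InstanceofFragilityDemo {
    static abstract class Tree {
        abstract boolean hasAdjacentZeros();
    }

    static class Leaf extends Tree {
        static final Leaf ONLY = new Leaf();
        private Leaf() { }
        boolean hasAdjacentZeros() { return false; }
    }

    // Assumed shape of the new subclass: a node with a value but no children.
    static class LeafWithValue extends Tree {
        int value;
        LeafWithValue(int value) { this.value = value; }
        boolean hasAdjacentZeros() { return false; }
    }

    static class Branch extends Tree {
        int value;
        Tree left;
        Tree right;

        Branch(int value, Tree left, Tree right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }

        boolean hasAdjacentZeros() {
            boolean foundOnLeft = false;
            boolean foundOnRight = false;
            if (!(this.left instanceof Leaf)) {
                // Old invariant "this.left instanceof Branch" no longer holds.
                foundOnLeft = ((Branch) this.left).value == 0;
            }
            if (!(this.right instanceof Leaf)) {
                // Old invariant "this.right instanceof Branch" no longer holds.
                foundOnRight = ((Branch) this.right).value == 0;
            }
            return this.value == 0 && (foundOnLeft || foundOnRight)
                || this.left.hasAdjacentZeros()
                || this.right.hasAdjacentZeros();
        }
    }

    // Hypothetical helper: reports the outcome instead of crashing the demo.
    static String outcome(Tree t) {
        try {
            return String.valueOf(t.hasAdjacentZeros());
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(new Branch(0, Leaf.ONLY, Leaf.ONLY)));            // false
        System.out.println(outcome(new Branch(0, new LeafWithValue(0), Leaf.ONLY))); // ClassCastException
    }
}
```

The checks were written as "not a Leaf, therefore a Branch", an invariant that silently stops being true once a third subclass exists.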

Instead, I add extra methods to the subclasses that perform the appropriate action for each subclass. After all, the ability to add such polymorphic methods is one of the key advantages of an object-oriented language.

In the current example, we could do this by adding a valueIs() method to the Tree class as follows:

Using Extra Methods to Avoid Rewriting instanceof Checks

// in class Tree:
public abstract boolean valueIs(int n);
// in class Leaf:
public boolean valueIs(int n)
{
  return false;
}
// in class Branch:
public boolean valueIs(int n)
{
  return value == n;
}
// in class Branch:
public boolean hasAdjacentZeros()
{
  return this.valueIs(0) && (this.left.valueIs(0) || this.right.valueIs(0))
  || this.left.hasAdjacentZeros()
  || this.right.hasAdjacentZeros();
}

Notice that we've added a valueIs() method instead of a getValue() method. If we had added a getValue() method to class Leaf, we would either have to return some sort of flag value indicating that the method application is nonsensical or actually throw an exception.

Returning a flag value would bring up many of the problems associated with the Null Flag bug pattern. Throwing an exception would not help in our case because we would then have to add instanceof checks in hasAdjacentZeros() to ensure that we didn't trigger the exception. And that is exactly what we are trying to avoid with the new method.

valueIs() avoids all of these problems by encapsulating what we really want each class to handle separately—checking whether an instance of the class contains the given value.
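Putting the pieces together, here is a self-contained sketch of the polymorphic version, run on the kind of tree that broke the cast-based code. The wrapper class ValueIsDemo and its main method are hypothetical scaffolding; the methods themselves follow the listings above.

```java
// Hypothetical demo wrapper around the valueIs()-based hierarchy.
class ValueIsDemo {
    static abstract class Tree {
        abstract boolean valueIs(int n);
        abstract boolean hasAdjacentZeros();
    }

    static class Leaf extends Tree {
        static final Leaf ONLY = new Leaf();
        private Leaf() { }
        boolean valueIs(int n) { return false; }       // a Leaf holds no value
        boolean hasAdjacentZeros() { return false; }
    }

    static class Branch extends Tree {
        int value;
        Tree left;
        Tree right;

        Branch(int value, Tree left, Tree right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }

        boolean valueIs(int n) { return value == n; }

        // One level of descent per call; each child answers for itself.
        boolean hasAdjacentZeros() {
            return this.valueIs(0) && (this.left.valueIs(0) || this.right.valueIs(0))
                || this.left.hasAdjacentZeros()
                || this.right.hasAdjacentZeros();
        }
    }

    public static void main(String[] args) {
        Tree zeroAboveLeaf = new Branch(0, Leaf.ONLY, Leaf.ONLY);
        Tree adjacentZeros = new Branch(1,
            new Branch(0, Leaf.ONLY, new Branch(0, Leaf.ONLY, Leaf.ONLY)),
            Leaf.ONLY);

        System.out.println(zeroAboveLeaf.hasAdjacentZeros()); // false
        System.out.println(adjacentZeros.hasAdjacentZeros()); // true
    }
}
```

No cast or instanceof check remains, so a new subclass of Tree only has to supply its own valueIs() and hasAdjacentZeros().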

What We've Learned

In this article on the Double Descent bug pattern, we've learned the following:

  • If a ClassCastException is thrown as your program recursively descends a composite data structure, it usually means that some method is descending two levels per call without dispatching appropriately on the second descent.

  • One cure is to factor out casting code into separate methods for each class.
  • Another cure is to examine the invariants to ensure that the casts will succeed.

  • Another is to wrap each cast in an instanceof check. However, I suggest you avoid instanceof checks. Otherwise, you'll have to revise them every time you add another subclass of Tree.
  • Instead of instanceof checks, try adding extra methods to the subclasses to perform the appropriate actions for each subclass. This makes use of polymorphism, one of the key advantages of object-oriented languages.

  • A quick fix seems to be to get rid of the Leaf class and represent the Leaf nodes as null values in Branch's left and right fields. But this "fix" would introduce a Dangling Composite. At runtime the code would throw a NullPointerException instead of a ClassCastException, and a NullPointerException is harder to diagnose.
  • Commenting in code is helpful to logical debugging, especially for else clauses.
  • Holding each cast to this level of scrutiny—knowing that the invariants will ensure that any casts will succeed—will lead you to eliminate many such casts.

In short, the moral to this story is to always convince yourself that the invariants inside a code block ensure that any casts in the block will always succeed. When each cast is held to this level of scrutiny, you may find yourself eliminating many of these casts by adding methods to the relevant subclasses.
