Bug Patterns In Java - The Run-On Initialization

Overview

When you get a NullPointerException and you suspect that a class's constructors don't take enough arguments to initialize all of its fields, you may be looking at the Run-On Initializer pattern. This article discusses initializing all fields in the constructors, using special classes for default values, and including multiple constructors.

For various reasons, mostly bad, you will often see class definitions in which the class constructors don't take enough arguments to properly initialize all the fields of the class. Such constructors require client classes to initialize instances in several steps (setting the values of the uninitialized fields) rather than with a single constructor call.

Initializing an instance in this way is an error-prone process that I refer to as a Run-On Initialization. The types of bugs that result from this process have similar symptoms and remedies, so we can group them together into the Run-On Initializer bug pattern.

Here is this pattern in a nutshell:

  • Pattern: Run-On Initializer.
  • Symptoms: A NullPointerException at the point that one of the uninitialized fields is accessed.
  • Cause: A class whose constructors don't initialize all fields directly.
  • Cures and Preventions: Initialize all fields in a constructor. Use special classes for default values when better values can't be used. Include multiple constructors to cover cases where better values can be used. When your hands are tied, at least include an isInitialized() method.

About This Bug Pattern

In this bug pattern, several steps are necessary in order to initialize an instance of a class. As a result, initialization is more prone to error. If one of the fields is not initialized, a NullPointerException can result.

The Symptoms and the Cause

This pattern is indicated by a NullPointerException at the point where one of the uninitialized fields is accessed, in a class whose constructors don't initialize all fields directly. For example, consider the code in the listing "A Simple Run-On Initialization".

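Unfortunately, the initialization sequence for an instance of this class is prone to bugs. You may have noticed that an exception is thrown in the second initialization step: in Client.initialize(), the call ri.setValue(0) throws a CantTakeZeroException because the instance was constructed with canTakeZero set to false. As a result, the value field that should have been set by that step is never set.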

But a handler for the thrown exception may not know that the field was not set. If, in the process of recovering from the exception, it accesses the value field of the RestrictedInt in question, it may trip over a NullPointerException itself.

If that happens, we are in worse shape than we would be if the handler weren't there at all. At least the checked exception contained some clue about its cause. But NullPointerExceptions are notoriously difficult to diagnose because they (necessarily) contain very little information as to why a value was set to null in the first place. Furthermore, they occur only when the uninitialized field is accessed. That access will probably occur far away from the cause of the bug—that is, from the failure to initialize the field in the first place.
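The failure described above can be sketched as follows. RestrictedInt and CantTakeZeroException follow the article's listing "A Simple Run-On Initialization" in condensed form; the RecoveryDemo class and its recover() method are hypothetical names introduced only for this illustration.

```java
// Condensed from the article's listing; 'value' stays null until setValue() succeeds.
class RestrictedInt
{
    public Integer value;
    public boolean canTakeZero;

    public RestrictedInt(boolean canTakeZero)
    {
        this.canTakeZero = canTakeZero;
    }

    public void setValue(int newValue) throws CantTakeZeroException
    {
        if (newValue == 0 && !canTakeZero)
        {
            throw new CantTakeZeroException(this);
        }
        value = Integer.valueOf(newValue);
    }
}

class CantTakeZeroException extends Exception
{
    public final RestrictedInt ri;

    public CantTakeZeroException(RestrictedInt ri)
    {
        super("RestrictedInt can't take zero");
        this.ri = ri;
    }
}

// Hypothetical client whose exception handler assumes 'value' was set.
class RecoveryDemo
{
    static String recover()
    {
        RestrictedInt ri = new RestrictedInt(false);
        try
        {
            ri.setValue(0);   // second initialization step fails here
            return "initialized";
        }
        catch (CantTakeZeroException e)
        {
            // 'value' was never assigned, so this call throws NullPointerException.
            int current = e.ri.value.intValue();
            return "recovered with " + current;
        }
    }

    public static void main(String[] args)
    {
        try
        {
            System.out.println(recover());
        }
        catch (NullPointerException npe)
        {
            System.out.println("handler failed with a NullPointerException");
        }
    }
}
```

The NullPointerException surfaces inside the handler, so the original CantTakeZeroException, which did describe the real cause, is lost to the caller.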

There are, of course, other errors that can occur from run-on initialization bugs. For instance:

  • The programmer writing the initialization code may forget to put in one of the initialization steps.
  • There may be an order-based dependence in the initialization steps that is unknown to the programmer, who therefore executes the statements out of order.
  • The class being initialized might change. New fields might be added, or old ones removed. As a result, all the initialization code in every client must be modified to set the fields appropriately.

Because of all the problems involved with Run-On Initialization, it's much better to define constructors that initialize all fields. There's never a good reason to include a constructor for a class that leaves any of the fields uninitialized. When writing classes from scratch, that's not a difficult principle to follow.

A Simple Run-On Initialization

class RestrictedInt
{
    public Integer value;          // remains null until setValue() succeeds
    public boolean canTakeZero;

    public RestrictedInt(boolean _canTakeZero)
    {
        canTakeZero = _canTakeZero;
    }

    public void setValue(int _value) throws CantTakeZeroException
    {
        if (_value == 0)
        {
            if (canTakeZero)
            {
                value = Integer.valueOf(_value);
            }
            else
            {
                throw new CantTakeZeroException(this);
            }
        }
        else
        {
            value = Integer.valueOf(_value);
        }
    }
}

class CantTakeZeroException extends Exception
{
    public RestrictedInt ri;

    public CantTakeZeroException(RestrictedInt _ri)
    {
        super("RestrictedInt can't take zero");
        ri = _ri;
    }
}

class Client
{
    public static void initialize() throws CantTakeZeroException
    {
        RestrictedInt ri = new RestrictedInt(false);
        ri.setValue(0);   // throws, so ri.value is left null
    }
}
Tip

There's never a good reason to include a constructor for a class that leaves any of the fields uninitialized.
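As a minimal sketch of that principle, here is how RestrictedInt might look if written from scratch, with a single constructor that takes everything it needs. The two-argument constructor, the final fields, and the accessor names are assumptions made for this illustration, not the article's code; the exception is also simplified so that it no longer carries a reference to an instance.

```java
class CantTakeZeroException extends Exception
{
    public CantTakeZeroException()
    {
        super("RestrictedInt can't take zero");
    }
}

class RestrictedInt
{
    // Both fields are final, so the compiler rejects any constructor
    // that completes normally without assigning them.
    private final int value;
    private final boolean canTakeZero;

    public RestrictedInt(boolean canTakeZero, int value) throws CantTakeZeroException
    {
        if (value == 0 && !canTakeZero)
        {
            // Construction fails as a whole; no half-built instance exists.
            throw new CantTakeZeroException();
        }
        this.canTakeZero = canTakeZero;
        this.value = value;
    }

    public int intValue()
    {
        return value;
    }

    public boolean canTakeZero()
    {
        return canTakeZero;
    }
}
```

With this shape, a client either holds a fully initialized RestrictedInt or holds nothing at all, so there is no uninitialized field for an exception handler to trip over.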

Cures and Preventions

Let's look at several things you can do to help eliminate this type of initializer.

For Legacy Code

What if you must work with a large codebase in which a class doesn't initialize all of its fields in the constructors—a codebase littered with run-on initializers?

Unfortunately, programmers find themselves working with legacy codebases like this more often than they'd like. If the codebase is large and the offending class has many clients, you may not want to modify the constructor signatures, especially if the unit tests over the code are scant. Changing them will almost inevitably break undocumented invariants.

Often the best thing to do is to throw out that legacy code and start fresh! That may sound like crazy talk, but the time you'll spend patching up bugs in code like that can easily dwarf the time it would take to rewrite it. Many times, I have struggled to work with large bases of legacy code with problems like this, and ultimately I come away wishing I had just started fresh.

But if throwing the code away is not an option, we can still attempt to control the potential for errors by incorporating the following simple practices:

  • Initialize the fields to non-null, default values.
  • Include extra constructors.
  • Include an isInitialized() method in the class.
  • Construct special classes to represent the default values.

Let's take a look at why we should follow these practices.

Initialize the Fields to Non-Null, Default Values

By filling in the fields with default values, you help to ensure that instances of your class will be in a well-defined state at all times. This practice is particularly important for fields of reference type that will take on the null value unless you specify otherwise.

Why? Because gratuitous uses of null values inevitably result in NullPointerExceptions. And NullPointerExceptions are bad. For one thing, they provide very little information about the true cause of a bug. For another, they tend to be thrown very far away from the actual cause of the bug. Avoid them at all costs.

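And if you are tempted to use null values to signal that an instance is not yet completely initialized, consider instead the special default-value classes described under "Construct Special Classes to Represent the Default Values."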

Tip

Remember, gratuitous uses of null values inevitably result in NullPointerExceptions. And NullPointerExceptions are bad.
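To illustrate the practice of non-null defaults, here is a small sketch. The CustomerRecord class and all of its members are hypothetical; it stands in for a legacy class whose clients still populate it through setters after construction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical legacy class: run-on initialization is still possible,
// but every reference field starts out with a non-null default.
class CustomerRecord
{
    private String name = "";                               // instead of null
    private final List<String> orders = new ArrayList<>();  // instead of null

    public CustomerRecord() {}

    public void setName(String name)
    {
        this.name = name;
    }

    public void addOrder(String order)
    {
        orders.add(order);
    }

    public String getName()
    {
        return name;
    }

    public int orderCount()
    {
        return orders.size();
    }
}
```

A client that forgets to call setName() now sees an empty string rather than a null, so a call such as getName().length() returns 0 instead of throwing far away from the missing initialization step. Whether an empty string is a sensible default depends on the class; it is only an assumption here.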

Include Extra Constructors

When you include additional constructors, you can use them in new contexts, where you don't have to include new run-on initializations. Just because some contexts are forced to use this bad code, other contexts shouldn't have to pay the price.
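One way this might look, assuming a hypothetical legacy class named Employee: the no-argument constructor stays so that existing clients keep compiling, and a second constructor lets new code build a complete instance in one call.

```java
import java.util.Objects;

// Hypothetical legacy class with an added full constructor.
class Employee
{
    private String name;
    private String department;

    // Legacy constructor: existing clients call this, then the setters.
    public Employee() {}

    // Additional constructor for new contexts: one call, fully initialized.
    public Employee(String name, String department)
    {
        // Fail at construction time rather than at some later access.
        this.name = Objects.requireNonNull(name, "name");
        this.department = Objects.requireNonNull(department, "department");
    }

    public void setName(String name)
    {
        this.name = name;
    }

    public void setDepartment(String department)
    {
        this.department = department;
    }

    public String getName()
    {
        return name;
    }

    public String getDepartment()
    {
        return department;
    }
}
```

Objects.requireNonNull does throw a NullPointerException when given null, but it throws it inside the constructor, at the point of the mistake, with the parameter name as its message.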

Place an isInitialized() Method in the Class

You can include an isInitialized() method in the class to allow for quick determination as to whether an instance has been initialized. Such a method is almost always a good idea when working with classes that require run-on initialization.

In cases in which you don't maintain these classes yourself, you can even put such isInitialized() methods into your own utility class. After all, if there is a consequence of an instance not being initialized that is observable from the outside, you can write a method to check for this consequence (even if it entails using the usually ill-advised practice of catching a RuntimeException).
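Both variants can be sketched briefly. RestrictedInt below is trimmed from the article's listing with an isInitialized() method added; LegacyCounter and InitChecks are hypothetical names, standing in for a class you cannot edit and for your own utility class.

```java
// Variant 1: you maintain the class, so the check lives inside it.
class RestrictedInt
{
    public Integer value;
    public boolean canTakeZero;

    public RestrictedInt(boolean canTakeZero)
    {
        this.canTakeZero = canTakeZero;
    }

    public boolean isInitialized()
    {
        return value != null;
    }
}

// Hypothetical class you cannot modify: 'start' is null until setStart() is called.
class LegacyCounter
{
    private Integer start;

    public void setStart(int start)
    {
        this.start = Integer.valueOf(start);
    }

    public int current()
    {
        return start;   // unboxing a null Integer throws NullPointerException
    }
}

// Variant 2: the check lives in your own utility class.
class InitChecks
{
    static boolean isInitialized(LegacyCounter counter)
    {
        try
        {
            counter.current();   // side-effect-free probe of the observable consequence
            return true;
        }
        catch (NullPointerException e)
        {
            // Catching a RuntimeException is usually ill-advised; it is confined here.
            return false;
        }
    }
}
```

The external probe only works when the class offers some operation that fails observably, and without side effects, on an uninitialized instance.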

Construct Special Classes to Represent the Default Values

Instead of allowing the fields to be filled in with null values, construct special classes (most likely with Singletons) to represent the default values. Then fill instances of these classes into your fields in the default constructor. Not only will you decrease the chances of a NullPointerException, but you will be able to control precisely which error does occur if these fields are accessed inappropriately.

For example, we could modify the RestrictedInt class as follows:

RestrictedInts with NonValues

class RestrictedInt implements SimpleInteger
{
    public SimpleInteger value;
    public boolean canTakeZero;

    public RestrictedInt(boolean _canTakeZero)
    {
        canTakeZero = _canTakeZero;
        value = NonValue.ONLY;   // never null, even before setValue() is called
    }

    public void setValue(int _value) throws CantTakeZeroException
    {
        if (_value == 0)
        {
            if (canTakeZero)
            {
                value = new DefaultSimpleInteger(_value);
            }
            else
            {
                throw new CantTakeZeroException(this);
            }
        }
        else
        {
            value = new DefaultSimpleInteger(_value);
        }
    }

    public int intValue()
    {
        // Throws ClassCastException if value is still NonValue.ONLY
        return ((DefaultSimpleInteger)value).intValue();
    }
}

interface SimpleInteger {}

class NonValue implements SimpleInteger
{
    public static final NonValue ONLY = new NonValue();
    private NonValue() {}
}

class DefaultSimpleInteger implements SimpleInteger
{
    private int value;

    public DefaultSimpleInteger(int _value)
    {
        value = _value;
    }

    public int intValue()
    {
        return value;
    }
}

Now, if any of your client classes that access this field were to perform an intValue() operation on the resulting element, they would first have to cast to a DefaultSimpleInteger, since NonValues don't support that operation.

The advantage of this approach is that compiler errors will remind you, at every point in the code where you forgot the cast, that intValue() doesn't work on the default value. Also, if at runtime you access this field while it still contains the default value, you'll get a ClassCastException, which is much more informative than a NullPointerException: it tells you not only what was actually there but also what the program expected to be there, and it occurs when the cast is attempted rather than at some later point in the execution when you dereference the value.

The disadvantage is that you'll pay in performance. Every time the field is accessed, the program will also have to perform a cast.
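A short client-side sketch of this trade-off follows. SimpleInteger, NonValue, and DefaultSimpleInteger are condensed from the listing "RestrictedInts with NonValues"; the CastDemo class and its read() method are hypothetical.

```java
interface SimpleInteger {}

class NonValue implements SimpleInteger
{
    public static final NonValue ONLY = new NonValue();
    private NonValue() {}
}

class DefaultSimpleInteger implements SimpleInteger
{
    private final int value;

    public DefaultSimpleInteger(int value)
    {
        this.value = value;
    }

    public int intValue()
    {
        return value;
    }
}

// Hypothetical client code that reads an int out of a SimpleInteger field.
class CastDemo
{
    static int read(SimpleInteger field)
    {
        // 'field.intValue()' would not compile: SimpleInteger declares no such method.
        // The cast is the reminder that the field may still hold the default value.
        return ((DefaultSimpleInteger) field).intValue();
    }

    public static void main(String[] args)
    {
        System.out.println(read(new DefaultSimpleInteger(5)));
        try
        {
            read(NonValue.ONLY);
        }
        catch (ClassCastException e)
        {
            // The message names the class found and the class expected.
            System.out.println("default value accessed: " + e.getMessage());
        }
    }
}
```

The cast in read() is both the cost and the benefit: it runs on every access, and it fails at exactly the line that assumed the field had been set.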

If you're willing to do without the compilation error messages, another solution is to include the intValue() method in interface SimpleInteger. You can then implement this method in the default class with a method that throws whatever error you'd like (and you can include any information that you'd like in the error). To illustrate this, look at the following example:

NonValues That Throw Exceptions

class RestrictedInt implements SimpleInteger
{
    public SimpleInteger value;
    public boolean canTakeZero;

    public RestrictedInt(boolean _canTakeZero)
    {
        canTakeZero = _canTakeZero;
        value = NonValue.ONLY;
    }

    public void setValue(int _value) throws CantTakeZeroException
    {
        if (_value == 0)
        {
            if (canTakeZero)
            {
                value = new DefaultSimpleInteger(_value);
            }
            else
            {
                throw new CantTakeZeroException(this);
            }
        }
        else
        {
            value = new DefaultSimpleInteger(_value);
        }
    }

    public int intValue()
    {
        // No cast needed: every SimpleInteger now has intValue()
        return value.intValue();
    }
}

interface SimpleInteger
{
    public int intValue();
}

class NonValue implements SimpleInteger
{
    public static final NonValue ONLY = new NonValue();
    private NonValue() {}

    public int intValue()
    {
        throw new RuntimeException("Attempt to access an int from a NonValue");
    }
}

class DefaultSimpleInteger implements SimpleInteger
{
    private int value;

    public DefaultSimpleInteger(int _value)
    {
        value = _value;
    }

    public int intValue()
    {
        return value;
    }
}

This solution can provide even better error diagnostics than the ClassCastException. It's also more efficient, because no cast is required at runtime. But this solution won't require you to think about the possible values of the field at every access point.

Which solution is best? That depends partly on your personal style and partly on the performance constraints of your project.

Including Methods That Only Throw Exceptions

At first, this practice may strike you as inherently wrong and counterintuitive. After all, a class should only contain methods that actually make sense to perform on the data, right? Including methods such as these can be particularly confusing when you are teaching programmers about object-oriented programming. Compare the following two versions of a simple List class.

Lists with No Universal getters

abstract class List {}

class Empty extends List {}

class Cons extends List
{
    Object first;
    List rest;

    Cons(Object _first, List _rest)
    {
        first = _first;
        rest = _rest;
    }

    public Object getFirst()
    {
        return first;
    }

    public List getRest()
    {
        return rest;
    }
}

Lists with getters in the Interface

abstract class List
{
    public abstract Object getFirst();
    public abstract List getRest();
}

class Empty extends List
{
    public Object getFirst()
    {
        throw new RuntimeException("Attempt to take first of an empty list");
    }

    public List getRest()
    {
        throw new RuntimeException("Attempt to take rest of an empty list");
    }
}

class Cons extends List
{
    Object first;
    List rest;

    Cons(Object _first, List _rest)
    {
        first = _first;
        rest = _rest;
    }

    public Object getFirst()
    {
        return first;
    }

    public List getRest()
    {
        return rest;
    }
}

For a programmer new to object-oriented languages, the motivations behind the first version of List (the one with no universal getters) will be less confusing. Your intuition tells you that a class shouldn't contain a method unless that method does real work. But the above considerations for dealing with default classes apply equally well to this example too.

It can be quite cumbersome to continually insert casts in your code; the code can become quite wordy. Additionally, the class casts can have significant repercussions in terms of performance, especially for an often-called utility class like List.
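To make the difference concrete, here is a sketch of a client that walks a list built from the second version (getters declared in List). The List, Empty, and Cons classes are condensed from that listing, with getRest() declared to return a List; ListDemo and its length() method are hypothetical.

```java
abstract class List
{
    public abstract Object getFirst();
    public abstract List getRest();
}

class Empty extends List
{
    public Object getFirst()
    {
        throw new RuntimeException("Attempt to take first of an empty list");
    }

    public List getRest()
    {
        throw new RuntimeException("Attempt to take rest of an empty list");
    }
}

class Cons extends List
{
    private final Object first;
    private final List rest;

    Cons(Object first, List rest)
    {
        this.first = first;
        this.rest = rest;
    }

    public Object getFirst()
    {
        return first;
    }

    public List getRest()
    {
        return rest;
    }
}

// Hypothetical client: counts the elements of a list.
class ListDemo
{
    static int length(List list)
    {
        int count = 0;
        while (!(list instanceof Empty))
        {
            count++;
            // With the first version of List this line would need a cast:
            //     list = ((Cons) list).getRest();
            list = list.getRest();
        }
        return count;
    }
}
```

For example, ListDemo.length(new Cons("a", new Cons("b", new Empty()))) returns 2 without a single cast, while a stray getFirst() on an Empty still fails with a message that names the mistake.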

As with all design practices, this practice is best applied with a consideration for the underlying motivation of the practice. The motivation won't always be applicable; when it isn't, the practice shouldn't be used.

You're Better Off Fixing Them

You might have noticed that the Run-On Initializer bug is a bit different from some of the other patterns we've worked with. This time we've provided quite a few ideas on how to work around the root causes, rather than just fixing them outright. That's because, on many occasions, you will have to work around them.

As the considerations we've mentioned indicate, it is far better to avoid run-on initializations altogether. But when you have to deal with them, at least you can protect yourself.

What We've Learned

In this article on the Run-On Initializer bug pattern we've learned the following:

  • This bug is caused by class definitions in which the class constructors don't take enough arguments to properly initialize all the fields of the class.
  • A potential error: The programmer writing the initialization code may forget to put in one of the initialization steps.
  • Another potential error: There may be an order-based dependence in the initialization steps, unknown to the programmer, who therefore executes the statements out of order.
  • Yet another error: The class being initialized might change. New fields might be added, or old ones removed.
  • A sure prevention: It's much better to define constructors that initialize all fields.
  • If you encounter legacy code infected with this bug, often the best thing to do is to throw out that legacy code and start fresh!
  • However, when you must work with a legacy codebase in which a class doesn't initialize all of its fields in the constructors, you may not want to modify the constructor signatures, especially if the unit tests over the code are scant.
  • When you must work with legacy code, attempt to control the errors by (1) initializing the fields to non-null, default values; (2) including extra constructors; (3) including an isInitialized() method in the class; and (4) constructing special classes to represent the default values.
  • A good practice is to fill in the fields with default values; you help to ensure that instances of your class will be in a well-defined state at all times.
  • Remember, gratuitous uses of null values inevitably result in NullPointerExceptions—and those are bad.
  • Include additional constructors. You can use them in new contexts where you don't have to include new run-on initializations.
  • An isInitialized() method in the class will allow for quick determination as to whether an instance has been initialized.
  • Construct special classes to represent the default values. The advantage is that you'll be constantly reminded with compiler errors (at every point in the code where you forgot to cast) that this method call doesn't work on the default value. The disadvantage: a performance hit each time the field is accessed (as a cast is needed).
  • There's never a good reason to include a constructor for a class that leaves any of the fields uninitialized.
