Books / Patterns for Beginning Programmers / Chapter 23
Subarrays
One of Java’s most convenient features is the length
attribute
associated with each array. Because of this feature, it isn’t necessary
to pass both the array and its length to a method (as it is in some
other languages). However, it leads to people creating methods with
inflexible signatures. The subarray pattern is a way to remedy this
shortcoming.
Motivation
Most of the examples of arrays that you have seen probably involve iterating over all of the elements. However, there are many situations in which you only need to iterate over some of the elements in the array. Unfortunately, because you have only/mostly been exposed to examples in which this isn’t the case, you may have started to use a pattern that makes this difficult (and, hence, is sometimes called an anti-pattern).
Review
Suppose you were asked to write a method that is passed an array of
int
values and returns the total. You would probably use an
accumulator (see Chapter
16)
as in the following method:
public static int total(int[] data) {
int result;
result = 0;
for (int i = 0; i < data.length; ++i) {
result += data[i];
}
return result;
}
This method takes advantage of the fact that the number of iterations is
determinate (or definite) and uses a for
loop. This method also takes
advantage of the fact that the array has a length
attribute and, so,
does not include it in its signature.
The problem with this implementation is that it does not allow you to find the sum of a subset of the elements. For example, if the indexes represent months and the elements represent sales data, then you might want to find the total sales for only the second quarter (i.e., April, May, and June; 0-based months 3, 4, 5).
Thinking About The Problem
Obviously, what you need to do is add formal parameters to the method
that describe the subset of interest. You could, for example, pass
another array that contained the indexes to consider. Or, you could pass
a conformal boolean[]
array that contains true
for the elements to
use and false
for the elements to ignore. In practice, however, both
of these solutions are more complicated than is necessary because the
most common need is to iterate over a contiguous subset of the elements
(i.e., loosely speaking, a subset that contains no “gaps”, or a subset
in which the difference between two sequential is exactly 1).
The Pattern
The easiest way to solve this problem of iterating over a contiguous
subset of the elements is to add two formal parameters, the index to
start with and the size of the subset. Traditionally, the index to start
with is thought of as an offset from 0 and, hence, is named offset
.
The size of the subset is traditionally named length
.
The example of monthly sales data can be illustrated as in Figure
19.1. As is apparent from this illustration, the bounds
on the loop control variable are now going to be offset
and
offset + length
. As is also apparent from the illustration, you have
to be careful to use a strong inequality when comparing the loop control
variable to the upper bound (i.e., <
rather that <=
) to avoid an
off-by-one defect. So, the loop control variable will be initialized to
offset
and the iterations will continue as long as the loop control
variable is strictly less than offset + length
.
The one drawback of adding these parameters is that they must be
included in every invocation of the method. To avoid this, you can add
an overloaded version of the method that is only passed the array and
invokes the three-parameter version, passing it 0
for offset
and the
array’s length attribute for length
.
Examples
It’s trivially easy to create and find examples of this pattern.
The Motivating Example
Returning to the example above, the three-parameter version of the method is as follows:
public static double total(double[] data, int offset, int length) {
double result;
result = 0;
for (int i = offset; i < offset + length; ++i) {
result += data[i];
}
return result;
}
The one-parameter version then invokes the three-parameter version as follows:
public static double total(double[] data) {
return total(data, 0, data.length);
}
Examples in the Java Library
Examples of this pattern abound in the Java library. For example, this
pattern is used by the fill()
methods and the copyOfRange()
method
in the Arrays
class, and the arraycopy()
method in the System
class.
A Less Obvious Example
The same kind of thing can be done with rectangular arrays of arrays (sometimes called multidimensional arrays). In this case, the flexible method is as follows:
public static double total(double[][] data,
int roffset, int coffset,
int rlength, int clength) {
double result;
result = 0;
for (int r = roffset; r < roffset + rlength; ++r) {
for (int c = coffset; c < coffset + clength; ++c) {
result += data[r][c];
}
}
return result;
}
The one-parameter version then invokes the five-parameter version as follows:
public static double total(double[][] data) {
return total(data, 0, 0, data.length, data[0].length);
}
A Warning
It is possible for the invoker to pass an invalid offset
and/or
invalid length
. Hence, you should validate these parameters and
respond to invalid values appropriately.
There are two common responses to invalid values. One is to throw an
IllegalArgumentException
. The other is to use 0
for values of
offset
that are too small, the array’s length for values of offset
that are too large, and the array’s length for values of
offset + length
that are too large.