###### Books / Patterns for Beginning Programmers / Chapter 19

# Interval Membership

There are many situations in which categories are defined by intervals and, as a result, it is necessary to find the interval that contains a particular value. This chapter considers one specific way to organize the data and perform such a search.

#### In this chapter

## Motivation

The U.S. tax code defines a group of intervals called *tax brackets*,
each of which has an associated *marginal tax rate*. In 2017, these tax
brackets (for single taxpayers) were defined as in Table
16.1.
For example, a person earning $23,000 would be in the 15% marginal
bracket (i.e., would pay 10.0% on the first $9,325, and 15.0% on
everything over $9,325).

Our objective in this chapter is to organize the information in Table 16.1 in such a way that it is easy to find the marginal tax rate for any income.

\[\begin{aligned} &\begin{array}{llc} \hline \text { From } & \text { To } & \text { Rate } \\ 0 & 9,325 & 10.0 \\ 9,326 & 37,950 & 15.0 \\ 37,951 & 91,900 & 25.0 \\ 91,901 & 191,650 & 28.0 \\ 191,651 & 416,700 & 33.0 \\ 416,701 & 418,400 & 35.0 \\ 418,401 & \text { Above } & 39.6 \\ \hline \end{array}\\ &\text { Table 16.1. U.S. Tax Brackets for Single Taxpayers in } 2017 \end{aligned}\]## Review

Though Chapter
18
on lookup arrays used different terminology, some of the examples were
closely related to the problem considered here. For example, consider
the problem of finding the letter grade that corresponds to a particular
numerical grade. Essentially, one needs to find the interval that
contains the numerical grade. However, since all of the intervals are
the same size, there is no reason to explicitly list the intervals.
Hence, the code in Chapter
18
converts an interval like \(\[90, 99\]\) to the index
\(9\). In this chapter, the situations involve intervals
which are *irregular*, and a different solution is required.

## Thinking About The Problem

The tax table has two convenient properties (that are common to a wide
variety of similar situations). First, the union of all of the intervals
is the relevant subset of the real numbers (in this case, the
non-negative reals). In other words, the tax table contains the marginal
tax rate for every possible income. Second, the intervals are
*disjoint*. That is, the intersection of any two intervals is the empty
set.

Hence, each income has a unique marginal tax rate that can be determined using only a sequence of boundary values, \(b\_{0}, b\_{1}, \ldots, b\_{n-1}\). Specifically, assuming you want to “cover” the entire set of real numbers, interval \(0\) can be defined as \(\[-\infty, b_{0})\), interval \(1\) can be defined as \(\[b_{0}, b\_{1})\), interval \(2\) can be defined as \(\[b_{1}, b_{2})\), interval \(n-1\) can be defined as \(\[b_{n-2}, b_{n-1})\), and interval \(n\) can be defined as \(\[b_{n-1}, \infty)\). For the tax example, the boundaries for single taxpayers are \(0\), \(9326\), \(37951\), \(91901\), \(191651\), \(416701\), \(418401\), making the intervals \(\[-\infty, 0)\), \(\[0, 9326)\), \(\[9326, 37951)\), \(\[37951, 91901)\), \(\[91901, 191651)\), \(\[191651, 416701)\), \(\[416701, 418401)\), \(\[418401, \infty)\).

Since the boundaries are homogeneous (i.e., they are values of the same
type) they can be stored in a single array. For example, the boundaries
for single taxpayers can be stored in a `int[]`

named `single`

as
follows:

```
int[] single = {0, 9326, 37951, 91901, 191651, 416701, 418401};
```

Given this representation of the intervals, you could search through them as follows:

```
public static int indexOf(int value, int[] boundary) {
int i, n;
n = boundary.length;
for (i = 0; i < n-1; ++i) {
if ((value >= boundary[i]) && (value < boundary[i + 1])) {
return i + 1;
}
}
return n;
}
```

## The Pattern

While the implementation above is fine, it has a couple of shortcomings.
First, it uses a `for`

loop, which might lead someone reading the code
to think that the loop is determinate (or definite) when, in fact, it
isn’t. That is, someone reading the code might assume that there are
always exactly `n-1`

iterations when there can be fewer. Second, the
containment condition checks to see if the target value is greater than
or equal to the left boundary and less then the right boundary at every
iteration. However, both checks aren’t really necessary since the right
boundary of interval \(n-1\) is the same as the left
boundary of interval \(n\).

The first shortcoming can be corrected by using a `while`

loop. The
second shortcoming can be corrected by continuing to loop as long as the
target value is greater than or equal to the right boundary (meaning
that the correct interval has not yet been found). Combining the two
ideas leads to the following pattern:

```
public static int indexOf(int value, int[] boundary) {
int i, n;
n = boundary.length;
i = 0;
while ((i < n) && (value >= boundary[i])) ++i;
return i;
}
```

This algorithm increases the index as long as there are more intervals to check and the target value is greater than or equal to the right boundary.

## Examples

Continuing with the tax example, you can now find the tax bracket for a particular income level as follows:

```
int[] single = {0, 9326, 37951, 91901, 191651, 416701, 418401};
int bracket;
bracket = indexOf(125350, single);
```

You can then use a lookup-array (see Chapter 18) to find the marginal tax rate that corresponds to that tax bracket as follows:

```
double[] rate = {-1.0, 10.0, 15.0, 25.0, 28.0, 33.0, 35.0, 39.6};
double marginal;
marginal = rate[bracket];
```

## Some Warnings

The obvious thing to be careful about when using this pattern is which
side of the interval is open and which side is closed. Making mistakes
here can cause *off-by-one defects* that are difficult to find.

The other thing to be careful about is much less obvious. One might be tempted to combine the interval membership functionality and the look-up array functionality in a single method. For example, in the tax rate example, one might be tempted to do the following:

```
public static double taxRate(int income) {
int[] single = {0, 9326, 37951, 91901, 191651, 416701, 418401};
double[] rate = {-1.0, 10.0, 15.0, 25.0, 28.0, 33.0, 35.0, 39.6};
return rate[indexOf(income, single)];
}
```

The drawback of this seemingly elegant idea is that one often wants to perform more than one lookup with the same index.

Perhaps the easiest way to see this is with the grade example from
Chapter
18
on lookup arrays. One commonly wants to convert a numeric grade on a
0–100 scale to **both** a letter grade on an F–A scale and a numeric
grade on a 0–4 scale. Hence, one wants to do one interval membership
search and two array look-ups. This can be accomplished as follows:

```
int[] intervals = {0, 60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93};
double[] gp = { -1.0, 0.0, 0.7, 1.0, 1.3, 1.7, 2.0, 2.3, 2.7, 3.0, 3.3, 3.7, 4.0};
String[] letter = { "NA","F","D-","D","D+","C-","C","C+","B-","B","B+","A-","A"};
int i;
String out;
i = indexOf(88, intervals); o
ut = String.format("Grade: %s (%3.1f)", letter[i], gp[i]);
```

There’s no reason to do the interval membership search separately for
each of the two look-ups. Hence, it is better to have a separate
`indexOf()`

method.

## Looking Ahead

In some situations, the intervals don’t cover all the real numbers
(i.e., there are gaps). In such situations, the value might not be in
any interval. One way to handle this (and other situations) is to use
*conformal arrays* as discussed in Chapter
20
— use one array to hold the left boundaries and another to hold the
right boundaries.