# Indicators

Many programs must perform calculations that vary based on conditions of one kind or another. There are many ways to accomplish this, but one very powerful (and common) solution is to use a multiplicative variable that takes on the value zero when the condition isn’t satisfied and the value one when it is.

## Motivation

Variables of this kind are common in many branches of mathematics and
are called *indicator variables*. They are often denoted using a
lowercase delta (i.e., \(\delta\)), usually with a
subscript to denote the condition (e.g., \(\delta_s\) to
indicate whether a person smokes or not). Indicator variables are then
multiplied by other variables in more complex expressions.

For example, suppose you are writing a program to predict the birth weight of babies (in grams) from the gestation period (in weeks). You might theorize that the weight will be lower if the mother smokes during pregnancy. Ignoring whether the mother did or didn’t smoke, after collecting data from a (random or representative) sample of the population, you might determine a relationship like the following:

\[w = -2200 + 148.2 g\]

where \(w\) (the *dependent variable*) denotes the birth
weight (in grams), and \(g\) (an *independent variable*)
denotes the gestation period (in weeks). Accounting for the mother’s
smoking behavior, you might determine that the birth weight was, on
average, 238.6 grams lower when the mother smoked. You now need to
decide how to account for this in your program.

## Thinking About The Problem

What you want to do is lower \(w\) when the mother smoked
and leave \(w\) unchanged when the mother didn’t smoke.
Since there are only two possible states (i.e., the mother smoked or
didn’t), you might be tempted to use a `boolean` variable to keep track
of this information. However, it turns out that it is better to use a
discrete variable that is assigned either `0` or `1`, rather than one
that takes on the values `true` or `false`. The reason is that you can
use a numeric variable with the multiplication operator, and you can’t
do so with a `boolean` variable. In other words, a `boolean` variable
can’t be either the right-side or the left-side operand of the
multiplication operator.

In particular, suppose you add another independent variable, \(\delta_s\), and assign the value \(1\) to \(\delta_s\) if the mother smoked during pregnancy and assign the value \(0\) to \(\delta_s\) otherwise. Then, the equation for \(w\) can be expressed succinctly as follows:

\[w = -2200 + 148.2 g - 238.6 \delta_s\]

In this way, \(w\) will be reduced by \(238.6\) when \(\delta_s\) is \(1\) and will be left unchanged when \(\delta_s\) is \(0\).

Note that you could define the indicator variable differently.
Specifically, you could assign \(1\) to
\(\delta_s\) if the mother **didn’t smoke** during
pregnancy and assign \(0\) to it otherwise. In this case,
the equation for \(w\) would be \(w = -2438.6 +
148.2 g + 238.6 \delta_s\) (i.e., the constant would change and
the sign of the last term would be reversed). The two indicators are
called *converses* of each other.

## The Pattern

In the simplest cases, all you need to do to use this pattern is to
define an `int` variable, assign `0` or `1` to it as appropriate, and
then use it multiplicatively in an expression. In more complicated
cases, you may need multiple indicators, each with its own multiplier.

The converse indicator must take on the value `1` when the original
indicator takes on the value `0`, and vice versa. This can be
accomplished by subtracting the original indicator’s value from `1` and
assigning the result to the converse indicator. In other words, the
converse indicator is simply 1 minus the original indicator. Which is
the “original” and which is the “converse” is completely arbitrary.

This idea can be expressed as the following pattern:

```
total = base + (indicator * adjustment);
```

with the converse indicator given by:

```
converse = 1 - indicator;
```
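As a minimal sketch of how the two statements above fit together in Java, consider the following. The class name `IndicatorPattern`, the method names, and the sample values are illustrative assumptions, not part of the pattern itself.

```java
public class IndicatorPattern {
    // Applies the adjustment only when the indicator is 1.
    // Assumption: the caller passes only 0 or 1 as the indicator.
    static double total(double base, int indicator, double adjustment) {
        return base + (indicator * adjustment);
    }

    // The converse is 1 when the indicator is 0, and 0 when it is 1.
    static int converse(int indicator) {
        return 1 - indicator;
    }

    public static void main(String[] args) {
        int indicator = 1;
        // Indicator is 1, so the adjustment is included.
        System.out.println(total(100.0, indicator, 25.0));
        // The converse is 0, so the adjustment is left out.
        System.out.println(total(100.0, converse(indicator), 25.0));
    }
}
```

Wrapping the expression in a method is just one way to organize the code; the single assignment statement works equally well inline.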

## Examples

Returning to the birth weight example, the code for calculating the weight can be implemented as follows:

```
w = -2200.0 + (148.2 * g) - (238.6 * delta_s);
```

where `w` contains the weight, `g` contains the gestation period, and
`delta_s` contains the value `1` if the mother smoked and `0` otherwise.
Initializing `g` to the average gestation period of `40.0` weeks, you
could then use the statement to compare the birth weights for the two
possible values of `delta_s`. A `delta_s` of `0` would result in a birth
weight of `3728.0` while a `delta_s` of `1` would result in a birth
weight of `3489.4`.
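The comparison described above can be sketched as a small, complete program. The class name `BirthWeight` and the method name `predict` are assumptions made for illustration, and the variable is named `deltaS` here only to follow the usual Java naming convention.

```java
public class BirthWeight {
    // Predicted birth weight in grams for a gestation period of g weeks.
    // deltaS is the indicator: 1 if the mother smoked, 0 otherwise.
    static double predict(double g, int deltaS) {
        return -2200.0 + (148.2 * g) - (238.6 * deltaS);
    }

    public static void main(String[] args) {
        double g = 40.0; // average gestation period in weeks

        // Evaluate the same statement for both values of the indicator.
        for (int deltaS = 0; deltaS <= 1; deltaS++) {
            System.out.printf("delta_s = %d: %.1f grams%n",
                              deltaS, predict(g, deltaS));
        }
    }
}
```

The output is rounded to one decimal place because floating-point arithmetic may not reproduce the values exactly.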

As another example, suppose the fine associated with a first parking
ticket is smaller than the fine associated with subsequent parking
tickets (specifically, $10.00 for the first ticket and $45.00 for
subsequent tickets). In this case, if you assign `0` to
`ticketedIndicator` when the person has no prior parking tickets and
assign `1` to it otherwise, then you can write the statement to
calculate the fine as follows:

```
baseFine = 10.00;
repeatOffenderPenalty = 35.00;
totalFine = baseFine + (ticketedIndicator * repeatOffenderPenalty);
```
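The fragment above does not say where the value of `ticketedIndicator` comes from. One possible sketch, assuming the program tracks a non-negative count of prior tickets (the class name `ParkingFine`, the parameter `priorTickets`, and the use of `Math.min` are all assumptions for illustration), is:

```java
public class ParkingFine {
    static final double BASE_FINE = 10.00;
    static final double REPEAT_OFFENDER_PENALTY = 35.00;

    // Assumption: priorTickets is never negative, so the result is 0 or 1.
    static int ticketedIndicator(int priorTickets) {
        return Math.min(priorTickets, 1);
    }

    // The same statement as in the text, wrapped in a method.
    static double totalFine(int ticketedIndicator) {
        return BASE_FINE + (ticketedIndicator * REPEAT_OFFENDER_PENALTY);
    }

    public static void main(String[] args) {
        System.out.println(totalFine(ticketedIndicator(0))); // no prior tickets
        System.out.println(totalFine(ticketedIndicator(3))); // three prior tickets
    }
}
```

With no prior tickets the fine is the base fine of $10.00; with any prior tickets it is $45.00.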

As a final example, consider a rental car company that charges a base
rate of $19.95 per day. There is a surcharge of $5.00 per day if
multiple people drive the car, and a surcharge of $10.00 per day if any
driver is under 25 years of age. If you assign `1` to `multiIndicator`
when there are multiple drivers (and `0` otherwise) and you assign `1`
to `youngIndicator` when there are any drivers under 25 (and `0`
otherwise), then you can write the statement to calculate the rate as
follows:

```
baseRate = 19.95;
ageSurcharge = 10.00;
multiSurcharge = 5.00;
rate = baseRate + (multiIndicator * multiSurcharge)
        + (youngIndicator * ageSurcharge);
```
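To see how the two indicators combine, the following sketch evaluates the statement for all four combinations of values. The class name `RentalRate` and the method name `rate` are assumptions made for illustration.

```java
public class RentalRate {
    static final double BASE_RATE = 19.95;
    static final double AGE_SURCHARGE = 10.00;
    static final double MULTI_SURCHARGE = 5.00;

    // Daily rate; each indicator is expected to be 0 or 1.
    static double rate(int multiIndicator, int youngIndicator) {
        return BASE_RATE + (multiIndicator * MULTI_SURCHARGE)
                         + (youngIndicator * AGE_SURCHARGE);
    }

    public static void main(String[] args) {
        // Try every combination of the two indicators.
        for (int multi = 0; multi <= 1; multi++) {
            for (int young = 0; young <= 1; young++) {
                System.out.printf("multi = %d, young = %d: %.2f%n",
                                  multi, young, rate(multi, young));
            }
        }
    }
}
```

Each surcharge appears in the rate only when its own indicator is `1`, so the daily rate ranges from $19.95 (neither surcharge) to $34.95 (both surcharges).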

## Some Warnings

The descriptions of the examples in this chapter may have led you to use a different solution than the one discussed above. While you may, in the end, prefer such a solution, you should think carefully about the advantages and disadvantages before you make any decisions.

### Using `if` Statements

You might be tempted to use a `boolean` variable, an `if` statement, and
the updating pattern from Chapter 2 rather than an indicator, and there
are times when this is appropriate. However, in general, indicators are
much less verbose.

For example, returning to the birth weight problem, if you assign `true`
to `smoker` when the mother smoked during the pregnancy, then you can
calculate the birth weight as follows:

```
w = -2200.0 + (148.2 * g);
if (smoker) {
    w -= 238.6;
}
```

This solution is much less concise than the solution that uses indicator variables. It also treats the continuous independent variable (\(g\) in this case) and the discrete independent variable (\(\delta_s\) in this case) differently, for no apparent reason.

This approach gets even more verbose as the number of discrete
independent variables increases. For example, returning to the car
rental problem, if you assign `true` to `areMultipleDrivers` when there
are multiple drivers and you assign `true` to `areYoung` when there are
any drivers under 25, then you can calculate the rental rate as follows:

```
baseRate = 19.95;
ageSurcharge = 10.00;
multiSurcharge = 5.00;
rate = baseRate;
if (areMultipleDrivers) {
    rate += multiSurcharge;
}
if (areYoung) {
    rate += ageSurcharge;
}
```

When using indicator variables, each additional discrete independent
variable only leads to an additional term in the single assignment
statement. When using `boolean` variables, each additional discrete
independent variable leads to an additional `if` statement.
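To make the comparison concrete, the following sketch places the two approaches side by side for the car rental problem and prints the rate each one produces. The class name `RateComparison` and both method names are assumptions made for illustration.

```java
public class RateComparison {
    static final double BASE_RATE = 19.95;
    static final double AGE_SURCHARGE = 10.00;
    static final double MULTI_SURCHARGE = 5.00;

    // Indicator version: one assignment, one term per condition.
    static double rateWithIndicators(int multiIndicator, int youngIndicator) {
        return BASE_RATE + (multiIndicator * MULTI_SURCHARGE)
                         + (youngIndicator * AGE_SURCHARGE);
    }

    // boolean version: one if statement per condition.
    static double rateWithIfs(boolean areMultipleDrivers, boolean areYoung) {
        double rate = BASE_RATE;
        if (areMultipleDrivers) {
            rate += MULTI_SURCHARGE;
        }
        if (areYoung) {
            rate += AGE_SURCHARGE;
        }
        return rate;
    }

    public static void main(String[] args) {
        // Multiple drivers, none of them under 25.
        System.out.printf("%.2f%n", rateWithIndicators(1, 0));
        System.out.printf("%.2f%n", rateWithIfs(true, false));
    }
}
```

Both methods should produce the same rate for corresponding inputs; the difference is only in how many statements each needs as conditions are added.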

### Using Ternary Operators

You might also be tempted to use a `boolean` variable, the ternary
conditional operator, and the updating pattern from Chapter 2 rather
than an indicator, but this is almost never appropriate. For
example, returning to the parking ticket problem, if you assign the
value `true` to the `boolean` variable `hasBeenTicketed` when the person
has a previous ticket, then you can calculate the total fine as follows:

```
baseFine = 10.00;
repeatOffenderPenalty = 35.00;
totalFine = hasBeenTicketed ? baseFine + repeatOffenderPenalty : baseFine;
```

Some people do prefer this solution to the one that uses an `if`
statement for stylistic reasons. That is, they think the ternary
conditional operator is more concise. However, it is not more concise
than the solution that uses an indicator variable, so it is hard to
argue that it should be preferred.

Further, when the number of discrete independent variables increases, this approach gets much less concise. Returning to the car rental problem, you could calculate the rental rate as follows:

```
baseRate = 19.95;
ageSurcharge = 10.00;
multiSurcharge = 5.00;
rate = areMultipleDrivers
    ? baseRate + multiSurcharge + (areYoung ? ageSurcharge : 0.0)
    : baseRate + (areYoung ? ageSurcharge : 0.0);
```

However, this statement is very verbose (and, many people think, difficult to understand).

You could, instead, calculate the rental rate as follows:

```
baseRate = 19.95;
ageSurcharge = 10.00;
multiSurcharge = 5.00;
rate = baseRate;
rate += areMultipleDrivers ? multiSurcharge : 0.0;
rate += areYoung ? ageSurcharge : 0.0;
```

Again, while some people may prefer this solution to the one that uses
`if` statements because it is more concise, it is less concise than the
solution that uses indicator variables.