Delimiting Strings

Many programs, whether they have a textual user interface or a graphical user interface, need to combine an array of String objects into a single String object that has a delimiter between every pair of elements. There are several ways of accomplishing this goal.

Motivation

Suppose you want to generate the String object "Rain,Sleet,Snow" from a String[] containing the elements "Rain", "Sleet", and "Snow". One way to think about the desired result is that there is a comma (the delimiter) between every pair of elements. A second way to think about the desired result is that there is a comma after every item except the last one. A third way to think about the desired result is that there is a comma before every item except the first one. As it turns out, each leads to a different implementation.

Review

Since you are going to iterate over the String[] and consider each element individually, the first conceptualization is a little awkward to deal with. Specifically, you will need to consider both element i and either element i-1 or element i+1 at each iteration. If you work with index i-1, you will need to ensure that there are at least two elements and then initialize the loop control variable to 1. If you work with index i+1, you will need to ensure that there are at least two elements and then terminate the loop at the length of the array minus two. Neither is impossible, but both seem unnecessarily complicated if they can be avoided. Fortunately, both of the other conceptualizations only require you to work with one element at a time using an accumulator (as discussed in Chapter 16), so the first conceptualization won’t be considered.
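
For comparison only, the awkwardness of the first conceptualization can be seen in a minimal sketch that works with indices i-1 and i. The class name PairwiseDelimiter and the method name join are hypothetical, chosen for illustration, and this is just one of several ways the pairwise idea could be written.

```java
class PairwiseDelimiter {
    // Hypothetical helper: treats the delimiter as something that sits
    // between element i-1 and element i.
    static String join(String[] item, String delim) {
        // With fewer than two elements there are no pairs to consider,
        // so these cases are handled separately in this sketch.
        if (item.length < 2) {
            return (item.length == 0) ? "" : item[0];
        }

        String result = "";
        // Start at 1 so that index i-1 is always valid
        for (int i = 1; i < item.length; ++i) {
            result += item[i - 1] + delim;
        }
        // The last element is never the left half of a pair,
        // so it has to be added after the loop.
        result += item[item.length - 1];
        return result;
    }

    public static void main(String[] args) {
        String[] weather = {"Rain", "Sleet", "Snow"};
        System.out.println(join(weather, ",")); // Rain,Sleet,Snow
    }
}
```

Both the length check before the loop and the extra statement after it are there only because each iteration looks at two elements.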

Thinking About The Problem

The other two conceptualizations differ in that one appends the delimiter after concatenating an element to the accumulator, and the other prepends the delimiter before concatenating an element to the accumulator.

Appending the Delimiter

The second conceptualization requires you to append the delimiter after every item except the last one. Assuming item contains the String[], delim contains the delimiter, and result is a previously declared String that serves as the accumulator, this can be implemented as follows:

        // Append the delimiter when needed
        result = "";        
        for (int i = 0; i < item.length; ++i) {
            result += item[i];
            if (i < item.length - 1) {
                result += delim;
            }
        }

It is also possible to treat arrays of length 0 and 1 as special cases, initializing the accumulator accordingly, and then start the loop with element 1, as in the following implementation:

        // Append the delimiter when needed, initializing
        // the accumulator based on the length
        if (item.length > 1) {
            result = item[0] + delim;
        } else if (item.length > 0) {
            result = item[0];
        } else {
            result = "";
        }
        
        for (int i = 1; i < item.length - 1; ++i) {
            result += item[i];
            result += delim;
        }

        if (item.length > 1) {
            result += item[item.length - 1];
        }

This eliminates the need for an if statement within the loop.

Prepending the Delimiter

The third conceptualization requires you to prepend the delimiter before every item except the first one. This can be implemented as follows:

        // Prepend the delimiter when needed
        result = "";        
        for (int i = 0; i < item.length; ++i) {
            if (i > 0) {
                result += delim;
            }
            result += item[i];
        }

Again, the if statement in the loop can be eliminated by treating element 0 as a special case, as follows:

        // Prepend the delimiter when needed, initializing
        // the accumulator based on the length
        if (item.length > 0) {
            result = item[0];
        } else {
            result = "";
        }
        
        for (int i = 1; i < item.length; ++i) {
            result += delim + item[i];
        }

The Pattern

At first glance, you might not prefer one solution to the other. However, if you consider a slight variant of the problem, your assessment might change. In particular, suppose you want to be able to use a different delimiter before the last element. Specifically, suppose you want to generate the String "Rain, Sleet and Snow". You now need to distinguish the “normal” delimiter (the comma and space) from the “last” delimiter (the word “and” surrounded by spaces).

The append approach can be implemented with the if statement in the loop as follows:

        // Append the delimiter when needed
        result = "";        
        for (int i = 0; i < item.length; ++i) {
            result += item[i];
            if (i < item.length - 2) {
                result += delim;
            } else if (i == item.length - 2) {
                result += lastdelim;
            }
        }

and without the if statement in the loop as follows:

        // Append the delimiter when needed, initializing
        // the accumulator based on the length
        if (item.length > 2) {
            result = item[0] + delim;
        } else if (item.length > 1) {
            result = item[0] + lastdelim;
        } else if (item.length > 0) {
            result = item[0];
        } else {
            result = "";
        }
        
        for (int i = 1; i < item.length - 2; ++i) {
            result += item[i];
            result += delim;
        }

        if (item.length > 2) {
            result += item[item.length - 2] + lastdelim + item[item.length - 1];
        } else if (item.length > 1) {
            result += item[item.length - 1];
        }

The prepend approach can be implemented with the if statement in the loop as follows:

        // Prepend the delimiter when needed
        result = "";        
        for (int i = 0; i < item.length; ++i) {
            if (i > 0) {
                if (i < item.length - 1) {
                    result += delim;
                } else if (i == item.length - 1) {
                    result += lastdelim;
                }
            }
            result += item[i];
        }

and without the if statement in the loop as follows:

        // Prepend the delimiter when needed, initializing
        // the accumulator based on the length
        if (item.length > 0) {
            result = item[0];
        } else {
            result = "";
        }
        
        for (int i = 1; i < item.length - 1; ++i) {
            result += delim;
            result += item[i];
        }

        if (item.length > 1) {
            result += lastdelim + item[item.length - 1];
        }

Whether to have the if statement in the loop or not is subject to debate: the implementations that have the if statement inside of the loop seem more elegant, but they evaluate an extra condition on every iteration and so do slightly more work than those that do not. Choosing between the two implementations that have if statements in the loop is easier. The prepend approach requires either nested if statements or a single if statement with multiple conditions, so the append approach is more elegant.

Choosing elegance over efficiency, this leads to a programming pattern that consists of two methods. One method has three parameters, the String[], the “normal” delimiter, and the “last” delimiter, and uses the append approach. The other method has two parameters, the String[] and the delimiter, and simply invokes the three-parameter version. These two methods can be implemented as follows:

    public static String toDelimitedString(String[] item, 
                                           String delim, String lastdelim) {
        String result;

        result = "";        
        for (int i = 0; i < item.length; ++i) {
            result += item[i];
            if (i < item.length - 2) {
                result += delim;
            } else if (i == item.length - 2) {
                result += lastdelim;
            }
        }
        return result;
    }

    public static String toDelimitedString(String[] item, String delim) {
        return toDelimitedString(item, delim, delim);
    }
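
As a quick check of the pattern, the following sketch wraps the two methods in a small class and calls them on arrays of different lengths. The class name DelimitedStringsDemo and the sample arrays are assumptions made for this illustration; the method bodies follow the append approach described above.

```java
class DelimitedStringsDemo {
    // Three-parameter version, using the append approach
    static String toDelimitedString(String[] item,
                                    String delim, String lastdelim) {
        String result = "";
        for (int i = 0; i < item.length; ++i) {
            result += item[i];
            if (i < item.length - 2) {
                result += delim;      // "normal" delimiter
            } else if (i == item.length - 2) {
                result += lastdelim;  // delimiter before the last element
            }
        }
        return result;
    }

    // Two-parameter version: the "last" delimiter is the same as the others
    static String toDelimitedString(String[] item, String delim) {
        return toDelimitedString(item, delim, delim);
    }

    public static void main(String[] args) {
        String[] none = {};
        String[] one = {"Rain"};
        String[] two = {"Rain", "Sleet"};
        String[] three = {"Rain", "Sleet", "Snow"};

        System.out.println(toDelimitedString(three, ","));           // Rain,Sleet,Snow
        System.out.println(toDelimitedString(three, ", ", " and ")); // Rain, Sleet and Snow
        System.out.println(toDelimitedString(two, ", ", " and "));   // Rain and Sleet
        System.out.println(toDelimitedString(one, ", ", " and "));   // Rain
        System.out.println(toDelimitedString(none, ", ", " and "));  // (an empty line)
    }
}
```

Note that arrays of length 0, 1, and 2 need no special handling: neither condition in the loop is ever true for an array of length 0 or 1, and only the second is true for an array of length 2.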

Examples

Delimited strings get used in both the formatting of numerical data and the formatting of textual information. This pattern can easily be used for both.

One common way of formatting data is called comma-separated values (CSV), which uses a comma as the delimiter between the different fields in a record. Two other common ways of formatting data are tab-delimited and space-delimited. Nothing special needs to be done to handle any of these schemes; simply use the two-parameter version of the method.
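
A minimal sketch of formatting a numerical record in these three ways follows. The class name RecordFormatter, the helper toStrings, and the sample values are hypothetical; the sketch also assumes that no field contains the delimiter itself, since the pattern does no quoting or escaping.

```java
class RecordFormatter {
    // The pattern's three-parameter method (append approach)
    static String toDelimitedString(String[] item,
                                    String delim, String lastdelim) {
        String result = "";
        for (int i = 0; i < item.length; ++i) {
            result += item[i];
            if (i < item.length - 2) {
                result += delim;
            } else if (i == item.length - 2) {
                result += lastdelim;
            }
        }
        return result;
    }

    // The pattern's two-parameter method
    static String toDelimitedString(String[] item, String delim) {
        return toDelimitedString(item, delim, delim);
    }

    // Hypothetical helper: converts numeric fields to their String form
    static String[] toStrings(double[] value) {
        String[] result = new String[value.length];
        for (int i = 0; i < value.length; ++i) {
            result[i] = String.valueOf(value[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] record = {1.5, 2.0, 3.25};
        String[] field = toStrings(record);

        System.out.println(toDelimitedString(field, ","));  // 1.5,2.0,3.25
        System.out.println(toDelimitedString(field, "\t")); // tab-delimited
        System.out.println(toDelimitedString(field, " "));  // 1.5 2.0 3.25
    }
}
```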

When formatting text, there are two common approaches. Both append a comma after every word but the penultimate and ultimate words. Both also append the word “and” after the penultimate word. They differ in whether or not the “and” is preceded by a comma. For example, some style guides use only the word “and” (as in “rain, sleet and snow”) while others use both a comma and the word “and” (as in “rain, sleet, and snow”). The last comma is commonly known as the Oxford comma. You could change the logic in the solution above to handle the Oxford comma by making the strong inequality a weak inequality and eliminating the else. However, it’s much better just to change the final delimiter from " and " to ", and ".
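
The two styles can be compared with a short sketch that changes only the final delimiter. The class name OxfordComma and the sample words are assumptions made for this illustration.

```java
class OxfordComma {
    // The pattern's three-parameter method (append approach)
    static String toDelimitedString(String[] item,
                                    String delim, String lastdelim) {
        String result = "";
        for (int i = 0; i < item.length; ++i) {
            result += item[i];
            if (i < item.length - 2) {
                result += delim;
            } else if (i == item.length - 2) {
                result += lastdelim;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] three = {"rain", "sleet", "snow"};
        String[] two = {"rain", "sleet"};

        // Without the Oxford comma
        System.out.println(toDelimitedString(three, ", ", " and "));  // rain, sleet and snow
        // With the Oxford comma: only the final delimiter changes
        System.out.println(toDelimitedString(three, ", ", ", and ")); // rain, sleet, and snow
        // With two elements the final delimiter is the only one used
        System.out.println(toDelimitedString(two, ", ", ", and "));   // rain, and sleet
    }
}
```

The last call shows a limitation of this shortcut: with only two elements, the ", and " delimiter still inserts a comma, which is probably not what a style guide intends, so a caller may want to pass " and " when the array has exactly two elements.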


