Strings in PHP

As we’ve previously seen, PHP has a built-in string type. Internally, a PHP string is simply a sequence of bytes, but for our purposes we can treat it as a 0-indexed character array. PHP strings are mutable, but it is considered best practice to treat them as immutable and rely on the many functions PHP provides to manipulate strings.

Basics

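We can create strings by assigning a string literal value to a variable. String literals can be delimited by either single quotes or double quotes (there is no individual character type in PHP, only single-character strings). Double-quoted strings interpret escape sequences such as \n and expand variables, while single-quoted strings are taken almost literally. We will use the double quote syntax.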

$firstName = "Thomas";
$lastName = "Waits";

//we can also reassign values
$firstName = "Tom";

The reassignment in the last line in the example effectively destroys the old string. The assignment operator can also be used to make copies of strings.

$firstName = "Thomas";
$alias = $firstName;

It is important to understand that this assignment makes a deep copy of the string. Changes to the first do not affect the second one. You can make changes to individual characters in a string by treating it like a zero-indexed array.

$a = "hello";
$a[0] = "H";
$a[5] = "!";
//a is now "Hello!"

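The last line extends the string by adding an additional character. However, you cannot remove a character by assigning it the empty string: as of PHP 7.1 this throws an error, and in earlier versions it wrote a NUL byte instead of removing the character. To remove characters, use a function such as substr_replace() instead.

$a = "Apples!";
$a = substr_replace($a, "", 5, 1);
//a is now "Apple!"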
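To see the copy semantics described above in action, here is a minimal sketch. The variable names follow the earlier examples; the specific modifications are chosen only for illustration.

```php
<?php
// Assigning a string to another variable copies its value.
$firstName = "Thomas";
$alias = $firstName;

// Change the original, first by index and then by reassignment.
$firstName[0] = "t";
$firstName = $firstName . "!";

printf("firstName = %s\n", $firstName); // the modified string
printf("alias = %s\n", $alias);         // still the original value
```

Because $alias holds its own copy, it still contains "Thomas" after $firstName has been changed.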

String Functions

PHP provides dozens of convenient functions that allow you to process and modify strings. We highlight a few of the more common ones here. A full list of supported functions can be found in standard documentation. Because of the history of PHP, many of the same functions defined in the C string library can also be used in PHP.

Length

When accessing individual characters in a string, we need to know the length of the string so that we do not access invalid indices (doing so is not a fatal error, but PHP emits a notice or warning, depending on the version, and the result is an empty string). The strlen() function returns an integer that represents the number of bytes in the string, which is the same as the number of characters for single-byte text such as ASCII.

$s = "Hello World!";
$x = strlen($s); //x is 12
$s = "";
$x = strlen($s); //x is 0

//careful:
$s = NULL;
$x = strlen($s); //x is 0 (passing null is deprecated as of PHP 8.1)

As demonstrated in the last example, strlen() will return 0 for null strings. Recall that we can distinguish between these two situations by using is_null(). Using strlen(), we can iterate over each individual character in a string.

$fullName = "Tom Waits";
for($i=0; $i<strlen($fullName); $i++) {
    printf("fullName[%d] = %s\n", $i, $fullName[$i]);
}

This would print the following:

fullName[0] = T
fullName[1] = o
fullName[2] = m
fullName[3] =
fullName[4] = W
fullName[5] = a
fullName[6] = i
fullName[7] = t
fullName[8] = s
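The difference between a null string and an empty string can be made explicit with a small helper. The sketch below is only an illustration: describeString() is a hypothetical function, not part of PHP, and the "\u{...}" escape assumes PHP 7.0 or later. It also shows that strlen() counts bytes, which matters for text outside the ASCII range.

```php
<?php
// describeString() is a hypothetical helper, not a PHP built-in.
function describeString($s) {
    if (is_null($s)) {
        return "null";
    }
    if ($s === "") {
        return "empty";
    }
    return strlen($s) . " byte(s)";
}

$a = describeString(NULL);           // "null"
$b = describeString("");             // "empty"
$c = describeString("Hello World!"); // "12 byte(s)"

// "\u{00E9}" is the single character e-acute, stored as 2 bytes in UTF-8.
$d = describeString("\u{00E9}");     // "2 byte(s)"

printf("%s\n%s\n%s\n%s\n", $a, $b, $c, $d);
```

Checking is_null() first avoids passing null to strlen() at all.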

Concatenation

PHP has a concatenation operator built into the language. To concatenate two or more strings together, place a period between them; the period is the concatenation operator. Concatenation results in a new string.

$firstName = "Tom";
$lastName = "Waits";

$formattedName = $lastName . ", " . $firstName;
//formattedName now contains "Waits, Tom"

Concatenation also works with other variable types; non-string values are converted to strings automatically.

$x = 10;
$y = 3.14;

$s = "Hello, x is " . $x . " and y = " . $y;
//s contains "Hello, x is 10 and y = 3.14"
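When a string is built up piece by piece, the compound operator .= appends to an existing variable, and the built-in implode() joins the elements of an array with a separator. The following is a small sketch with made-up data showing that both approaches can produce the same result.

```php
<?php
$parts = array("Tom", "Thomas", "Waits");

// Build the string incrementally with the .= operator.
$list = "";
for ($i = 0; $i < count($parts); $i++) {
    if ($i > 0) {
        $list .= ", ";
    }
    $list .= $parts[$i];
}

// implode() joins the array elements with the given separator.
$joined = implode(", ", $parts);

printf("%s\n%s\n", $list, $joined);
```

Both $list and $joined end up containing "Tom, Thomas, Waits".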

Computing a Substring

PHP provides a simple function, substr(), to compute a substring of a string. It takes at least 2 arguments: the string to operate on and the starting index. There is a third, optional parameter that allows you to specify the length of the resulting substring.

$name = "Thomas Alan Waits";

$firstName = substr($name, 0, 6); //"Thomas"
$middleName = substr($name, 7, 4); //"Alan"
$lastName = substr($name, 12); //"Waits"

In the final example, omitting the optional length parameter results in the entire remainder of the string being returned as the substring.
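Hard-coded indices only work for one particular string. As a sketch of a more general approach, the built-in strpos() and strrpos() functions can locate the first and last space, and substr() also accepts a negative start index that counts from the end of the string. This example assumes the name contains at least two spaces; strpos() returns false when nothing is found, which real code would need to check.

```php
<?php
$name = "Thomas Alan Waits";

// Locate the first and last space (assumes both exist).
$firstSpace = strpos($name, " ");
$lastSpace = strrpos($name, " ");

$firstName = substr($name, 0, $firstSpace);
$middleName = substr($name, $firstSpace + 1, $lastSpace - $firstSpace - 1);
$lastName = substr($name, $lastSpace + 1);

// A negative start index counts from the end of the string.
$lastFive = substr($name, -5);

printf("%s|%s|%s|%s\n", $firstName, $middleName, $lastName, $lastFive);
```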

Arrays of Strings

We often need to deal with collections of strings. In PHP we can define arrays of strings. Indeed, we’ve seen arrays of strings before: when processing command line arguments, PHP defines an array of strings, $argv. Each string can be accessed using an index; for example, $argv[0] is always the name of the script.

We can create our own arrays of strings using the same syntax as with other arrays.

$names = array(
    "Margaret Hamilton",
    "Ada Lovelace",
    "Grace Hopper",
    "Marie Curie",
    "Hedy Lamarr");

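Once the strings are in an array, we can process them like any other array. The sketch below iterates over the names with foreach and then sorts them with the built-in sort() function; passing SORT_STRING is a choice made here so that the elements are always compared as strings.

```php
<?php
$names = array(
    "Margaret Hamilton",
    "Ada Lovelace",
    "Grace Hopper",
    "Marie Curie",
    "Hedy Lamarr");

// Visit each string along with its index.
foreach ($names as $i => $name) {
    printf("names[%d] = %s (%d bytes)\n", $i, $name, strlen($name));
}

// Sort in place, comparing the elements as strings.
sort($names, SORT_STRING);
printf("first after sorting: %s\n", $names[0]);
```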
String Comparisons

When comparing strings in PHP, we can use the usual comparison operators such as ===, <, or <=, which compare two non-numeric strings lexicographically. However, this is generally discouraged because of type juggling issues and the differences between strict and loose comparisons; for example, the loose comparison "10" == "1e1" is true because both operands are numeric strings and are compared as numbers. Instead, PHP provides several comparator functions that compare strings based on their content. strcmp($a, $b) takes two strings and returns an integer based on the lexicographic ordering of $a and $b. If $a precedes $b, strcmp() returns a negative value. It returns zero if $a and $b have the same content. Otherwise, if $b precedes $a, it returns a positive value.

$x = strcmp("apple", "banana"); //x is negative
$x = strcmp("zelda", "mario"); //x is positive
$x = strcmp("Hello", "Hello"); //x is zero

//shorter strings precede longer strings:
$x = strcmp("apple", "apples"); //x is negative

$x = strcmp("Apple", "apple"); //x is negative

In the last example, "Apple" precedes "apple" since uppercase letters are ordered before lowercase letters in the ASCII table. When we need a case-insensitive comparison, we can use the alternative, strcasecmp($a, $b). Here, strcasecmp("Apple", "apple") returns zero because the two strings are the same when case is ignored.

The comparison functions also have length-limited versions, strncmp($a, $b, $n) and strncasecmp($a, $b, $n). Both will only make comparisons in the first $n characters of the strings. Thus, strncmp("apple", "apples", 5) will result in zero as the two strings are equal in the first 5 characters.
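Because these functions follow the usual comparator convention (negative, zero, positive), they can be passed directly to the built-in usort() function. The following sketch, using made-up data, sorts the same array with strcmp() and strcasecmp(), and also shows how a loose comparison with == can disagree with strcmp() for numeric strings.

```php
<?php
$fruits = array("banana", "apple", "Cherry");

// Case-sensitive: uppercase letters sort before lowercase ones.
$caseSensitive = $fruits;
usort($caseSensitive, "strcmp");

// Case-insensitive ordering.
$caseInsensitive = $fruits;
usort($caseInsensitive, "strcasecmp");

// Loose equality treats numeric strings as numbers; strcmp() does not.
$looseEqual = ("10" == "1e1");
$sameContent = (strcmp("10", "1e1") === 0);

printf("%s\n", implode(", ", $caseSensitive));
printf("%s\n", implode(", ", $caseInsensitive));
```

With strcmp() the order is "Cherry", "apple", "banana"; with strcasecmp() it is "apple", "banana", "Cherry". The two strings "10" and "1e1" are loosely equal but do not have the same content.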

Splitting a String in PHP

Tokenizing is the process of splitting up a string along some delimiter. For example, the comma-delimited string "Smith,Joe,12345678,1985-09-08" contains four pieces of data delimited by commas. Our aim is to split this string up into four separate strings so that we can process each one. PHP provides several functions to do this, including explode() and preg_split().

The simpler one, explode() takes two arguments: the first one is a string delimiter and the second is the string to be processed. It then returns an array of strings.

$data = "Smith,Joe,12345678,1985-09-08";

$tokens = explode(",", $data);
//tokens is [ "Smith", "Joe", "12345678", "1985-09-08" ]

$dateTokens = explode("-", $tokens[3]);
//dateTokens is now [ "1985", "09", "08" ]
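As a sketch of how the tokens might be used afterwards, the example below unpacks them into named variables with list(), converts the date components to integers with intval(), and rebuilds the original record with implode(). The variable names are chosen for illustration, and the code assumes the record always has exactly four fields.

```php
<?php
$data = "Smith,Joe,12345678,1985-09-08";

// Unpack the four fields into named variables.
list($lastName, $firstName, $id, $birthDate) = explode(",", $data);

// Split the date and convert each part to an integer.
list($year, $month, $day) = explode("-", $birthDate);
$year = intval($year);
$month = intval($month);
$day = intval($day);

// implode() is the inverse operation: it joins tokens with a delimiter.
$rebuilt = implode(",", array($lastName, $firstName, $id, $birthDate));

printf("%s %s was born on day %d of month %d, %d\n",
    $firstName, $lastName, $day, $month, $year);
```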

The more sophisticated preg_split() also takes two required arguments, but instead of a simple delimiter, it uses a regular expression: a sequence of characters that defines a search pattern, in which special characters can be used to define complex patterns. For example, the complex expression ^[+-]?(\d+(\.\d+)?|\.\d+)([eE][+-]?\d+)?$ will match any valid numerical value including scientific notation. We will not cover regular expressions in depth, but to demonstrate their usefulness, here’s an example by which you can split a string along any and all whitespace:

$s = "Alpha Beta \t Gamma \n Delta \t\nEpsilon";
$tokens = preg_split("/[\s]+/", $s);
//tokens is now [ "Alpha", "Beta", "Gamma", "Delta", "Epsilon" ]
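Two related details are worth a short sketch. First, when a string begins or ends with the delimiter, preg_split() produces empty tokens unless the optional PREG_SPLIT_NO_EMPTY flag is passed. Second, the numeric pattern mentioned above can be tested with the built-in preg_match() function, which returns 1 on a match and 0 otherwise. The input strings here are made up for illustration.

```php
<?php
$s = "  Alpha Beta \t Gamma\n";

// Leading and trailing whitespace produce empty tokens by default.
$withEmpty = preg_split('/\s+/', $s);

// The optional limit (-1 means no limit) and flags arguments drop them.
$tokens = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);

// The numeric pattern from the text, wrapped in / delimiters.
$numberPattern = '/^[+-]?(\d+(\.\d+)?|\.\d+)([eE][+-]?\d+)?$/';
$isNumber = (preg_match($numberPattern, "-1.5e10") === 1);
$isNotNumber = (preg_match($numberPattern, "1.") === 1);

printf("%d tokens with empties, %d without\n", count($withEmpty), count($tokens));
```

Here $withEmpty has five elements (the first and last are empty strings), while $tokens contains only "Alpha", "Beta", and "Gamma". The pattern accepts "-1.5e10" but rejects "1." because a decimal point must be followed by at least one digit.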
