Arrays of Arrays (two-dimensional arrays in Ruby)

2-d Arrays in Ruby

We can use a value of any type when initializing an array. For example, a String:

$ irb
> Array.new(10, 'hello')
=> ["hello", "hello", "hello", "hello", "hello", "hello", "hello", "hello", "hello", "hello"]

Or a Boolean (this type doesn’t exist in Ruby; in this book we intentionally refer to the two classes TrueClass and FalseClass together as Boolean):

$ irb
> Array.new(10, true)
=> [true, true, true, true, true, true, true, true, true, true]

Or an Integer:

$ irb
> Array.new(10, 123)
=> [123, 123, 123, 123, 123, 123, 123, 123, 123, 123]

In other words, an array element can be any object. Since an element is an object and an array is an object too, we can define an array of arrays:

$ irb
> Array.new(10, [])
 => [[], [], [], [], [], [], [], [], [], []]

If we access this array by index, we’ll reach one of the arrays inside the root array. For example, index 4 gives us the fifth element. Let’s try it in the REPL:

$ irb
> arr = Array.new(10, [])
 => [[], [], [], [], [], [], [], [], [], []]
> element = arr[4]
 => []
> element.class
 => Array

As you can see, we check the element’s class with .class, and when we access the element the REPL shows its value (the => [] line): it’s an empty array. What can one do with an empty array? For example, add something to it:

element.push('something')

And what do we expect? Let’s sum up what’s been said in this chapter:

  • We defined an array of arrays with a size of 10: arr = Array.new(10, [])
  • This array looks like this: [[], [], [], [], [], [], [], [], [], []]
  • We get the fifth element: element = arr[4]
  • And we add a value to this element: element.push('something')

Since element is an array and we add something to it, it will look like an array with one value inside:

['something']

And now we expect arr (the array of arrays) to look like this:

[[], [], [], [], ['something'], [], [], [], [], []]

Let’s check in the REPL:

> arr
=> [["something"], ["something"], ["something"], ["something"], ["something"], ["something"], ["something"], ["something"], ["something"], ["something"]]

Oh no! Something’s not right! Here is the program:

arr = Array.new(10, [])
element = arr[4]
element.push('something')
puts arr.inspect # the way to print information similar to REPL

Where is the mistake? If you’re a programmer coming from another language, it’s worth taking a break here to think about what could have gone wrong. This can also be a tricky interview question.

The answer isn’t obvious: you need to understand how Ruby works, what an object is, and what a reference (or pointer) is. Do you remember that we covered this topic a little bit?

…apartments house with multiple doorbells. New variable is similar to a doorbell that leads to this or another apartment. Doorbell is not apartment itself, but it’s associated with it.

We can also reproduce this issue with the String class:

arr = Array.new(10, 'something')
element = arr[4]
element.upcase!
puts arr.inspect

Expected result:

["something", "something", "something", "something", "SOMETHING", "something", "something", "something", "something", "something"]

Actual result:

["SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING"]

What’s going on here? We’re modifying only one element, yet all elements of the array get updated. Actually, it doesn’t matter which element we change: the fifth, the second, or the last. The result is always the same. Try it yourself!

The answer to this puzzle is the reference. When initializing the array, we pass a reference to a single object:

arr = Array.new(10, 'something')

'something' above is a String object (everything is an object in Ruby). Since we pass a reference to this single object, the array gets initialized with 10 cells that all hold exactly the same reference! In other words, there is no object inside a cell; there is a reference to an object.

To avoid this side effect we need these references to be different, so that they point to different objects. These objects will be placed in different locations in computer memory, even though they look the same.
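The shared reference can be observed directly. The snippet below is a small sketch, not part of the original program: it uses the standard equal? method, which compares object identity rather than contents, and dup, which makes a separate copy. The variable names shared, copy, and distinct are chosen only for this illustration.

```ruby
# Every cell of this array refers to the same String object.
arr = Array.new(3, 'something')

# equal? is true only when both sides are the very same object.
shared = arr.all? { |cell| cell.equal?(arr[0]) }
puts shared # true

# dup creates a separate object with the same contents.
copy = arr[0].dup
distinct = !copy.equal?(arr[0])
puts distinct     # true
puts copy == arr[0] # true (same contents, different object)
```

Here == answers “do these look the same?”, while equal? answers “is this the same apartment?”.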

It’s like having exactly the same type of beer in your six-pack: all bottles look the same, but they’re all different. If we change the state of one bottle, it won’t affect the state of other bottles.

Going back to the example of the apartment house with multiple doorbells, imagine the following scenario. We brought a box (the array) and want to put 10 doorbells inside that box. We did that, but all the wires lead to one specific apartment. It doesn’t matter which doorbell we use, we’ll get an answer from the same tenants.

If we want to fix that, we need these wires to lead to different apartments. So always avoid code like this; it’s wrong:

arr = Array.new(10, []) # <-- WRONG!

It’s wrong only because the inner array is supposed to change its state. Why would we need an empty array that never changes? It makes no sense, because one day we’ll want to add something to it; this is exactly what arrays were created for. With strings things are actually easier, and the following code is totally legit:

arr = Array.new(10, 'something')

But with one caveat: we must not use a “dangerous” operation on the String (or on an object of any other type). A dangerous operation is one that changes the state of an object; such methods usually have an exclamation mark at the end, for example: 'something'.upcase!. Do you understand why these methods are called “dangerous”?
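To illustrate the difference, here is a minimal sketch that uses the non-dangerous upcase (without the exclamation mark). It returns a new String instead of changing the shared one, so assigning the result back to a cell affects only that cell.

```ruby
arr = Array.new(3, 'something')

# upcase builds a new String; the shared original is left untouched.
arr[0] = arr[0].upcase

p arr # ["SOMETHING", "something", "something"]
```

The assignment replaces the reference in cell 0 only; the other two cells still point to the original object.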

And it’s safe to define arrays of numbers this way:

arr = Array.new(10, 123)

There are no dangerous methods in the Integer class: even though you can access a number stored in the array, you can’t modify it or change its state. You can only replace one object with another. In general, a replaced object doesn’t disappear immediately; it remains in computer memory for a while, until the garbage collector finds it.

So if you type arr[4] = 124, you’ll replace the reference in that cell with a reference to a new object (124). All other references to the previous “123” object will remain untouched.

With numbers we’re getting what we expect:

$ irb
> arr = Array.new(10, 123)
 => [123, 123, 123, 123, 123, 123, 123, 123, 123, 123]
> arr[4] = 124
 => 124
> arr
 => [123, 123, 123, 123, 124, 123, 123, 123, 123, 123]
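The same holds for arithmetic on a cell. The following sketch (an illustration added here, not from the original session) increments one cell with +=. Because += is shorthand for “compute a new value and assign it”, only that cell receives a new reference.

```ruby
arr = Array.new(5, 123)

# Same as arr[4] = arr[4] + 1: a new Integer is computed and assigned.
arr[4] += 1

p arr            # [123, 123, 123, 123, 124]
puts 123.frozen? # true (an Integer cannot change its state)
```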

It’s okay if these details look complicated, because they are. The good news is that in your everyday job you usually don’t deal with these complexities too often. You need only a basic understanding, so that when the moment comes you’ll remember that this could be the cause, and you’ll search the Internet for details.

Probably some experienced programmers won’t like this approach, and you’ll hear advice to learn this and that, before you start doing something and move forward. But the experience of Ruby School students shows that moving fast is a good way to go; if you don’t understand something, skip it and move on. You’ll get back to the part you don’t understand later, and often it’s more important to spend time on looking for your first software development job, rather than polishing theoretical knowledge.

But let’s get back to the beginning: how do we define a two-dimensional array? Imagine we’re programming a “Sea battle” game and we need a 10 by 10 array, with 10 rows, where each row contains 10 columns. How do we define an array where each element is a reference to a new and unique object?

First, let’s see how two-dimensional arrays are defined in C#:

var arr = new int[10, 10];

For the String type:

var arr = new string[10, 10];
arr[9, 9] = "something";

For some reason, the syntax in Ruby and JavaScript looks a little more complicated. Below is how you define a two-dimensional 10 by 10 array in Ruby (empty cells will be filled with nil):

arr = Array.new(10) { Array.new(10) }

Wow, but why does it look so magical? Let’s dive a little deeper into this line. The “new” method accepts one parameter and one block. The parameter is the size of the array. The block is executed while initializing every individual element, and the result of each execution becomes the new element. In our case the block will be executed 10 times. Here is how you can use a block with a String:

arr = Array.new(10) { 'something' }

The result is similar to the result of the following code:

arr = Array.new(10, 'something')

And it looks the same when you execute these two lines in the REPL:

$ irb
> arr1 = Array.new(10) { 'something' }
 => ["something", "something", "something", "something", "something", "something", "something", "something", "something", "something"]

> arr2 = Array.new(10, 'something')
 => ["something", "something", "something", "something", "something", "something", "something", "something", "something", "something"]

But there is one subtle difference. While initializing, the first statement calls the block. Every time the block gets called, we get a new instance of the String class with the value “something” in computer memory.

In the second statement (where we define the arr2 variable) Ruby takes the already initialized “something” that we pass as the second parameter. It gets created in memory before it is passed to Array.new, and a reference to this single instance is used for all cells of the array.
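One way to watch the block being called is to count the calls. The sketch below is an illustration with a hypothetical counter variable named calls; it also relies on the documented fact that Array.new passes the index of the element being initialized to the block.

```ruby
calls = 0

# The block runs once per element and receives that element's index.
arr1 = Array.new(3) do |index|
  calls += 1
  "cell #{index}"
end

p arr1     # ["cell 0", "cell 1", "cell 2"]
puts calls # 3
```

With the two-argument form, Array.new(3, 'something'), there is no block to call: the single object is created once, before Array.new even starts.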

The proof is quite easy, although to folks who aren’t very familiar with Ruby it looks like a magic trick. Modify the element at index 0 in the first array, where every cell has a reference to its own object (try the steps below in your REPL):

arr1[0].upcase!

The resulting arr1:

> arr1
 => ["SOMETHING", "something", "something", "something", "something", "something", "something", "something", "something", "something"]

As you can see, only the first element was changed, which proves that every cell of the first array has a reference to its own object. Now let’s follow the same steps for the second array:

> arr2[0].upcase!
 => "SOMETHING"
> arr2
 => ["SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING"]

Every element in the array was changed, because all cells in the second array hold a reference to the same object.

Can you guess how this program would work if, before “arr2[0].upcase!”, we reinitialized, let’s say, the fifth element?

> arr2[4] = 'something' # <-- REINITIALIZING FIFTH ELEMENT HERE
 => "something"
> arr2[0].upcase! # <-- CHANGE THE STATE OF OBJECT BY INDEX ZERO
 => "SOMETHING"
> arr2
 => ["SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "something", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING", "SOMETHING"]

That’s right, every cell refers to the object that was updated with “upcase!”, except the fifth one. The fifth element is a different object because it was reinitialized. That’s why an array of arrays must be defined as follows:

arr = Array.new(10) { Array.new(10) }

If we want to fill the array of arrays with some value (by default it’s nil), we must pass it to the second constructor (the “new” part of class initialization is called a “constructor”, to be covered later in this book):

arr = Array.new(10) { Array.new(10, 123) }

This is how you define a 10 by 10 two-dimensional array and initialize every cell with 0:

arr = Array.new(10) { Array.new(10, 0) }

Define a 2D array with 4 rows and 10 columns and initialize it with “0”:

arr = Array.new(4) { Array.new(10, 0) }

Define a 2D array with 2 rows and 3 columns and initialize it with “something”:

arr = Array.new(2) { Array.new(3, 'something') }

Define a 2D array with 3 rows and 2 columns and initialize it with “something”:

arr = Array.new(3) { Array.new(2, 'something') }
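To double-check the shape of such an array and the independence of its rows, one can try the following sketch. The names rows, columns, grid, and shared are made up for this illustration; the last part deliberately repeats the WRONG form without a block for contrast.

```ruby
rows = 4
columns = 10

# The block creates a fresh row array for every element of the outer array.
grid = Array.new(rows) { Array.new(columns, 0) }
grid[0][0] = 7

puts grid.size     # 4  (number of rows)
puts grid[0].size  # 10 (number of columns)
p grid[0].first(3) # [7, 0, 0]
p grid[1].first(3) # [0, 0, 0]

# For contrast: without the block, all four rows are one and the same array.
shared = Array.new(rows, Array.new(columns, 0))
shared[0][0] = 7
p shared[1].first(3) # [7, 0, 0]
```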

Hopefully, initializing two-dimensional arrays makes more sense now. Now that we understand what an array is, let’s see how we can initialize arrays with predefined values. One-dimensional arrays (or just “arrays”) are quite easy to initialize with whatever you want. For example:

arr = [1, 2, 3]

Or:

arr = ['one', 'two', 'three']

Every array has objects (technically, “every array holds references to objects”). A two-dimensional array is the same kind of array, with the only caveat that its elements are objects of type Array.

The pattern for defining an array of three elements (three strings, for example):

arr = [..., ..., ...]

But if you’re looking to define an array of empty arrays, use [] instead of “...”:

arr = [[], [], []]

The tic-tac-toe game is a good example of an array of arrays. For the following board let’s assume that “O” is represented by 0, “X” by 1, and an empty cell by nil:

[Image: tic-tac-toe board]

This is what this array would look like in Ruby:

arr = [[0, 0, 1], [nil, 0, nil], [1, nil, 1]]

Exactly the same expression is clearer when written across multiple lines:

arr = [
  [0, 0, 1],
  [nil, 0, nil],
  [1, nil, 1]
]

Spaces and empty lines won’t affect the execution of this program, so you can beautify it further if you really want to.
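As a quick illustration, the board can be turned back into text. The sketch below assumes the same encoding (0 for “O”, 1 for “X”, nil for an empty cell); the helper symbol_for and the choice of “.” for empty cells are inventions of this example only.

```ruby
board = [
  [0, 0, 1],
  [nil, 0, nil],
  [1, nil, 1]
]

# Hypothetical helper: translate a cell value into a printable symbol.
def symbol_for(cell)
  return '.' if cell.nil?
  cell == 0 ? 'O' : 'X'
end

# Build one line of text per row of the board.
lines = board.map do |row|
  row.map { |cell| symbol_for(cell) }.join(' ')
end

puts lines
# O O X
# . O .
# X . X
```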

Exercise 1 Try everything written above in the REPL: run every example and make sure you understand the concepts explained in this chapter (it’s okay if you don’t understand all of them; make a note and move on, you can come back to this chapter later).

Exercise 2 Create a 5 by 4 array (5 rows and 4 columns) and fill every cell of a row with the same random number from 1 to 5 (one random number per row). Example for a 2 by 3 array:

[
  [2, 2, 2],
  [5, 5, 5]
]

Exercise 3 Do the same exercise, but for a 4 by 5 array.

Exercise 4 Create a new 5 by 4 array and fill every cell with a random number from 0 to 9.

Accessing Array of Arrays

There is a trick to accessing a 2D (two-dimensional) array, or array of arrays. When accessing this type of array you need to access the row first and the column next. From previous chapters we know how to access a one-dimensional array:

puts arr[4]

Or if we want to change the value:

arr[4] = 123

Here 4 is the index. With 2D arrays we need to use two pairs of square brackets. For example, the following code will change the value of the cell in the fifth row and ninth column:

arr[4][8] = 123

For a long time at school we were told that in math the coordinates of a point are usually written as (x, y), where “x” is the horizontal axis (column) and “y” is the vertical one (row). Accessing arrays is a little confusing, because we need to access the row first and the column next, in other words (y, x).

To get a better feeling of how it works we can break down this expression into multiple lines:

row = arr[4] # Get the entire fifth row (an array) into a variable
row[8] = 123 # Change the ninth cell of that row to 123

Here is a way to print the value in the fifth row and ninth column:

row = arr[4] # Get the entire row into a variable
column = row[8] # Get the value of the cell into another variable
puts column # Print this value

However, programmers usually use more compact notation: arr[4][8]

Depending on the type of work you’re doing, different naming conventions can be used for rows and columns. Let’s look at the most common ones:

  • row, column. Accessing array: arr[row][column]
  • y - row, x - column. Accessing array: arr[y][x]
  • j - row, i - column. Accessing array: arr[j][i]

You absolutely need to remember that when dealing with 2D arrays you access the row (y, j) first. This is one of the most common pitfalls for beginners in interviews.
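If you prefer to think in (x, y) coordinates, a tiny wrapper can hide the reversed order. The method cell_at below is a hypothetical helper written for this illustration, not something built into Ruby.

```ruby
grid = [
  %w(a b c),
  %w(d e f)
]

# Hypothetical helper: takes math-style (x, y) and accesses row first.
def cell_at(grid, x, y)
  grid[y][x]
end

puts cell_at(grid, 2, 0) # c (column 2 of row 0)
puts cell_at(grid, 0, 1) # d (column 0 of row 1)
```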

Note that for an index we use a variable named “i”. If there is more than one index, use the next letters of the English alphabet: j, k, and so on. You don’t have to do that, and you’re free to name variables as you want. However, there are naming conventions, and if you follow them, other programmers will understand your code better.

Let’s create a new two-dimensional array and try to traverse it. Traversal of all kinds of arrays and other data structures is a very common task; make a note for yourself to practice it a lot before your next interview.

arr = [
  %w(a b c),
  %w(d e f),
  %w(g h i)  
]

0.upto(2) do |j|
  0.upto(2) do |i|
    print arr[j][i]
  end
end

Result:

abcdefghi

In the program above (lines 7-11) we see two nested blocks, that is, one loop inside another (an inner loop). But how does it work? We already know that the first loop (with the “j” variable) just “goes over” the array. It doesn’t know that we have an array of arrays. We can rewrite these lines to demonstrate that:

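0.upto(2) do |j|
  puts arr[j].inspect
end

So this loop goes over three elements, and each element is an array (although the loop isn’t aware of that). Note that we print with inspect: a plain puts arr[j] would print every letter on its own line. The program prints 3 lines:

["a", "b", "c"]
["d", "e", "f"]
["g", "h", "i"]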

Since each element is an array, we can iterate over it one more time, as we have already done many times before. The program above can be rewritten using the Array#each method:

arr = [
  %w(a b c),
  %w(d e f),
  %w(g h i)  
]

arr.each do |row|
  row.each do |value|
    print value
  end
end

This is actually the preferred way to go. Can you guess why? Lines 7-11 do not rely on the size of the array, so if you add more letters to your initial array, the program will still work correctly.

There is a way to rewrite this array definition without the “%w” helper, but readability will suffer a little:
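To see this in action, the sketch below adds a fourth, longer row (made up for this example) and traverses the array with each_with_index, a standard method that yields both the element and its index. Nothing in the loops mentions the sizes.

```ruby
arr = [
  %w(a b c),
  %w(d e f),
  %w(g h i),
  %w(j k l m) # an extra, longer row added for this illustration
]

result = ''
arr.each_with_index do |row, j|
  row.each_with_index do |value, i|
    result += value
    puts "row #{j}, column #{i}: #{value}"
  end
end

puts result # abcdefghijklm
```

The version with 0.upto(2) would silently skip the new row and the extra letter “m”.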

arr = [
  ['a', 'b', 'c'],
  ['d', 'e', 'f'],
  ['g', 'h', 'i']  
]

Exercise 1 Traverse the 3x3 array defined above manually, without loops, criss-cross, so that it prints “aeiceg”.

Exercise 2 In the REPL create a two-dimensional 3 by 3 array where each element is the string “something”. Define this array in such a way that every element is protected from dangerous operations. For example, the statement “arr[2][2].upcase!” should modify only one cell and must not affect the others.

Exercise 3 John Smith has a business with a large pool of phone numbers, which they sell for advertisements. They want to sign a contract with you, but first they want to ensure that you can follow their requirements and are capable of delivering quality software. They say: we have phone numbers with letters, like 555-MATRESS. When customers type this phone number on a phone keyboard, they reach “555-628-7377”. Write a method in Ruby that translates phone numbers without dashes, like “555MATRESS”, into phone numbers with digits only, like 5556287377. Method signature:

def phone_to_number(phone)
  # code here...
end

puts phone_to_number('555MATRESS') # should print 5556287377

Sample image of a phone keyboard:

[Image: phone keyboard]

Multi-dimensional Arrays

We are already familiar with two types of arrays: one-dimensional and two-dimensional. However, there can be more dimensions, so you get an “array of arrays of arrays”. Sometimes this is called a “tensor”. For example, a popular machine learning framework is called TensorFlow, which sounds like something flowing through a multi-dimensional array and changing its inner data. However, machine learning is a topic for another book (and it’s almost never done in Ruby; mostly Python, C++, and some other languages are used).

Here is an example of a 3-dimensional array in Ruby:

arr = [
  [
    %w(a b c),
    %w(d e f),
    %w(g h i)
  ],
  [
    %w(aa bb cc),
    %w(dd ee ff),
    %w(gg hh ii)  
  ]
]

It’s a 2 by 3 by 3 array: two blocks, where each block has 3 rows and each row has 3 columns.

The number of dimensions is just a property of an array; if you deal with these types of arrays, you need to know how to access them. For example, you can reach the element “f” with the expression “arr[0][1][2]”.
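The access described above can be tried out with a short sketch. Besides chained brackets it uses Array#dig, a standard method (available since Ruby 2.3) that takes all the indexes at once; the array is the same one, written more compactly.

```ruby
arr = [
  [%w(a b c), %w(d e f), %w(g h i)],
  [%w(aa bb cc), %w(dd ee ff), %w(gg hh ii)]
]

puts arr[0][1][2]     # f (block 0, row 1, column 2)
puts arr.dig(0, 1, 2) # f (same access with Array#dig)

puts arr.size       # 2 (blocks)
puts arr[0].size    # 3 (rows in a block)
puts arr[0][0].size # 3 (columns in a row)
```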

You probably won’t use arrays with three or more dimensions very often in your everyday job, but arrays are often combined with another data structure: the hash (the Hash class in Ruby). Multi-dimensional arrays require knowledge of their dimensions (depth), and if we add one row or column at the very beginning, we must update indexes everywhere in our program. Not very convenient. Moreover, every update risks introducing a new error.

But mixing arrays with hashes is a very powerful and widespread technique. Data structured this way is commonly written as JSON (JavaScript Object Notation). The name is a little confusing, because JSON is everywhere, not only in JavaScript: in Ruby, Java, Python, and practically every other programming language. A programmer doesn’t need to know an index (some number) to access a specific value in JSON, because JSON objects are accessed via keys, where a key is a string like “date_of_birth”. Much easier to remember.
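As a small preview, here is a sketch of an array of hashes and its JSON form. The people and their dates are invented for this example, and hashes themselves are explained later in the book; the json library used here ships with Ruby.

```ruby
require 'json'

# An array where every element is a hash (all data below is made up).
people = [
  { 'name' => 'Alice', 'date_of_birth' => '1990-01-31' },
  { 'name' => 'Bob', 'date_of_birth' => '1985-07-04' }
]

# The array part is accessed by index, the hash part by key.
puts people[1]['date_of_birth'] # 1985-07-04

text = JSON.generate(people) # convert to a JSON string
restored = JSON.parse(text)  # and back to arrays and hashes
puts restored[0]['name']     # Alice
```

Notice that the key “date_of_birth” stays valid even if more keys are added to each hash, whereas a numeric column index would shift.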

We’ll cover JSON objects later in this book.

Exercise 1 Define the array outlined above in your REPL and try to read and write the cell with the value “ee”.

Exercise 2 Open the official docs for the Array class at http://ruby-doc.org/core-2.5.1/Array.html to see if you can understand some of the methods explained in the documentation, and maybe learn something new!

Summary of Array class

The array is an essential data structure. Every programmer should know how to query and update arrays effectively. Ruby offers a variety of methods to simplify array operations, such as lookups, updates, counting matches based on criteria, adding, removing, bulk operations over all elements, and so on. Ruby’s standard library is very powerful, predictable, and straightforward; programmers love to use it. We hope you’ll really enjoy manipulating arrays with Ruby. Let’s get started!
