# Minimum Spanning Trees

A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. That is, it is a spanning tree whose sum of edge weights is as small as possible. One example use-case is a telecommunications company trying to lay cable in a new neighborhood. If it is constrained to bury the cable only along certain paths (e.g. roads), then there would be a graph containing the points (e.g. houses) connected by those paths. Some of the paths might be more expensive, because they are longer, or require the cable to be buried deeper; these paths would be represented by edges with larger weights.

A spanning tree for an undirected graph G = (V, E) is a graph G’ = (V, E’) such that G’ is a tree.

In other words, G’ has the same set of vertices as G, but edges have been removed from E so that the resulting graph is a tree. This amounts to saying that G’ is acyclic. It is called a spanning tree because every node is connected to every other node, so the tree spans all of the nodes. Removal of any single edge from a spanning tree causes the graph to be unconnected. A spanning tree is minimum if there is no other spanning tree with smaller cost. If the graph is unweighted, then the cost is just the number of edges. If it has weighted edges, then the cost is the sum of the edge weights of the edges in the spanning tree. See Figure 1 below for an example.

An example of an application of spanning trees is for finding the least length of wiring that can be used to wire the electrical connections in a building.

The problem of finding a minimum spanning tree amounts to finding the set of edges of which this tree is composed. After all, the spanning tree is the same vertex set as the graph, but a subset of the edges. So the idea is to design an algorithm that selects which edges belong in the tree. A relatively simple algorithm for finding a minimum spanning tree is Kruskal’s Algorithm. It is a greedy algorithm in that it always tries to find the best solution at each step. Not all greedy approaches work. Here it does.

### Kruskal’s Algorithm

The algorithm starts out with a forest in which each vertex is a tree by itself. Then it looks for the edge with least weight and connects its two vertices into a tree with two vertices. If there is more than one such edge, any of them can be chosen. It then looks for another edge that is least weight among those that remain and connects the two vertices together if they are not already part of the same tree. If they are in the same tree, then adding this edge would form a cycle, so the edge is not added. The algorithm continues this procedure until there is just one tree, i.e., every vertex is in a single tree. This tree will be minimum cost because the edges were added in order of ascending cost.

The algorithm represents the trees as sets of vertices. To form a new tree from two existing trees, a union operation is applied to the two sets that represent these trees. When given an edge (v,w) in the graph, the algorithm uses the find operation to find in which sets v and w each belong.

Therefore, a set data type is used to represent the sets of vertices. A heap is used for picking the minimum weight edge at each step. The algorithm is presented in pseudocode in the following listing.

void kruskal() {
Vertex u, v; // Vertex is the type of vertex labels
Edge e; // Edge is a struct containing (source_vertex, target_vertex, weight)
DisjointSet s(NUM_VERTICES); // the set of trees
PriorityQueue < Edge > h(NUM_EDGES); // the edges , with smallest weight having highest priority
SetType uset, vset; // SetType is the range of indices 0..NUM_VERTICES
list < Edge > edgelist; // list of edges in the minimum spanning tree

readGraphIntoHeapArray(h); // read the edge set into the array that gets heapified
h.buildHeap(); // now heapify based on edge weights

int edgesAccepted = 0;
while (edgesAccepted < NUM_VERTICES - 1) {
e = h.deleteMin(); // assume e = (u , v )
uset = s.find(u);
vset = s.find(v);
if (uset != vset) {
// add the edge to the list of edges in the tree and increment
count
edgelist.pushback(e);
edgesAccepted++;
s.union(uset, vset);
}
}
}


#### Performance Analysis

The algorithm takes O(|E|) steps to build the heap. Since the while-loop iterates in the worst case once for each edge in the graph, it can take O(|E|) deleteMin, find, and union operations to build the minimum spanning tree. Each deleteMin takes log |E| steps, so this is O(|E| log |E|) steps. The find and union operations take less time than the deleteMin, so they do not increase the amortized running time. Since |E| = O(|$$V^{2}$$|) and log(|$$V^{2}$$|) ∈ O(log |V|), this is O(|E| log |V|).