Spanning Tree is a tree that has all vertices from an original graph and has a minimum number of edges.
That is, a spanning tree has $n$ vertices and $n-1$ edges if the initial graph has $n$ vertices.
The features of the spanning tree are below:
- All vertices are included and there is no cycle. (Because it is a tree!)
- It has exactly $n-1$ edges. (Because it has a minimum number of edges!)
- Multiple spanning tree can exist from one graph.
We can say that a spanning tree can be made by selecting edges which are visited while searching with DFS or BFS.
This is because we set these searching algorithm not to visit the spot which we already visited.
Minimum Spanning Tree is a spanning tree whose sum of edges’ weight is minimum.
This can be applied to various real situations such as the network deployment, the road construction, etc. to find the case with minimum cost/time.
There are two major algorithms to find MST, which are Kruskal’s Algorithm and Prim’s Algorithm.
Both are based on the greedy method.
We’re going to see the details of each algorithm.
First, the Greedy Method is an algorithm methodology which selects an optimal option every moment and finds the final optimal solution in the end.
To use this, the fact that this algorithm with the greedy method eventually gives the desired output even if we choose an option only based on each specific moment must be proved.
Luckily, Kruskal’s algorithm and Prim’s algorithm are already proved to be true, so we can use them.
And there are some cases that we should check if a selection makes a cycle when we implement algorithms.
We use Union-Find Algorithm for this.
With it, we can determine whether a vertex we are going to choose is included with other vertices that are already in MST now.
This finding function takes $O(V)$ if it is the worst case, but by compacting the tree while executing this function we can reduce the time complexity with the amortized analysis.
We cannot say that it is perfect $O(1)$, but at least the running time can be reduced to a substantially small value within the system of numbers we’re dealing with.
I’ll post about this algorithm later.
Kruskal’s algorithm
- Sort all edges in ascending order by weight.
- Select an edge in order while checking this selection forms a cycle.
- Put a selected edge to the result MST set.
- Repeat from 2 until MST is completed.
This is an image as an example.
If $E$ is the number of edges, the sorting process takes $O(E\log{E})$.
As I mentioned before, checking cycle takes very small amount ot time and we assume that it does not affect the overall running time.
Running through all edges takes $O(E)$, so the running time of this algorithm is $O(E\log{E})$.
Prim’s algorithm
- Choose an arbitrary starting point and put all edges into the edge set which departs from this vertex. (MST should contain all vertices so choosing any vertex does not change the result.)
- Move to the next vertex by selecting the smallest edge in the set. Of course, there should not be a cycle.
- After reaching a new vertex, include new edges in the set which are connected to that new vertex.
- Repeat from 2 until we have a MST.
This is an image as an example.
When finding the smallest edge in the set, it takes $O(\log{E})$ if we use a priority queue.
And we repeat that until all vertices are connected, so the iterations are executed $V$ times.
The running time of this algorithm is $O(V\log{E})$.
For wrapping up, each time complexity of algorithm is below: if $V$ is the number of vertices and $E$ is the number of edges,
- Kruskal: $O(E\log{E})$
- Prim: $O(V\log{E})$
So we can think that if there are many edges, which is a dense graph, Prim’s algorithm is more helpful and in the opposite situation, Kruskal’s algorithm is more efficient.