Say we have a graph $G = (V,E)$, that is a set of vertices and a set of edges. Each edge have a weight-function, $w$, associated to it, and thus we can obtain the total weight of $E$ by $$ w(E) = \sum_{u,v \in E} w(u,v) $$ OK. We want to find the minimum weight of edges that span $G$. Denote this set by $T$, for which we have $T \subset E$. We want $T$ to be acyclic and fully connected, and hence $T$ forms a tree.

We introduce the following terminology. Let $T$ be MST, and let $A \subset T$.

__Safe edge__: For some edge $(u,v)$, if $A \cup (u,v) \subset T$, then $(u,v)$ is a safe edge.__Cut__: A cut, $(S, V - S)$ of $G = (V,E)$ is a partition of $V$ into two disjoint sets.__Crosses__: An edge $(u,v)$ crosses a cut if one of its endpoints is in $S$ and the other is in $V - S$.__Respects__: A cut respects a set $A$ of edges if no edge in $A$ crosses the cut.__Light edge__: An edge $(u,v)$ crossing a cut is a light edge if its weight is a minimum of all edges crossing that cut.

Let $G = (V,E)$ be some undirected graph with a weight function $w$ defined on the set of edges $E$. Let $A$ be a subset of $E$ that is included in some MST for $G$. Let $(S, V - S)$ be a cut of $G$ that respects $A$. Let $(u,v)$ be a light edge crossing that cut. Then edge $(u,v)$ is safe for $A$.

- Let $T$ be a MST that includes $A$
- Let $T$ not contain the light edge $(u,v)$ - else wise we are done.
- Let $(x,y)$ be an edge that crosses the cut $(S, V - S)$ and for which we have $(x,y) \in T$. If we add $(u,v)$ to $T$, $(u,v)$ and $(x,y)$ would form a cycle.
- Remove $(x,y)$ from $T$ thus dividing $T$ into 2 disjoint components.
- Add $(u,v)$ to $T$ thus joining $T$ into a single unit again. Call this new spanning tree for $T'$, that is $T' = T - (x,y) \cup (u,v)$.
__Proof that T' is MST__- Since $(u,v)$ is a light edge we have by def. that $w(u,v) \leq w(x,y)$ and hence $$ w(T') = w(T) - w(x,y) + w(u,v) \leq w(T) $$
- But since $T$ is MST, we also have that $w(T) \leq w(T')$ and thus $w(T) = w(T')$. $T'$ must be a MST also.
__Proof that (u,v) is safe for A__- We have that $A \subset T'$ since $A \subset T$ and $(x,y) \not \in A$. The last part follows since the cut $(S, V - S)$, which $(x,y)$ crosses, is assumed to respect $A$.
- Now $A \cup (u,v) \subset T'$.
- Thus since $T'$ is MST, we can safely add $(u,v)$ to $A$ and keep the property of MST.

From theorem 23.1 follow that we can add light edges to a MST and obtain the property of MST.

The alg. searches the edges that connect any two trees in the forest for the one with least weight. Since we by the above know that the such an edge is safe, the alg. is correct. This is a greedy approach. For code we get

That is if two edges belong to the same set, the result of unioning them will result in a cycle. If they do not, we union them. Before the iteration we sort the edges by weight in non-descending order. The one with least weight goes first.

__Running time__ of Kruskal can be obtained as follows. Implement the disjoint-set datastructure as disjoint-set-forest with union-by-rank and path compression heuristics. Now we have

- Initialize the set $A$ takes $O(1)$ time.
- Sorting the edges takes $O(E lg E)$.
- The last for loop performs $O(E)$ Find-Set operations. Along with the $|V|$ Make-Set this amounts to $O((V + E) \alpha(V))$ - $\alpha$ here is a very slowly growing function.
- Since $G$ is assumed connected, we have $|E| \geq |V| - 1$, and hence the above operations take $O(E \alpha(V))$ time.
- Now since $\alpha(|V|) = O(lg V) = O(lg E)$, the running time is $O(E lg E)$.
- Since $|E| < |V|^2$, we have $lg |E| = O(lg V)$ and thus the running time can be restated as $O(E lg V)$.

This alg is in many ways like Dijkstras for finding shortest path. They both use a min-priority-heap. The alg adds light edges to $A$ starting from an arbitrary root.

__Running Time__. If we use a min-priority-heap, we can use Build-Min-Heap procedure to perform lines 1-5 in $O(V)$ time. Furthermore we get

- The body of the while loop executes $|V|$ times, and the Extract-Min takes $O(lg V)$. In total that is $O(V lg V)$
- The for loop is $O(E)$ times, since the length of all adjacent lists is $2|E|$.
- The last line assignment involves a Decrease-Key, that is $O(lg V)$.
- In total we now get $O(V lg V + E lg V) = O(E lg V)$. The same as for Kruskal.

By using Fibonacci heaps we can improve the running time to $O(E + V lg V)$.

CommentsGuest Name:Comment: