5 Minimum Spanning Tree

Minimum Spanning Tree (MST) problem is a global problem since it needs at least $Ω (D)$ rounds in a network of diameter $D$ , i.e., the computation needs the entire graph to be traversed.

The MST is an important and commonly occurring primitive in the design and operation of communication networks. In practice, the weights can be used to model delay, congestion etc. and hence an MST gives a spanning tree that minimizes total delay, congestion etc. One of the most common applications of MST is that can serve a backbone for efficient communication, e.g., it can be used naturally for broadcast.

We assume the clean network model and the synchronous CONGEST model. At the end of the distributed MST algorithm, each node will know which of its incident edges belong to the MST and which do not.

We also assume, for simplicity that all edge weights in the graph are distinct. This constraints the graph from having more than one MST. This assumption is without loss of generality, since one can tag each edge weight with an additional identifier that can be used to break ties.

As in centralized MST algorithms, distributed MST algorithms also rely on two important properties of a MST:

Cut property: A cut in a graph is a partition of the vertex set into two disjoint sets. The cut property states that, given any cut in a graph, the lightest (minimum weight) edge crossing the cut belongs to the MST (due to the assumption of unique edge weights, there is a unique lightest edge crossing the cut.)
Cycle property: Consider any cycle in the graph. The heaviest (maximum weight) edge in the cycle will not be in the MST.

Gallager-Humblet-Spira (GHS) Algorithm:

We are given an undirected, connected, weighted graph $G = (V, E, w)$ . Let $n$ be the number of nodes and $m$ be the number of edges of $G$ . Let $T$ be the MST on $G$ . A MST fragment $F$ of $T$ is defined as a connected subgraph of $T$ i.e., $F$ is a subtree of $T$ .

An outgoing edge of a MST fragment is an edge where one adjacent node to the edge is present in the fragment and the other is not. As an immediate consequence of the cut property, the minimum outgoing edge (MOE) of a fragment $F$ is an edge of the MST.

The GHS algorithm operates in phases. In the first phase, the GHS algorithm starts with each individual node as a fragment by itself and continues till there is only one fragment which is the MST. All fragments find their MOE simultaneously in parallel.

In each phase, the algorithm maintains the following invariant: Each MST fragment has a leader and all nodes know their respective parents and children. The root of the tree will be the leader. Initially, each node is a root node of the fragment, and each fragment is identified by the identifier of the root, called the fragment ID. Each node in the fragment knows its fragment ID.

Description of GHS Algorithm:

The GHS algorithm assigns a level to each fragment, which is a non-decreasing integer with initial value 0. Furthermore, each fragment with a non-zero level has an ID, which is the ID of the core edge/root in the fragment, which is selected when the fragment is constructed. During the execution of the algorithm, each node can classify each of its incident edges into three categories:

Branch edges are those that have been determined to be part of the MST.
Rejected edges are those that have been determined not to be part of the MST.
Basic edges are all edges that are neither branch edges nor rejected edges.

In level-0 fragments, each awakened node will do the following:

Choose its minimum-weight incident edge and mark that edge as a branch edge.
Send a message via the branch edge to notify the node on the other side.
Wait for a message from the other end of the edge.

The edge that is chosen by the two nodes it connects becomes the core edge, and is assigned level 1.

In non-zero-level fragments, a separate algorithm is executed in each level. This algorithm can be separated into three stages: broadcast, convergecast, and change core.

Broadcast:

The two nodes adjacent to the core broadcast messages to the rest of the nodes in the fragment. The messages are sent via the branch edge but not via the core. Each broadcast message contains the ID and level of the fragment. At the end of this stage, each node has received the new fragment ID and level.

Convergecast:

In this stage, all nodes in the fragment cooperate to find the minimum weight outgoing edge of the fragment. The message sent in this stage are opposite direction to that of the broadcast stage. Each node upon receiving all the MOE from its children will find the minimum among them along with its MOE and send the result to its parent.

Change core:

After the completion of the previous stage, the two nodes connected by the core can inform each other of the best edges they received. Then they can identify the minimum outgoing edge from the entire fragment. A message will be sent from the core to the minimum outgoing edge via a path of branch edges. Finally, a message will be sent out via the chosen outgoing edge to request to combine the two fragments that the edge connects. Depending on the levels of those two fragments, one of two combined operations are performed to form a new fragment.

Finding the minimum-weight incident outgoing edge:

As discussed above, every node needs to find its minimum weight outgoing incident edge after the receipt of a broadcast message from the core. If node $n$ receives a broadcast, it will pick its minimum weight basic edge and send a message to the node $n^{'}$ on the other side with its fragment’s ID and level. Then, node $n^{'}$ will decide whether the edge is an outgoing edge and send back a message to notify node $n$ of the result. The decision is made according to the following:

$F_{ID} (n) = F_{ID} (n^{'})$ , Then nodes $n$ and $n^{'}$ belong to the same fragment
$F_{ID} (n) \neq = F_{ID} (n^{'})$ and $L (F) \leq L (F^{'})$ , $F$ and $F^{'}$ are corresponding fragments and $L (F)$ is the level of fragment $F$ , then nodes $n$ and $n^{'}$ belong to different fragments, so the edge is outgoing.
$F_{ID} (n) = F_{ID} (n^{'})$ and $L (F) > L (F^{'})$ , Here we cannot make any conclusion. The reason is that two nodes may belong to the same fragment already, but $n^{'}$ has not discovered this fact yet due to the delay of a broadcast message . In this case, the algorithm lets node $n^{'}$ postpone the response until its level becomes higher than or equal to the level it received from node $n$ .

Combining two fragments:

Let $F$ and $F^{'}$ be the two fragments that need to be combined.

Merge: If both $F$ and $F^{'}$ share a common MOE and $L (F) = L (F^{'})$ then combine. The level of the combined fragment will be $L (F) + 1$ .
Absorb: If $L (F) < L (F^{'})$ then combine. The combined fragment will have its level as $L (F^{'})$ .
Wait: Wait till the above rules apply.

Pseudocode:

Variables:

state : sleep , find , found
- sleep - The node is not initialized
- find - The node is currently helping its fragment search for MOE
- found - MOE for the fragment has been found
status $[q]$ : basic , branch , reject
- basic - Edge is not yet determined
- branch - Edge is a part of the MST
- reject - Edge is not a part of the MST
name - Name of the fragment
level - Level of the fragment
parent - Points towards the combining edge
bestWt , bestNode , rec , testNode - temporary variables

Current node $p$ , Neighbor is $q$

Initialization:

1    pq is the least weight 
2    status[q] = branch 
3    level = 0 
4    state = found 
5    rec = 0 
6    send <connect , 0> to q

Connecting Message:

1     Receive <connect,L> from q:
2     if L < level then:
3 	      // Combine directly with root of p as the combined root
4 	      status[q] = branch
5	      send <initiate , level , name , state> to q
6
7     else if status[q] = basic then 
8	      wait
9
10    else 
11	    // Combine but increase the rank/level by 1
12	    send <initiate , level + 1 , pq , find> to q

Initiate Message:

1    Receive <initate , level' , name' , state'> from q
2	level = level'
3	state = state'
4	name = name'
5
6	bestNode = {} 
7	bestWt = INF
8	testNode = {}
9
10	for each r in adj[p]: // adj[p] - adjacency list of p
11		if (status[r] = branch && (r !=q)) then:
12			send <initiate , level' , name' , state'> to r
13
14	if state = find then:
15		rec = 0 
16		findMin()

findMin() Function:

1	if there exists a q in adj[p] such that status[q] = basic and weight of pq is minimal then:
2		testNode = q
3		send <test , level , name> to testNode
4
5	else
6		testNode = {}
7		report()

Test message:

1	Receive <test , level' , name'> from q
2
3	if level' > level then
4		wait
5	
6	if name = name' then
7		if status[q] = basic then
8			status[q] = reject
9
10		if q != testNode then
11			send <reject> to q
12
13		else
14			findMin()
15
16	else
17		send <accept> to q

Accept and Reject messages:

1	Receive <accept> from q:
2	testNode = {}
3	if weight of pq < bestWt then
4		bestWt = weight of pq
5		bestNode = q
6	report()
7	
8	Receive <reject> from q:
9	if status[q] = basic then:
10		status[q] = reject
11	
12	findMin()

report() Function:

1	report:
2	if((rec == Number such that status[q] == branch && q != parent) && testNode = {}):
3		state = found
4		send <report , bestWt> to parent

Report Message:

1	Receive <report , w> from q:
2	if q != parent then:
3		if w < bestWt then
4			bestWt = w 
5			bestNode = q
6		rec = rec + 1 
7		report()
8	
9	else:
10		if state == find:
11			wait
12	
13		else if w > bestWt then:
14			changeRoot()
15	
16		else if w = bestWt = INF:
17			stop

changeRoot() Function:

1	changeRoot():
2	if status[bestNode] = branch then 
3		send changeroot to bestNode
4	
5	else
6		status[bestNode] = branch
7		send <connect, level> to bestNode
8	
9	Receive changeroot:
10	changeRoot()

Message Complexity:

Every node is rejected only once, one test message and one reject message: Total $2∣ E ∣$ messages.

At every level, a node sends/receives at most:

receives - 1 initiate message , 1 accept message
sends - 1 report message , 1 changeroot/connect message , 1 successful test message

Since each node can be at at most $lo g n$ levels (Every level doubles the fragment size), therefore the message complexity is $2∣ E ∣ + 5 n lo g n$ .

Indefinite Tree

Explorer