Minimum spanning tree

September 16, 2023·10 minutes of reading

A graph is a connected set

Developers use sets to uniquely store elements and avoid wasting storage and processing time. In mathematics a set is a collection of items that share the same type and are uniquely identifiable. For example R is the set of all real numbers.

Sets are of course not limited to primitive types. There can be a set of cities, planets, universe or any other type. The contract for defining a custom set is to uniquely distinguish any element in the set from rest. In Java and also Kotlin, this contract is defined as the implementation of the equals and hashCode methods. If you are Kotlin developer and using data classes then you don't have to explicitly define the aforementioned methods because they are autogenerated by the compiler.

Below is an example of a data class:

data class City(val id: Int, val name: String, val countryCode: String, latitude: Float, longitude: Float)

Getting back to the main topic of discussion, given a undirected graph made of a set of City. Each edge (or road) has an associated weight that measures how damaged the road is. As as a developer working on the software for managing cities maps, you are required to develop a feature for removing the most damaged roads from a given city. Your inputs are a list of Road(src, dest, damage) and the number of cities, and the output is the new list of roads.

fun removeDamagedRoads(roads: List<Road>, n: Int): List<Road> {
    // Implementation goes here
}

Solving the problem

There are two important sub-problems to solve:

Higher-weight roads must be removed if there is a lower weight path connecting the two cities.
Connected cities must be grouped together into subsets.

Solving the first sub-problem is easy. We sort the roads in an ascending order so we always process the least damaged roads first:

fun removeDamagedRoads(roads: List<Road>, n: Int): List<Road> {
    val sortedRoads = roads.sortBy(Road::damage)
    ...
}

As for the second point, the Disjoint-set data structure does a very good job of grouping sets.

A disjoint-set is similar to a linked list in the sense that it links its underlying elements together, except that the child-parent relationship is modelled via an identifier instead of an object reference for linked lists.

There are three operations in a Disjoint-set: create, find and union. For the latter we will use a union-by-rank approach.

class DisjointSet {

    companion object {
        const val NODE_NOT_FOUND = -1
        const val INITIAL_RANK = 0
    }

    private val parents = mutableMapOf<Int, Int>() // A mapping of nodeId to parentId
    private val ranks = mutableMapOf<Int, Int>() // A mapping of nodeId to rank (the higher the rank, the closer the node is to the root)

    fun create(nodeId: Int) {
        if (nodeId !in parents) {
            parents[nodeId] = nodeId // Initially each node is a parent of itself
            ranks[nodeId] = INITIAL_RANK
        }
    }

    fun find(nodeId: Int): Int {
        return when (nodeId) {
            !in parents -> PARENT_NOT_FOUND
            parents[nodeId] -> nodeId
            else -> find(parents[nodeId]!!)
        }
    }

    func union(node1Id: Int, node2Id: Int): Boolean {
        var root1 = find(node1Id)
        var root2 = find(node2Id)
        if (root1 != NODE_NOT_FOUND && root1 != root2) {
            var rank1 = ranks[root1] ?: INITIAL_RANK
            var rank2 = ranks[root2] ?: INITIAL_RANK
            if (rank2 > rank1) {
                // root1 should always be the highest ranking node
                root2 = root1.apply { root1 = root2 }
                rank2 = rank1.apply { rank1 = rank2 }
            }
            // Make root1 parent of root2
            parents[root2] = root1
            if (rank1 == rank2) {
                ranks[root1] = rank1 + 1
            }
            return true
        }
        return false
    }
}

The find and union operations both require O(N) time and space complexities since the former uses recursion and the call-stack consumes space and the latter is dependent upon the former.

We can now go back to our main function and update it as follows:

fun removeDamagedRoads(roads: List<Road>, n: Int): List<Road> {
    val sortedRoads = roads.sortBy(Road::damage)
    val disjointSet = DisjointSet()
    for (i in 0 until n) {
        disjointSet.create(i)
    }
    val updatedRoads = mutableListOf<Road>()
    for (road in updatedRoads) {
        if (disjointSet.union(road.src, road.dest)) {
            // The two cities were united under the same set so their road must be kept
            updatedRoads.add(road)
        }
    }
    return updatedRoads
}

The above function is in fact an implementation of the Kruskal's algorithm for finding the minimum spanning tree.

The core of the Kruskal's algorithm is the disjoint-set data structure which is a often used to partition sets.

That's it for this tutorial, I hope you learned something new. Let me know your thoughts, and whether you know a more efficient solution to the problem that we discussed.

Thanks for reading!