A disjoint set is a forest data structure, in which there might be multiple trees. A disjoint set contains a set of elements that can be partitioned into subsets. A subset is one tree. There is one element at the root position which is a representative of this subset. Other elements are children of this root. So one subset can be seen as a nary tree. Multiple subsets form a disjoint set.

A disjoint set uses union and find methods to construct. So the data structure is also called union-find. Multiple subsets can be merged into a bigger subset when you call a union method to link two elements in two subsets. A find method is used to find the root of the subset.

A disjoint set is usually used as an alterative solution for graph problems, such as finding suggested friends in social media or minimum spanning tree of a graph.

### Build a disjoint set

A disjoint set consists of elements. Each element has a value and a parent pointer to its parent element. An element is also associated with a rank, which identifies the element’s position in a set. So in a DisjointSet class, there are two attributes, parent and rank. Both are maps, in which the keys are elements. From the element, you can find its parent and rank.

makeSet() is a method to initialize parent and rank from a list of elements. Initially, each element can be seen as a subset. Its parent is itself as well. All element’s rank is zero.

## Python

### Find

Find method is to find the element’s root. It follows the chain of parent pointers up the tree until it reaches the root. Save this root as the value in parent. Multiple elements can share the same root. This root element is the representative member of the set to which elements belong.

Path compression is a technique that uses recursions to flatten the structure of the tree by making element’s parent pointing to the root. The root is also part of the set. The flattened tree reduces the step to find from element to its root. The time complexity is O(α(n)), where α(n) is the extremely slow-growing inverse Ackermann function. This is better than O(logn). The space complexity is O(α(n)) as well because it uses recursion for function’s call stack.

## Python

### Union

Union method is to make two elements into the same set. It uses find() method to find the roots of both elements. If they belong to different sets, the sets are combined by attaching one of the root to the other one as a child.

Union by rank is another technique to keep the height of the tree as small as possible. Union by rank attaches the shorter tree to the root of the taller tree. The element’s rank is 0 initially. If two sets are combined and have the same rank, the resulting set’s rank increases by 1. If two sets have different ranks, the resulting set’s rank is the larger one of the two. The time complexity and space complexity is O(α(n)).

## Python

### Print

Print is a utility method to print out elements in each subsets. Each subset has a root (or parent) and all elements in this set. So the disjoint set is printed out like a hash map.