Data Structures | Disjoint-Set(Union Find)
In this article we will talk about Union Find.
Let’s imagine that we have set of N elements, and some elements are connected. Elements that are connected produce some subset. Your question is to find number of distinct subsets or to find set with maximum number of members.
Case #1
There are n cities and k connections. Connected cities(directly or indirectly) declares province. Find total number of province.
Constraints:
1 | 1 <= n <= 10000 |
1 | Example #1 |
From example above you can see that provinces are:
- 1, 2, 3, 5, 6
- 4
- 7, 8
- 9, 10
In total we have 4 provinces.
Efficient approach to solve this problem is to use Union Find DS. The main idea is
Create unique ids array for each element in set. It means if we have 10 elements in set, unique ids are nums[1,2,3,4,5,6,7,8,9,10].
Define root element. Root element is an element where nums[i] = i. Initially all elements are roots. We have 10 roots. If we have nums = [1, 1, 1, 1, 5, 5, 5, 5, 9, 10], it means root elements are 1, 5, 9, 10, moreover if we have nums = [1, 1, 1, 1, 1, 1, 5, 5, 9, 10], then root elements are 1, 9, 10 (because nums[5] =1)
Implement “find” function. “Find” function searches root of element.
1
2
3
4
5
6public static int find(int[] nums, int i) {
while(nums[i] != i) {
i = nums[i];
}
return i;
}Implement union function. Union function connect to elements. How do we do this? Find root of first element(root1), find root of second element(root2). Setup nums[root1] = root2
1 | public static void union(int[] nums, int i, int j) { |
Let’s define final algorithm for the question:
- Create unique ids array(nums)
- Iterate through connection array and union cities
- Find total number of roots in nums.
Solution:
1 | package com.company; |
Case #2
Given an unsorted array of integers arr, find the length of longest consucutive elements sequence.
1 | Example #1 |
If you think about this problem in Union Find way, you can get that all consecutive elements sequece define some subset(connected component) and all we need to do is to find the biggest one. In general we need to iterate through array and try to union elements (val, val + 1) and (val, val - 1) if they given. It means we need to know whether there exists values (val + 1) or (val - 1) in order to retrive their indexes. This can be done via projecting values to indexes. Map would be effecient way. What happens if you have duplicates? Actually it doesn’t matter, because we store only the last occurence of value in map and we need keep in mind to process indexes that are exists in map. How we can get maximum size set? We can create another array counter, that will keep track size of each root’s set.
Solution:
1 | package com.company; |
Both problems can be found in leetcode.com.