In this article we talk abount KMP algorithm
Let’s define a statement:
Find all occurrence of pattern in string.
By using naive approach we get answer of O(m * n)
complexity, where n
- length of text and m
- length of pattern. However applying KMP algorithm, this could be done of O(m + n)
asymptotic complexity and O(m)
space complexity.
Algorithm:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 import java.util.Arrays;class Kmp { public static void main (String[] args) { String text = "ABDAABAABCCDVCABCADEEQAAA" ; String pattern = "ABCA" ; search(text, pattern); } static void search (String text, String pattern) { System.out.format("Search [%s] in [%s]%n" , pattern, text); int [] lps = computeTable(pattern); int textIndex = 0 , textLen = text.length(); int patternIndex = 0 , patternLen = pattern.length(); while (textIndex < textLen) { if (text.charAt(textIndex) == pattern.charAt(patternIndex)) { ++textIndex; ++patternIndex; } if (patternIndex == patternLen) { System.out.format("Find match at index %s%n" ,textIndex - patternLen); patternIndex = lps[patternIndex - 1 ]; } else if (textIndex < textLen && pattern.charAt(patternIndex) != text.charAt(textIndex)) { if (patternIndex != 0 ) { patternIndex = lps[patternIndex - 1 ]; } else { ++textIndex; } } } } static int [] computeTable(String pattern) { int m = pattern.length(); int [] lps = new int [m]; int len = 0 ; int index = 1 ; while (index < m) { if (pattern.charAt(index) == pattern.charAt(len)) { lps[index++] = ++len; } else { if (len != 0 ) { len = lps[len - 1 ]; } else { ++index; } } } return lps; } }
Main idea is next:
Pre compute longest proper prefix which is also suffix(lps
)
Define textIndex, patternIndex
as iteration values in cycle
Iterate in text and do comparison between corresponding indexes of text and pattern:
If match case, increment both indexes of text and pattern
In case when index of pattern equals to lenth of pattern, it means we finded a match
In another case decrement corresponding index of pattern until we find a match or it will become a zero. In zero case increment text index