Charging Station: The Frequency Array
Learn how we can make the FREQUENTWORDS algorithm work faster, transform patterns, and different algorithms for frequent words problem.
To make FrequentWords faster, we’ll think about why this algorithm is slow in the first place. It slides a window of length k down Text, identifying a k-mer Pattern of Text at each step. For each such k-mer, it must slide a window down the entire length of Text in order to compute PatternCount(Text, Pattern). Instead of doing all this sliding, we aspire to slide a window down Text only once. As we slide this window, we’ll keep track of the number of times that each k-mer Pattern has already appeared in Text, updating these numbers as we proceed.
Ordering 4 k-mers
To achieve this goal, we’ll first order all 4 k-mers lexicographically (i.e., according to how they would appear in the dictionary) and then convert them into the 4 different integers between 0 and 4. Given an integer k, we define the frequency array of a string Text as an array of length 4, where the i-th element of the array holds the number of times that the i-th k-mer (in lexicographic order) appears in Text.
Get hands-on with 1200+ tech skills courses.