Charging Station: Solving Frequent Words with Mismatches Problem

This lesson uses some notation from Charging Station: The Frequency Array.

To prevent having to generate all 4k^{k} k-mers in order to solve the Frequent Words with Mismatches Problem, our goal is to consider only those k-mers that are close to a k-mer in Text, i.e., those with Hamming distance at most d from this k-mer. Given a k-mer Pattern, we, therefore, define its d-neighborhood Neighbors(Pattern, d) as the set of all k-mers that are close to Pattern. For example, Neighbors(ACG, 1) consists of ten 3-mers:

 \space \space \space \space \spaceACG  \space \space \space \space \space CCG  \space \space \space \space \spaceGCG  \space \space \space \space \spaceTCG  \space \space \space \space \spaceAAG  \space \space \space \space \spaceAGG  \space \space \space \space \spaceATG  \space \space \space \space \spaceACA  \space \space \space \space \spaceACC  \space \space \space \space \spaceACT

Exercise Break: Estimate the size of Neighbors(Pattern, d).

Frequency array with up to d mismatches

Get hands-on with 1200+ tech skills courses.