Open Problem: Computing Probabilities of Patterns in a String

Learn how we can compute the probability that a pattern appears in a random string.

Overlapping words paradox

In the main text, we told you that the probability that a random DNA string of length 500 contains a 9-mer appearing three or more times is approximately 1/1300. In DETOUR: Probabilities of Patterns in a String, we describe a method to estimate this probability, but it’s rather inaccurate. This open problem is aimed at finding better approximations or even deriving exact formulas for probabilities of patterns in strings.
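One way to sanity-check any approximation or exact formula is to compare it against a simulation. The sketch below is a simple Monte Carlo estimate, not the method from the detour. It assumes that a "random DNA string" means each nucleotide is drawn independently and uniformly from A, C, G, and T, and that overlapping occurrences of a k-mer are counted. The function names `has_frequent_kmer` and `estimate_probability` are made up for this illustration.

```python
import random
from collections import Counter


def has_frequent_kmer(text, k, t):
    """Return True if some k-mer occurs at least t times in text.

    Overlapping occurrences are counted (an assumption of this sketch).
    """
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    return any(c >= t for c in counts.values())


def estimate_probability(n, k, t, trials, seed=0):
    """Monte Carlo estimate of the probability that a random DNA string
    of length n contains a k-mer appearing t or more times.

    Assumes independent, uniformly distributed nucleotides.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    hits = 0
    for _ in range(trials):
        text = "".join(rng.choice("ACGT") for _ in range(n))
        if has_frequent_kmer(text, k, t):
            hits += 1
    return hits / trials
```

Calling `estimate_probability(500, 9, 3, trials)` targets the case discussed above. Because the event is rare, a stable estimate would likely need several hundred thousand trials or more, and even then it only gives a statistical check on a formula rather than replacing one.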
