De Bruijn graphs: Comparison with Overlap Graphs

Let’s look into the possible conditions for choosing a specific graph.

De Bruijn vs. overlap graphs

We now have two ways of solving the String Reconstruction Problem. We can either find a Hamiltonian path in the overlap graph or find an Eulerian path in the de Bruijn graph (figure below). Your inner voice may have already started complaining: was it really worth my time to learn two slightly different ways of solving the same problem? After all, we’ve only changed a single word in the statements of the Hamiltonian and Eulerian Path Problems, from finding a path visiting every node exactly once to finding a path visiting every edge exactly once.

STOP and Think: Which graph would you rather work with, the overlap graph or the de Bruijn graph?

Which graph to work with

Our guess is that you would probably prefer working with the de Bruijn graph since it’s smaller. However, this would be the wrong reason to choose one graph over the other. In the case of real assembly problems, both graphs will have millions of nodes, and so all that matters is finding an efficient algorithm for reconstructing the genome. If we can find an efficient algorithm for the Hamiltonian Path Problem, but not for the Eulerian path Problem, then you should select the overlap graph even though it looks more complex.

The choice between these two graphs is the pivotal decision of this chapter. To help you make this decision, in the next lesson, we’ll ask you to hop on board our bioinformatics time machine for a field trip to the 18th Century.

Get hands-on with 1200+ tech skills courses.