Peculiar Statistics of the Forward and Reverse Half-Strands

Gain insight into how nucleotide frequencies differ between the reverse and forward half-strands.

Frequency switch and the ori location

The figure below (upper left) reveals a surprising pattern. We have partitioned the E. coli genome into 46 equally sized fragments of approximately 100,000 nucleotides each, starting at the experimentally verified terminus of replication (ter), and then computed the frequency of cytosine in each fragment. The first 23 fragments (starting from ter) represent the reverse half-strand, and the last 23 fragments (starting at ori) represent the forward half-strand (see the figure in the Unidirectionality from DNA polymerase section). Most fragments on the reverse half-strand have a high cytosine frequency (above 25%), whereas most fragments on the forward half-strand have a low cytosine frequency (below 25%). In contrast, as the figure below (upper right) illustrates, most fragments on the reverse half-strand have a low guanine frequency (below 25%), whereas most fragments on the forward half-strand have a high guanine frequency (above 25%).
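The per-fragment computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the figure: the function name `window_frequencies` and the toy sequence are assumptions made for this example, and any trailing nucleotides that do not fill a whole fragment are simply ignored.

```python
def window_frequencies(genome, nucleotide, num_windows):
    """Return the frequency of `nucleotide` in each of `num_windows` equal fragments."""
    size = len(genome) // num_windows
    if size == 0:
        raise ValueError("genome is shorter than the number of fragments")
    freqs = []
    for i in range(num_windows):
        window = genome[i * size:(i + 1) * size]
        freqs.append(window.count(nucleotide) / size)
    return freqs


# Toy sequence (hypothetical, not E. coli data):
# the first half is rich in C, the second half is rich in G.
toy = "CCCGACCCTA" * 2 + "GGGCAGGGTA" * 2

c_freqs = window_frequencies(toy, "C", 4)
g_freqs = window_frequencies(toy, "G", 4)
print(c_freqs)
print(g_freqs)
```

For the real genome, the same call would use 46 fragments and a sequence that starts at ter.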

The figure given below (bottom left) shows the difference in frequencies of G and C in each genome fragment and presents an even more striking visualization of the peculiar statistics of nucleotide frequencies on the reverse and forward half-strands. Even if we assume that we don’t know the location of ori in advance, the pattern still presents itself when starting at an arbitrary position of the E. coli genome (bottom right).
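To see why the starting position does not matter, here is a small sketch that computes the per-fragment difference between the G and C frequencies after rotating a circular sequence to an arbitrary start. The function name `gc_difference_by_window`, the `start` parameter, and the toy sequence are all assumptions made for illustration.

```python
def gc_difference_by_window(genome, num_windows, start=0):
    """Per-fragment (G frequency - C frequency), reading a circular genome from `start`."""
    # Rotate the sequence so that position `start` comes first.
    rotated = genome[start:] + genome[:start]
    size = len(rotated) // num_windows
    if size == 0:
        raise ValueError("genome is shorter than the number of fragments")
    diffs = []
    for i in range(num_windows):
        window = rotated[i * size:(i + 1) * size]
        diffs.append((window.count("G") - window.count("C")) / size)
    return diffs


# Toy sequence (hypothetical): C-rich first half, G-rich second half.
toy = "CCCGACCCTA" * 2 + "GGGCAGGGTA" * 2

from_known_start = gc_difference_by_window(toy, 4)
from_arbitrary_start = gc_difference_by_window(toy, 4, start=10)
print(from_known_start)
print(from_arbitrary_start)
```

In this toy case, changing the start only shifts where the run of negative values ends and the run of positive values begins. The two runs themselves are still there.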

If the pattern that we’ve found in the figure below isn’t a statistical fluke, then we’ve uncovered a hint about how to find ori: we can simply walk along the genome and check where the difference between the frequencies of guanine and cytosine switches from negative to positive! But why in the world would such a simple test allow us to find the replication origin of a bacterium?
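The walk along the genome can be sketched as a scan over per-fragment G minus C differences, reporting every fragment where the sign changes from negative to positive. The function name `negative_to_positive_switches` and the input values are hypothetical. The sketch assumes a circular genome, so the first fragment is compared with the last. Because the text says only that most fragments follow the pattern, real data may well produce several candidate switches rather than one.

```python
def negative_to_positive_switches(diffs):
    """Return indices i where diffs[i - 1] < 0 and diffs[i] > 0 (circular)."""
    switches = []
    for i in range(len(diffs)):
        previous = diffs[i - 1]  # for i == 0 this wraps around to the last fragment
        if previous < 0 and diffs[i] > 0:
            switches.append(i)
    return switches


# Hypothetical per-fragment differences (G frequency - C frequency).
diffs_from_ter = [-0.5, -0.5, 0.5, 0.5]
diffs_from_elsewhere = [-0.5, 0.5, 0.5, -0.5]

print(negative_to_positive_switches(diffs_from_ter))
print(negative_to_positive_switches(diffs_from_elsewhere))
```

Each reported index marks the fragment boundary where a candidate ori would lie under this test. Fragments whose difference is exactly zero are not counted as a switch in this sketch.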
