previous section previous page next page next section
CMB

Online Lectures on Bioinformatics

navigation


Variants of the dynamic programming algorithm


Suboptimal Alignments

As an example for the different alignment variants consider the comparison of estrogen receptor and some other receptor.

  • Dot plot
  • global alignment
  • smith waterman alginment
Compared to the dot plot the local alignment is missing a region. What is required is to find other local alignments. This is achieved by the Waterman-Eggert algorithm which computes suboptimal, local, and non-overlapping alignments. It starts with the application of the Smith Waterman algorithm. After having computed the (best) local alignment from the dynamic programming matrix the cells along this alignment and all cells from which backtracking would lead onto this alignment are eliminated. This can be seen as "resetting" the dynamic programming matrix after having deleted the first alignment. Then the second best alignment is identified. The resetting procedure is repeated and so on. Iteration of this procedure yields one alternative, non-overlapping alignement after the other in order of descending quality. Suboptimal alignments are discussed more detailed in the previous section.

Alignment Quiz: 4.
Alignment Quiz: 4.


There is an interesting interplay between parameters, in particular the gap penalty, and the algorithmic variant used. Consider a pair of sequences whose similar regions can in principle be strung together into an alignment (as opposed to sequences containing repeats which cannot all be seen in a single alignment). Under a weak gap penalty the Smith-Waterman algorithm has a chance to identify this entire alignment. On the other hand, not knowing about the similarity between the sequences ahead of time, a weak gap penalty might also cause all kinds of spurious aligned regions. The Waterman-Eggert algorithm is a valid alternative. The gap penalty can be chosen fairly stringently. The first (i.e., the Smith-Waterman) alignment will then identify only the best matching region out of all the similar regions. By iterating the procedure, though, this algorithm will successively identify the other similar regions as well.


Comments are very welcome.
luz@molgen.mpg.de