Online Lectures on Bioinformatics
|
Algorithms for the comparison of two sequencesType II alignmentThe Needleman-Wunsch algorithm was subsequently modified by several authors, especially Sankoff [San72] and Sellers [Sel74]. Two distinctions were made then. Algorithms were formulated for type II alignments instead of type I and for distance measures (looking for minimal alignments) instead of similarity measures (looking for maximal solutions). The problem of similarity and distance will be discussed later. First the algorithm for type II alignments is discussed.
In the definition of the type II alignment the index 0 is used to represent the gap character.
This is not to be confused with an index of the point (0,0) which we used in the type I graph.
and
with
In the type II graph the vertices can again be seen as grid points (i,j), but The remaining procedure is just as before. A path in a type II graph corresponds to a type II alignment, the scores for the edges are now just the W(i,j), i.e. a penalty g for a vertical or horizontal arc and w(si,tj) for a diagonal one matching two residues. If end-gaps should not be punished the horizontal arcs on the top margin and the vertical ones on the left margin receive weight 0. In the case of two sequences this correspondence between alignment and graph is quite obvious. In the chapter on multiple alignment the type II graph for more than two sequences will be treated formally which then also applies to the above as a special case.
The optimization algorithm can then be applied to the type II graph,
resulting in
Backtracking is done as in algorithm A-I. Algorithm A-II-s corresponds to the algorithm formulated by Sankoff [San72] and Sellers [Sel74]. For a deeper understanding of the dynamic programming algorithm, you may explore the procedure by means of the Alignment Applet and the other exercises:
![]() Alignment Applet ![]() Alignment Quiz: 1. ![]() Dynamic Programming in Java
This algorithm does not involve looking back along a whole row or column
as it does not allow for gap penalty functions. The number of steps to
execute it therefore is three times the number of matrix entries that are
inspected. This gives the algorithm a time
complexity O(n2), where n denotes the length of the longer of the two
sequences.
Comments are very welcome. luz@molgen.mpg.de |