previous section previous page next page next section

Online Lectures on Bioinformatics


Suboptimal Alignments

Suboptimal points

Given two sequences with an optimal alignment score s. An is a type I alignment whose score is not less than . Any point on the path in the associated type I graph corresponding to such an alignment is called an . The set of all -suboptimal points will be called .
The following simple observation leads to a method to calculate the set of suboptimal points. Let a residue pair in the middle of a type I alignment be fixed. The score of the entire alignment is the sum of the scores from its left end to (and including) a pair plus the score of the remainder of the alignment (not counting the weight of the pair). The matrix L(i,j) that was defined in Algorithm A-I of Section 2.2 contains exactly the scores for alignments from the beginning of the sequences to a matrix point. In the current context we have to call this matrix L- but calculate it according to the same rule as before

Initial settings depend on whether end gaps are penalized or not. Generally L-(1,i) (L-(j,1)) is w(1,i) (w(j,1)) thus not penalizing gaps at the left end.

The score of an alignment proceeding from a matrix point to the end can be calculated by inverting the order of execution in the above algorithm (this corresponds to performing the same procedure on the dual of the type I graph). The resulting matrix L+ is calculated by

The order of computation is from (|s|, |t|) back along each row down to (1,1). Initialization has to be done in the same way as for the computation of L-, e.g. and . Adding L-(i,j) + L+(i,j) counts the score for the pair (i,j) twice. Therefore the score of the best type I alignment going through (i,j)is

L-(i,j) + L+(i,j)-w(i,j).

Based on this observation the -suboptimal points can be calculated as
( )

For linear gap penalty functions the computation of L- and L+ can be accelerated using the same improvement as in algorithm A-II-l of Chapter 2. An algorithm similar to SOP was proposed by Sellers ([Sel79],[Sel80]) for type II alignments and by Boswell and McLachlan [BoM84] for type I alignments.

It is easy to see that the maximal value assumed in must be the optimal alignment score s This value is assumed along all optimal alignments. On the other hand, not every alignment that consists of points in S0 is an optimal alignment . Neither is every alignment made up of points in , , itself -suboptimal. This gives rise to the next algorithm due to M. Waterman ([Wat83],[ByW84]) for finding all -suboptimal alignments.

Comments are very welcome.