Algorithms for the comparison of two sequences
Minimum distance alignment score as a metric on sequences
When the function d on the pairs of letters of the alphabet is a metric distance, defining a distance on the sequences as the score of the minimum distance alignment actually yields a metric on the space of sequences. This has been suggested by Ulam [Ula72] and was shown for single indels by Sellers [Sel74], and for general gap functions by Waterman et al. [WSB76]. Without giving the proof here, the argument for the first case can still be easily explained. The crucial point is to show that the distance on the sequences fulfills the triangle inequality, that is for s,t,u sequences over the alphabet . A minimum distance alignment can be viewed as the minimum number of changes that are necessary to convert one sequence into another. Assume that for three sequences s,t,u, d(s,u) + d(u,t) would be less than d(s,t). Then sequence u would be a possible intermediate sequence through which to go while changing s to t and the distance between them can only be d(s,u) + d(u,t) in the worst case.
This document was created by means of the LaTeX2HTML translator Version 98.1p1 release (March 2nd, 1998).
Comments are very welcome.