Online Lectures on Bioinformatics
|
Phylogenetic Trees and Multiple AlignmentsApproximating additive metricsLet a distance matrix on the species be given. In practice, the distance matrix between molecular sequences will not be additive. A tree T is searched whose distance matrix approximates the given one. Although the true difficulty lies in finding the best topology one may formulate two alternative formalizations of the approximation task when the topology is given.
Let
Here the above example again:
![]()
The distance matrix on a set of objects constitutes
an additive metric
whereat the distance vector d contains the distances between the objects.
As mentioned above, the distance matrices on biological data
generally constitute biased additive metrics.
Thus the general equation for an arbitrary distance vector
e
and for a given tree topology
Alternatively, Waterman et al. have formulated a linear
program. The objective function of the linear program is similar in spirit
to the Steiner tree setup: the overall length of the tree is minimized. One
constraint is that the edges should have positive lengths. The other
constraint is derived from the fact that the alignment distance between two
sequences underestimates the number of changes that have occurred between
the two sequences during their development. This results in the inequality
Comments are very welcome. luz@molgen.mpg.de |