previous section previous page next page next section

Online Lectures on Bioinformatics


Algorithms for SP-optimal multiple alignments


Regarding more than two sequences, the generalization of a pairwise alignment leads to a multiple alignment which may be represented by a matrix:

        - A C C G T C

        T A G - G T A

        C A C C C T -

The number of rows in the matrix corresponds to the number of sequences, the number of columns is the resulting length of the alignment.

Formally this means:

Given the sequences s(1), s(2),...,s(n) over an alphabet $ \mathcal{A} $. Then the matrix

\begin{displaymath}A = (A_{ij})_{1 \leq i \leq n,1 \leq j \leq N} = (\vec{x}_{j})_{1 \leq j \leq N} \end{displaymath}

representing the multiple alignment has to fulfill the following conditions:

$ A_{ij} \in \mathcal{A} \ \cup \ \{-\} $
Omitting the blanks, the i th column reproduces the sequence s(i).
There is no row consisting of blanks only.

Comments are very welcome.