
Online Lectures on Bioinformatics


Protein Structure

Secondary structure prediction

Linus Pauling [1] already suggested that amino acid chains could assume regular local structures, namely alpha helices and beta strands. Between these secondary structure elements lie turns or loops. There is a long tradition of attempts to predict local secondary structure from sequence. State-of-the-art secondary structure prediction generally tabulates the frequencies with which k-tuples of residues occur in particular secondary structures. Based on these statistics, a prediction can be made for a new sequence.

Chou and Fasman [6] apply a basic log-odds approach to the occurrences of single amino acid residues in the sequence, while the GOR method [7], which is based on information theory, uses all possible pair frequencies within a sliding window.
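The single-residue log-odds idea can be illustrated with a minimal sketch. This is not the published Chou-Fasman procedure (which adds further rules on top of the propensities); it only shows how a log-odds score log( P(residue, state) / (P(residue) P(state)) ) could be estimated from sequences with known secondary structure. The function names `log_odds_table` and `predict`, the pseudocount, the state letters `H` and `E`, and the toy training data are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_odds_table(sequences, structures, pseudocount=1.0):
    """Estimate log-odds scores for every (residue, state) pair.

    sequences  -- amino acid strings (hypothetical training data)
    structures -- strings of equal length with one state letter per residue
    """
    pair_counts = Counter()
    residue_counts = Counter()
    state_counts = Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure must have equal length")
        for residue, state in zip(seq, ss):
            pair_counts[(residue, state)] += 1
            residue_counts[residue] += 1
            state_counts[state] += 1

    total = sum(residue_counts.values())
    n_cells = len(residue_counts) * len(state_counts)
    table = {}
    for residue in residue_counts:
        for state in state_counts:
            # Joint probability with additive smoothing so that unseen
            # pairs do not produce log(0).
            p_joint = (pair_counts[(residue, state)] + pseudocount) / (
                total + pseudocount * n_cells
            )
            p_residue = residue_counts[residue] / total
            p_state = state_counts[state] / total
            table[(residue, state)] = math.log(p_joint / (p_residue * p_state))
    return table


def predict(sequence, table, states="HE"):
    """Assign each residue the state with the highest log-odds score.

    Pairs missing from the table score 0.0 (no preference).
    """
    return "".join(
        max(states, key=lambda s: table.get((residue, s), 0.0))
        for residue in sequence
    )


if __name__ == "__main__":
    # Toy data, invented for this example only.
    toy_table = log_odds_table(["AAAAVVVV"], ["HHHHEEEE"])
    print(predict("AVA", toy_table))
```

A positive score means the residue occurs in that state more often than expected by chance, a negative score means less often. Real methods smooth such per-residue decisions over neighbouring positions; that step is left out here.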

As long as one restricts the problem to prediction for a single sequence, there seems to be an inherent limit in prediction accuracy of around 65%. Multiply aligned sequences offer a means to surpass this limit. The PHD method [8] uses evolutionary information from multiple sequence alignments in a multi-level system of neural networks. According to the authors, the average accuracy of the PHD method is greater than 72%.
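To make "evolutionary information from multiple sequence alignments" more concrete, the following sketch computes a per-column residue frequency profile from an alignment. Such a profile is one common way to turn an alignment into numerical input for a predictor; whether it matches the exact input encoding of the PHD method is an assumption, and the neural networks themselves are not shown. The function name `column_profile` and the toy alignment are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_profile(alignment, alphabet=AMINO_ACIDS, gap="-"):
    """Return one dict of residue frequencies per alignment column.

    alignment -- list of equal-length aligned sequences
    Gap characters are ignored when computing the frequencies.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and of equal length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != gap)
        n = sum(counts[a] for a in alphabet)
        # A column consisting only of gaps gets an all-zero frequency vector.
        profile.append({a: (counts[a] / n if n else 0.0) for a in alphabet})
    return profile


if __name__ == "__main__":
    # Toy alignment, invented for this example only.
    for position, freqs in enumerate(column_profile(["ACD", "ACE", "A-E"])):
        observed = {a: round(f, 2) for a, f in freqs.items() if f > 0}
        print(position, observed)
```

Columns that are strongly conserved yield a profile dominated by a single residue, whereas variable columns spread their weight over several residues. A single sequence cannot supply this kind of information.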

exercise 2

There is a fairly well-accepted method of validating a new secondary structure prediction method. Most authors therefore report the success rate of their procedure. The GOR and PHD methods additionally supply the user with an estimate of how reliable a prediction is in a particular region. An obvious approach would be to overlay the output from several secondary structure prediction programs, but it is doubtful whether this strategy actually improves the situation. Only if the individual methods are sufficiently different does one actually gain information through such an approach.
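Both the success rate and the overlay idea can be sketched in a few lines. The success rate is taken here as the fraction of residues whose predicted state matches the observed one, and the overlay is a simple per-position majority vote. The function names, the three-letter state alphabet (`H`, `E`, `C`), the tie-breaking rule and the toy predictions are assumptions for illustration, not the output of any real program.

```python
from collections import Counter


def per_residue_accuracy(predicted, observed):
    """Fraction of positions where predicted and observed states agree."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def consensus(predictions, tie_state="C"):
    """Overlay several predictions by majority vote at each position.

    If the two most frequent states are tied, fall back to tie_state
    (an arbitrary choice made for this sketch).
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("predictions must be non-empty and of equal length")
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result.append(tie_state)
        else:
            result.append(ranked[0][0])
    return "".join(result)


if __name__ == "__main__":
    # Invented example: three predictions, each wrong at a different position.
    observed = "HHHHCCEEEE"
    methods = ["HHHCCCEEEE", "HHHHCCCEEE", "HHHHHCEEEE"]
    for m in methods:
        print(m, per_residue_accuracy(m, observed))
    combined = consensus(methods)
    print(combined, per_residue_accuracy(combined, observed))
```

In this toy case the three predictions err at different positions, so the vote corrects every error. If the methods made the same mistakes, the consensus would simply repeat them, which is why the overlay only helps when the methods are sufficiently different.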

Comments are very welcome.