Online Lectures on Bioinformatics

Protein Structure

Secondary structure prediction

Linus Pauling [1] already suggested that amino acid chains could assume regular local structures, namely alpha helices and beta strands. In between these secondary structure elements there are turns or loops. There is a long tradition of attempts to predict local secondary structure from sequence. State-of-the-art secondary structure prediction generally relies on the frequencies of occurrence of k-tuples in particular secondary structures. Based on these statistics, predictions can be made for a new sequence.

Chou and Fasman [6] apply a basic log-odds approach to the occurrences of single amino acid residues in the sequence, while the GOR method [7], which is based on information theory, uses all possible pair frequencies within a sliding window. As long as one restricts the problem to prediction for a single sequence, there seems to be an inherent limit in prediction accuracy of around 65%. Multiply aligned sequences offer a means to surpass this limit. The PHD method [8] uses evolutionary information from multiple sequence alignments in a multi-level system of neural networks. According to the authors, the average accuracy of the PHD method is greater than 72%.
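To make the log-odds idea concrete, here is a minimal Python sketch. It is not the published Chou-Fasman or GOR procedure: it only counts how often each single residue occurs in each structural state in a labelled training set, converts the counts to log-odds scores, and predicts the state with the highest summed score in a small window. The function names, the three-state alphabet H/E/C, the pseudocount and the window size are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_odds_table(sequences, structures, pseudocount=1.0):
    """Return {(residue, state): log-odds score} from paired strings.

    sequences and structures are equally long strings, one structure
    label (e.g. H, E or C) per residue. Hypothetical helper.
    """
    pair_counts, res_counts, state_counts = Counter(), Counter(), Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for residue, state in zip(seq, ss):
            pair_counts[(residue, state)] += 1
            res_counts[residue] += 1
            state_counts[state] += 1

    total = sum(res_counts.values())
    n_cells = len(res_counts) * len(state_counts)
    table = {}
    for residue in res_counts:
        for state in state_counts:
            # Joint frequency, smoothed so unseen pairs do not give log(0).
            p_joint = (pair_counts[(residue, state)] + pseudocount) / (
                total + pseudocount * n_cells
            )
            p_residue = res_counts[residue] / total
            p_state = state_counts[state] / total
            # Positive score: residue is over-represented in this state.
            table[(residue, state)] = math.log(p_joint / (p_residue * p_state))
    return table


def predict(seq, table, states="HEC", window=5):
    """Assign each position the state with the highest summed window score."""
    half = window // 2
    predicted = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        best = max(
            states,
            key=lambda s: sum(table.get((seq[j], s), 0.0) for j in range(lo, hi)),
        )
        predicted.append(best)
    return "".join(predicted)
```

A real method would be trained on a large set of proteins with known structure; the sketch only shows where the statistics come from and how they turn into a per-residue prediction.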
exercise 2
There is a fairly well-accepted method of validating a new secondary structure prediction method.
Most authors of methods therefore report the success rate of their procedure.
The GOR and PHD methods additionally supply the user with an estimate of how reliable a prediction
is in a particular region. An obvious approach would be to overlay the output from
several secondary structure prediction programs, but it is doubtful whether
this strategy will actually improve the situation. Only if the individual
methods are sufficiently different does one actually gain information through
such an approach.
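
The two ideas in this paragraph, a reported success rate and an overlay of several predictions, can be sketched in a few lines of Python. The sketch assumes that the success rate is the fraction of residues whose predicted state matches the observed one (often called per-residue or Q3 accuracy for three states) and that the overlay is a simple per-position majority vote; both function names and the tie-breaking rule are hypothetical choices, not taken from any of the programs mentioned.

```python
from collections import Counter


def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("prediction and observation lengths differ")
    if not observed:
        raise ValueError("empty structure strings")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def consensus(predictions, tie_state="C"):
    """Overlay several predictions by majority vote at each position.

    Ties are resolved to tie_state (coil by default), an arbitrary choice.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        top_state, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append(tie_state if tied else top_state)
    return "".join(result)
```

If all input predictions are identical, the vote simply returns that prediction unchanged, which illustrates the point made above: an overlay can only add information when the individual methods differ.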
Comments are very welcome. luz@molgen.mpg.de 