Structure prediction in 1D
Limited computing resources and experimental inaccuracies prevent prediction of protein structure from first principles. Therefore, the only successful structure prediction tools are knowledge-based, combining statistical theory with empirical rules.
Secondary Structure Prediction Methods
- Basic concept: segments of consecutive residues have preferences for certain secondary structure states (helix, strand, or coil/loop), which makes prediction a pattern recognition problem. Approaches include physicochemical principles, rule-based devices, expert systems, graph theory, linear and multilinear statistics, nearest-neighbor algorithms, molecular dynamics, and neural networks. The main limitation is the use of only local information, which is estimated to account for roughly 65% of secondary structure formation. Using evolutionary information is key to improving predictions.
- Programs: PHD, JPred2 (JNet, NSSP, PREDATOR, PHD), PSIPRED, SSPro2, HMMSTR/I, etc.
- Specialized methods: coiled-coil prediction. A coiled coil is a bundle of several helices with a characteristic side-chain packing geometry ("knobs-into-holes"). Program: COILS.
- Solvent accessibility. Basic concept: try different arrangements and assess them by predicting the extent to which a residue embedded in a protein structure is accessible to the solvent. Programs: PHD, PROFphd, JPred2 server.
- Transmembrane proteins still represent a challenge. They do not crystallize readily and are hardly tractable by NMR spectroscopy. Prediction is simplified by the fact that the lipid bilayer of the membrane reduces the degrees of freedom, making the prediction almost a 2D problem.
- Basic concept: transmembrane (TM) helices are predominantly apolar and between 12 and 35 residues long; globular regions between membrane helices are typically shorter than 60 residues; and most transmembrane helix (TMH) proteins have a specific distribution of the positively charged amino acids arginine and lysine (the "positive-inside rule").
- Programs: ToPred2, MEMSAT, TMAP, PHD, TMHMM, HMMTOP
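The transmembrane heuristics noted above (apolar segments of 12 to 35 residues, plus the distribution of arginine and lysine in the flanking regions) can be illustrated with a minimal hydropathy scan. This is only a sketch, not how any of the listed programs work: the Kyte-Doolittle scale, the 11-residue window, the 1.6 threshold, and the toy sequence are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale; others exist).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def smoothed_hydropathy(seq, window=11):
    """Map each residue index that has a full centered window to its mean hydropathy.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    half = window // 2
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    return {
        center: sum(values[center - half:center + half + 1]) / window
        for center in range(half, len(seq) - half)
    }


def candidate_tm_segments(seq, window=11, threshold=1.6, min_len=12, max_len=35):
    """Return (start, end) half-open index pairs of apolar stretches.

    A residue counts as apolar when its smoothed hydropathy reaches the
    threshold; runs outside the 12-35 residue length range are discarded.
    """
    scores = smoothed_hydropathy(seq, window)
    segments, start = [], None
    for i in range(len(seq) + 1):  # one step past the end closes an open run
        above = scores.get(i, float("-inf")) >= threshold
        if above and start is None:
            start = i
        elif not above and start is not None:
            segments.append((start, i))
            start = None
    return [(s, e) for s, e in segments if min_len <= e - s <= max_len]


def count_positive(fragment):
    """Count lysine (K) and arginine (R) residues in a sequence fragment."""
    return sum(fragment.count(aa) for aa in "KR")


# Hypothetical toy sequence: polar flanks around a 19-residue apolar core.
toy_seq = "MKRKSD" + "LLIVALLVIAGLLVLIAVL" + "EDGSEQTN"
segments = candidate_tm_segments(toy_seq)
start, end = segments[0]
n_side = count_positive(toy_seq[:start])
c_side = count_positive(toy_seq[end:])
print(segments, n_side, c_side)
```

On this toy sequence the scan finds a single candidate segment, and the N-terminal flank carries all the positive charges, so the positive-inside rule would place that side in the cytoplasm. The centered window tends to underestimate the helix ends, which is one reason real predictors go beyond a plain hydropathy threshold.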
- PHDsec, PROFsec: neural-network-based prediction of secondary structure, accessibility and TMH.
- PROF: multiple alignments and other characteristics from databases.
- PSIPRED: based on profiles created by PSI-BLAST and neural networks.
- SAM-T99: neural network and HMM.
- SCRATCH: uses SSPro (recursive bidirectional neural networks).
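Several of the methods above combine evolutionary information (profiles from multiple alignments) with a local window of consecutive residues as input to a neural network. The sketch below shows one way such input could be assembled; it is not the encoding used by any specific program. The toy alignment, the window half-width, and the function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_profile(alignment):
    """Return per-column amino-acid frequencies for equal-length aligned sequences.

    Gap characters and other non-standard symbols are ignored when counting.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append([counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS])
    return profile


def window_features(profile, center, half_width=6):
    """Flatten the profile rows in a window around `center` into one vector.

    Window positions that fall outside the sequence are zero-padded.
    """
    zero_row = [0.0] * len(AMINO_ACIDS)
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        row = profile[pos] if 0 <= pos < len(profile) else zero_row
        features.extend(row)
    return features


# Hypothetical toy alignment of three homologous fragments.
toy_alignment = ["MKTAYIAKQR", "MKSAYLAKQR", "MRTAFIAKHR"]
profile = column_profile(toy_alignment)
features = window_features(profile, center=0, half_width=2)
print(len(profile), len(features))
```

Each residue then yields one fixed-length vector (window size times 20 frequencies) that a classifier could map to helix, strand, or coil. A conserved column gives a sharp distribution and a variable column a flat one, which is the extra signal a single sequence cannot provide.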
Personal working notes extracted from B. Rost, "Prediction in 1D: Secondary Structure, Membrane Helices, and Accessibility" in Structural Bioinformatics