- The structure of a protein is determined by its amino acid sequence.
- During evolution, the structure is more stable and changes much more slowly than the associated sequence.
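- Template recognition and initial alignment: search for templates with BLAST or FASTA. This is reliable in the safe zone, where the sequence identity is high enough for the length of the alignment,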
- Alignment correction: there are many pathological examples in which a pairwise alignment goes wrong, so a multiple sequence alignment should be used, for instance with CLUSTALW,
- Backbone generation: copy the coordinates of those template residues that are aligned with the model sequence. It is better to choose the template with the fewest errors in its PDB entry. Multiple-template modeling is also possible (Swiss-Model), although it is more complex,
- Loop modeling: conformational changes rarely occur within regular secondary structure elements (helices and strands), so insertions or deletions due to gaps should be placed in loops and turns. The resulting loop conformations are notoriously difficult to predict. Two approaches: knowledge-based (3D-Jigsaw, Insight, Modeller, WHAT IF) and energy-based (Monte Carlo or MD techniques),
- Side-chain modeling: conserved residues can be copied entirely from the template to the model only at high levels of sequence identity. In practice, side-chain placement is at least partly knowledge-based and relies on libraries of common rotamers. Trying every combination of rotamers leads to a combinatorial explosion, which is kept manageable by the fact that certain backbone conformations strongly favor certain rotamers. Prediction accuracy is low for residues on the surface,
- Model optimization: an iterative process that alternates rotamer prediction and energy minimization: predict the rotamers, then the resulting shifts in the backbone, then the rotamers for the new backbone, and so on, until the procedure converges. Two ways to achieve greater accuracy: quantum force fields and self-parameterizing (adaptive) force fields,
- Model validation: the number of errors in the model depends on the sequence identity between template and model and on the errors in the template itself. Two ways to estimate errors in a structure:
  - calculating the model's energy with a force field, which checks whether the bond lengths and bond angles are within normal ranges and whether there are many bumps (atoms that are too close together) in the model;
  - normality indices that describe how well a given characteristic of the model resembles the same characteristic in real structures: bond lengths, and bond and torsion angles; distribution of polar and apolar residues; potential of mean force; 3D distribution functions (considering the direction of atomic contacts).
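
The geometric side of the validation step can be illustrated with a minimal sketch. It is not the procedure of any particular validation program: the reference values (`IDEAL_CA_CA`, `SIGMA_CA_CA`, `BUMP_CUTOFF`), the function names, and the toy coordinates are all assumptions chosen for illustration, and only C-alpha atoms are considered. The sketch scores each consecutive C-alpha distance as a Z-score against a reference value (a simple normality index) and counts bumps between residues that are not neighbors in the chain.

```python
import math

# Illustrative reference values in angstroms (assumptions for this sketch).
IDEAL_CA_CA = 3.80   # typical distance between consecutive C-alpha atoms
SIGMA_CA_CA = 0.05   # assumed spread around that distance
BUMP_CUTOFF = 3.0    # assumed minimum distance between non-neighboring C-alphas


def ca_ca_z_scores(ca_coords):
    """Z-score of every consecutive C-alpha distance against the reference."""
    return [
        (math.dist(a, b) - IDEAL_CA_CA) / SIGMA_CA_CA
        for a, b in zip(ca_coords, ca_coords[1:])
    ]


def rms_z(z_scores):
    """Root-mean-square Z-score: a single number summarizing the deviations."""
    return math.sqrt(sum(z * z for z in z_scores) / len(z_scores))


def count_bumps(ca_coords, cutoff=BUMP_CUTOFF):
    """Count pairs of non-neighboring residues whose C-alphas are too close."""
    bumps = 0
    for i in range(len(ca_coords)):
        # Start at i + 2 so that directly bonded neighbors are skipped.
        for j in range(i + 2, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                bumps += 1
    return bumps


# Toy model: three well-spaced residues, then a fourth folded back onto the second.
toy_model = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (3.8, 1.0, 0.0),
]

z = ca_ca_z_scores(toy_model)
print("Z-scores:", [round(v, 2) for v in z])
print("RMS Z:", round(rms_z(z), 2))
print("Bumps:", count_bumps(toy_model))
```

In this toy model the last distance deviates from the reference and the fourth residue sits too close to the second, so both checks flag the same region. A real validation would use all atoms and reference distributions derived from high-quality experimental structures.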