AMBER Score

AMBER score can be used to rescore ligands that have already been scored using one of the faster scoring protocols mentioned above. AMBER score implements MM-GB/SA simulations with the traditional all-atom AMBER force field (Pearlman, et al. Comp. Phys. Commun. 1995) for protein atoms and the general AMBER force field (GAFF; Wang, et al. J. Comp. Chem. 2004) for ligand molecules. The interaction between the ligand and the protein is represented by electrostatic and van der Waals energy terms, and the solvation energy is calculated using a Generalized Born (GB) solvation model. The user can choose one of the following GB models: (i) the Hawkins, Cramer and Truhlar pairwise GB model with parameters described by Tsui and Case (gb=1) (Tsui, et al. Biopolymers 2001), (ii) the Onufriev, Bashford and Case model, GB(OBC) (gb=2) (Onufriev, et al. Proteins 2004), and (iii) a modified GB(OBC) model (gb=5) (Onufriev, et al. Proteins 2004). The surface area term is derived using the fast LCPO algorithm (Weiser, et al. J. Comp. Chem. 1999).

The AMBER score is calculated as:

E(Complex) - [E(Receptor) + E(Ligand)],

where E(Complex), E(Receptor) and E(Ligand) represent the energies of the complex, receptor and ligand, respectively.

During the AMBER score calculation, the input coordinates and parameters of the complex, ligand and receptor are read in. A conjugate gradient minimization is then performed to remove bad contacts. This is followed by an MD simulation (Langevin dynamics at constant temperature) and a short minimization to obtain the final energy of the system. The user can specify the number of minimization and MD simulation steps in the dock input file. During the final energy calculation, a surface area term is also added to the energy. For multiple ligands and a single receptor, these steps are repeated for each ligand and the corresponding complex, while the energy of the receptor is calculated only once and reused for the rest of the scoring procedure.
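The bookkeeping described above can be sketched in Python. This is only an illustration under stated assumptions, not DOCK's implementation: the function names are hypothetical, the three energy callables stand in for the minimization, MD and final energy evaluation of each system, and the energies in the usage example are invented.

```python
def amber_score(e_complex, e_receptor, e_ligand):
    """Return E(Complex) - [E(Receptor) + E(Ligand)]."""
    return e_complex - (e_receptor + e_ligand)


def rescore(ligand_names, receptor_energy, ligand_energy, complex_energy):
    """Score several ligands against a single receptor.

    The three callables are hypothetical placeholders for the
    minimize / MD / minimize protocol applied to each system.
    """
    # The receptor energy is evaluated once and reused for every ligand.
    e_receptor = receptor_energy()
    scores = {}
    for name in ligand_names:
        scores[name] = amber_score(
            complex_energy(name), e_receptor, ligand_energy(name)
        )
    return scores


# Toy usage with made-up energies (arbitrary units).
receptor_calls = []


def toy_receptor_energy():
    receptor_calls.append(1)  # record how often the receptor is evaluated
    return -1000


toy_ligand = {"lig1": -20, "lig2": -35}
toy_complex = {"lig1": -1060, "lig2": -1050}

scores = rescore(toy_ligand, toy_receptor_energy, toy_ligand.get, toy_complex.get)
print(scores)
```

A more negative score corresponds to a more favorable interaction energy in this toy example; the receptor callable is invoked a single time regardless of the number of ligands.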

In addition to ligand flexibility, AMBER score allows part of the receptor to be flexible during minimization and MD simulation, in order to reproduce so-called "induced fit". Users can choose from three movable region options: Ligand, Everything, and NAB atom expression. With the Ligand option, only the ligand is allowed to move during minimization and MD simulation. With the Everything option, all atoms in the protein and the ligand are allowed to move. The NAB atom expression option selects the movable atoms using the atom expression syntax of the program Nucleic Acid Builder (NAB).

A NAB atom expression is a character string containing one or more patterns that match a set of atom names in a molecule. An atom expression contains three parts separated by colons: the strand, residue and atom parts. Each part is a comma-separated list of patterns (limited regular expressions); the residue part may also contain number ranges, where a range is a number or a pair of numbers separated by a dash. Several atom expressions may be placed in a single character string by separating them with the vertical bar (|). Patterns in atom expressions are similar to Unix shell expressions. Each pattern is a sequence of one or more single-character patterns and/or stars. The star matches zero or more occurrences of any single character.

Some examples of NAB atom expressions are:

* :SER:
#Select all atoms in any residue named SER. All three parts are present but both the strand and atom parts are empty. The atom expression :SER selects the same set of atoms.
* ::C,CA,N,O
#Select all atoms with names C, CA, N or O in all residues in all strands (typically the peptide backbone).
* 1:1-10,13:CA,C,N
#Select all atoms named CA, C or N in residues 1-10 and 13 in strand 1.
* ::C*[^1]
#The [^1] is an example of a negated character class. It matches any character in the last position except 1. In this case, it will match all the atoms starting with C, such as CA, CB, CG2, but not those ending with 1, such as CD1, CE1.
* 2::|1:50,100:O*,N*
#Select all atoms in strand 2, plus all atoms whose names start with O or N in residues 50 and 100 of strand 1. Note that the vertical bar separates the two atom expressions.
* 4::|2::|1::
#Select all atoms in strands 4, 2 and 1.
* :: or :
#Select all atoms in the molecule.
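The matching rules illustrated by these examples can be approximated with a short Python sketch. It is a simplified reading of the syntax described here, not NAB's own matcher: the function names are hypothetical, Python's fnmatch module is used as a stand-in for the shell-style patterns, each pattern is assumed to match the whole name, and a strand is assumed to be identified by a string such as "1".

```python
from fnmatch import fnmatchcase


def _pattern_matches(value, pattern):
    # fnmatch writes a negated character class as [!...] rather than [^...].
    return fnmatchcase(value, pattern.replace("[^", "[!"))


def _name_part_matches(value, part):
    # An empty part places no restriction on the strand or atom name.
    if not part:
        return True
    return any(_pattern_matches(value, pat) for pat in part.split(","))


def _residue_part_matches(res_name, res_num, part):
    if not part:
        return True
    for item in part.split(","):
        low, dash, high = item.partition("-")
        if low.isdigit() and (not dash or high.isdigit()):
            # A range is a single number or a pair of numbers joined by a dash.
            if int(low) <= res_num <= int(high if dash else low):
                return True
        elif _pattern_matches(res_name, item):
            return True
    return False


def atom_selected(expr, strand, res_name, res_num, atom_name):
    """Return True if the atom matches any '|'-separated atom expression."""
    for single in expr.split("|"):
        # Missing trailing parts (as in ':SER') are treated as empty.
        strand_part, res_part, atom_part = (single.split(":") + ["", "", ""])[:3]
        if (
            _name_part_matches(strand, strand_part)
            and _residue_part_matches(res_name, res_num, res_part)
            and _name_part_matches(atom_name, atom_part)
        ):
            return True
    return False


print(atom_selected("::C*[^1]", "1", "LEU", 3, "CG2"))
print(atom_selected("::C*[^1]", "1", "LEU", 3, "CD1"))
```

The sketch does not handle commas inside character classes or any NAB features beyond those described in this section.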

All the input files, such as the prmtop, frcmod and amber.pdb files, must be generated before running AMBER score. A Perl script, prepare_amber.pl (located in the bin directory), is provided for this purpose. Usage of prepare_amber.pl:

{BIN}/prepare_amber.pl ligand_mol2_file receptor_PDB_file

For example, if lig.mol2 is the ligand mol2 file and rec.pdb is the receptor PDB file, use:

{BIN}/prepare_amber.pl lig.mol2 rec.pdb

The prepare_amber.pl script can also read a mol2 file containing multiple ligands (usually the output from a previous DOCK run) and generate input files readable by AMBER score. prepare_amber.pl calls programs such as antechamber, which calculates the AM1-BCC charges for the ligands, and tleap, which assigns the parm94 parameter set to protein atoms and GAFF parameters to ligand atoms. See the tutorials for information on how to use the script to generate the input files.

In the DOCK scoring hierarchy, AMBER score comes after the other scoring functions and should be used as the secondary scoring function. Because all-atom MD simulations are intrinsically expensive, it is slower than the other scoring protocols mentioned above, so it is not advisable to use AMBER score as a primary score.

WARNING: At this point, verbose output from the amber_score function is printed to the screen (standard out). It cannot be turned off or directed to the output file defined by the -o option. If you would like to collect the verbose output, do not use the -o flag. Instead, use the command below:

dock6 -i dock.in > dock.out

If you do not want to collect the output, it will simply print harmlessly to the screen. We are aware that this situation is inconvenient and will work to fix it in the next release.

NOTE: The following parameter definitions will use the format below:

parameter_name [default] (value):
#description

In some cases, parameters are only needed (questions will only be asked) if the parameter above is enforced. These parameters are indicated below by additional indentation.

AMBER Score Parameters

* amber_score_primary [no] (yes, no):
#Flag to perform amber scoring as the primary scoring function
* amber_score_secondary [no] (yes, no):
#Flag to perform amber scoring as the secondary scoring function. This is temporarily deprecated, and using input parameter amber_score_secondary causes program termination. The recommended protocol is to perform two DOCK runs with the second run specifying amber_score as the primary_score.
    o receptor_file_prefix [rec] (string):
    #Prefix of the receptor files. Use the prefix that was used in the prepare_amber.pl input file preparation step.
    o amber_score_movable_region [ligand] (everything, ligand, nab_atom_expression):
    #The region that is allowed to move during scoring.
        + nab_atom_expression [1::] (string):
        #NAB atom expression describing the movable region
    o amber_score_gb_model [5] (int):
    #GB model to be used
    o amber_score_md_steps [3000] (int):
    #Number of molecular dynamics steps to be performed
    o amber_score_minimization_cycles [100] (int):
    #Number of conjugate gradient minimization cycles to be performed
    o amber_score_nonbonded_cutoff [18] (int):
    #Non-bonded cutoff in Angstrom units for the energy calculation
    o amber_score_temperature [300] (int):
    #Temperature at which MD should be performed
    o amber_score_verbose [yes] (string):
    #Detailed information on the screen/output file
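
As an illustration of how these parameters fit together, the Python sketch below renders them as lines for a dock input file. It assumes the input file uses one whitespace-separated "parameter_name value" pair per line; the helper name and dictionary are hypothetical and are not part of DOCK. The values are the defaults listed above, except that amber_score_primary is set to yes, and amber_score_secondary is omitted because it is described above as causing program termination.

```python
# Defaults taken from the parameter list above (amber_score_primary enabled).
AMBER_SCORE_DEFAULTS = {
    "amber_score_primary": "yes",
    "receptor_file_prefix": "rec",
    "amber_score_movable_region": "ligand",
    "amber_score_gb_model": 5,
    "amber_score_md_steps": 3000,
    "amber_score_minimization_cycles": 100,
    "amber_score_nonbonded_cutoff": 18,
    "amber_score_temperature": 300,
    "amber_score_verbose": "yes",
}


def render_amber_score_params(overrides=None):
    """Return AMBER score parameters as 'name value' lines (hypothetical helper)."""
    params = dict(AMBER_SCORE_DEFAULTS)
    params.update(overrides or {})
    # nab_atom_expression is only relevant when the movable region asks for it.
    nab_expr = params.pop("nab_atom_expression", "1::")
    if params["amber_score_gb_model"] not in (1, 2, 5):
        raise ValueError("amber_score_gb_model must be 1, 2 or 5")
    lines = []
    for name, value in params.items():
        lines.append(f"{name:<40}{value}")
        if name == "amber_score_movable_region" and value == "nab_atom_expression":
            # Dependent parameter, written right after the one that enables it.
            lines.append(f"{'nab_atom_expression':<40}{nab_expr}")
    return "\n".join(lines)


print(render_amber_score_params({"amber_score_md_steps": 1000}))
```

Passing {"amber_score_movable_region": "nab_atom_expression", "nab_atom_expression": ":1-10:"} as overrides adds the dependent nab_atom_expression line directly after the movable region line.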