Revision as of 17:01, 27 January 2021 by Frodo (talk | contribs) (→‎Version 3.6 - circa 2010 to present)

The History of DOCK, longish version

Version 1.0/1.1 - circa 1982-1991

Authors: Robert Sheridan, Renee DesJarlais, Irwin Kuntz

Language: Fortran

The program DOCK is an automatic procedure for docking a molecule into a receptor site. The receptor site is characterized by centers, which may come from SPHGEN or any other source. The molecule being docked is characterized by ligand centers, which may be its non-hydrogen atoms or volume-filling spheres calculated in SPHGEN. The ligand centers and receptor centers are matched by comparing ligand-center/ligand-center distances with receptor-center/receptor-center distances. A set of ligand centers matches a set of receptor centers if all the internal distances agree to within distance_tolerance. Ligand-receptor pairs are added to the set until at least nodes_minimum pairs have been found. At least three pairs must be found to uniquely determine a rotation/translation matrix that will orient the ligand in the receptor site. A least-squares fitting procedure is used (Ferro, et al. Acta Cryst. A, 1977). Once an orientation has been found, it is evaluated by any of several scoring functions. DOCK may be used to explore the binding modes of an individual molecule, or to screen a database of molecules to identify potential ligands.
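The distance-matching step described above can be illustrated with a minimal sketch. This is not the original Fortran code: the function names pair_fits and find_matches are hypothetical, the search is a plain exhaustive enumeration, and only the parameter names distance_tolerance and nodes_minimum come from the text. The least-squares fit that turns a match into an orientation is not shown.

```python
from math import dist


def pair_fits(match, new_pair, ligand_centers, receptor_centers, distance_tolerance):
    """Check a new (ligand, receptor) pair against every pair already in the match.

    The pair fits if each ligand-ligand distance agrees with the corresponding
    receptor-receptor distance to within distance_tolerance.
    """
    l2, r2 = new_pair
    for l1, r1 in match:
        d_lig = dist(ligand_centers[l1], ligand_centers[l2])
        d_rec = dist(receptor_centers[r1], receptor_centers[r2])
        if abs(d_lig - d_rec) > distance_tolerance:
            return False
    return True


def find_matches(ligand_centers, receptor_centers, distance_tolerance, nodes_minimum=3):
    """Enumerate sets of nodes_minimum ligand-receptor pairs whose internal distances agree."""
    results = []

    def extend(match):
        if len(match) == nodes_minimum:
            results.append(list(match))
            return
        # Add ligand centers in increasing index order so each set is generated once.
        start = match[-1][0] + 1 if match else 0
        used_receptors = {r for _, r in match}
        for l in range(start, len(ligand_centers)):
            for r in range(len(receptor_centers)):
                if r in used_receptors:
                    continue
                if pair_fits(match, (l, r), ligand_centers, receptor_centers,
                             distance_tolerance):
                    extend(match + [(l, r)])

    extend([])
    return results


# Hypothetical data: a 3-4-5 triangle of ligand centers, and the same triangle
# translated by (10, 10, 10) among the receptor centers, plus one decoy center.
ligand = [(0, 0, 0), (3, 0, 0), (0, 4, 0)]
receptor = [(10, 10, 10), (13, 10, 10), (10, 14, 10), (20, 20, 20)]
matches = find_matches(ligand, receptor, distance_tolerance=0.5)
```

Because the triangle is scalene, only one assignment of ligand centers to receptor centers keeps all three distances within tolerance, so the decoy center never enters a match.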

Version 2.0 - circa 1991-1993

Authors: Brian Shoichet, Dale Bodian, Irwin Kuntz

DOCK version 2.0 was written to give the user greater control over the thoroughness of the matching procedure, and thus over the number of orientations found and the CPU time required (Shoichet, et al. J. Comp. Chem. 1992). In addition, certain algorithmic shortcomings of earlier versions were overcome. Versions 2.0 and higher are particularly useful for macromolecular docking (Shoichet, et al. J. Mol. Biol. 1991) and for applications that demand detailed exploration of ligand binding modes. In these cases, users are encouraged to run CLUSTER in conjunction with SPHGEN and DOCK.

To allow for greater control over searches of orientation space, the ligand and receptor centers are pre-organized according to their internal distances. Starting with any given center, all the other centers are presorted into “bins” based on their distance to the first center. All centers are tried in turn as “first” positions, and all the points in a bin which has been chosen for matching are tried sequentially. Ligand and receptor bins are chosen for matching when they have the same distance limits from their respective “first” points. The number of centers in each bin determines how many sets of points in the receptor and the ligand will ultimately be compared. In general, the wider the bins, the greater the number of orientations generated. Thus, the thoroughness of the search is under user control.
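The effect of bin width on the search can be seen in a small sketch. It assumes fixed-width, non-overlapping bins, which is a simplification chosen for illustration. The helper names bin_by_distance and candidate_pairs and the coordinates are hypothetical and are not taken from the DOCK source.

```python
from math import dist


def bin_by_distance(centers, first_index, bin_width):
    """Group center indices by their distance from centers[first_index].

    Returns a dict mapping bin number k to the indices whose distance d
    satisfies k * bin_width <= d < (k + 1) * bin_width.
    """
    bins = {}
    for i, center in enumerate(centers):
        if i == first_index:
            continue
        k = int(dist(centers[first_index], center) // bin_width)
        bins.setdefault(k, []).append(i)
    return bins


def candidate_pairs(ligand_bins, receptor_bins):
    """Pair ligand and receptor centers that fall in bins with the same distance limits."""
    pairs = []
    for k, ligand_indices in ligand_bins.items():
        for r in receptor_bins.get(k, []):
            for l in ligand_indices:
                pairs.append((l, r))
    return pairs


# Hypothetical centers placed along a line for easy inspection.
ligand = [(0, 0, 0), (1.1, 0, 0), (2.5, 0, 0), (4.2, 0, 0)]
receptor = [(0, 0, 0), (1.2, 0, 0), (3.1, 0, 0), (4.4, 0, 0)]

# Count the candidate pairs produced for three bin widths, using center 0 as "first".
counts = {}
for width in (1.0, 2.0, 5.0):
    pairs = candidate_pairs(bin_by_distance(ligand, 0, width),
                            bin_by_distance(receptor, 0, width))
    counts[width] = len(pairs)
```

With these coordinates the number of candidate pairs grows from 2 to 3 to 9 as the bin width increases from 1.0 to 2.0 to 5.0, which is the trade-off between thoroughness and CPU time that the user controls.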

DOCK 3 Series

The DOCK 3 Series is coded mostly in Fortran, with some C.

Version 3.0 - circa 1992 - 1994

Authors: Elaine Meng, Brian Shoichet, Irwin Kuntz

Version 3.0 retained the matching features of version 2.0, and introduced options for scoring (Meng, et al. J. Comp. Chem., 1992). Besides the simple contact scores mentioned above, one can also obtain molecular mechanics interaction energies using grid files calculated by CHEMGRID (which is now superseded by GRID in version 4.0). More information about the ligand and receptor molecules is required to perform these higher-level kinds of scoring. Point charges on the receptor and ligand atoms are needed for electrostatic scoring, and atom-type information is needed for the van der Waals portion of the force field score. Input formats (some of them new in version 3.5) are discussed in various parts of the documentation; one example of a “complete format” (including point charges and atom type information) is SYBYL MOL2 format. Parameterization of the receptor is discussed in the documentation for CHEMGRID. In DOCK, ligand parameters are read in along with the coordinates; input formats are described below. Currently, the options are: contact scoring only, contact scoring plus Delphi electrostatic scoring, and contact scoring plus force field scoring. Atom-type information and point charges are not required for contact scoring only.
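To show why point charges are needed for electrostatic scoring, here is a minimal sketch of a grid-based score: each ligand atom's charge is multiplied by a precomputed receptor potential read at the nearest grid point. The grid layout (a dict keyed by integer indices), the nearest-point lookup, and the function names are assumptions made for illustration. They do not reproduce the CHEMGRID or Delphi file formats or the interpolation DOCK actually uses.

```python
def nearest_grid_index(position, origin, spacing):
    """Map a 3D position to the integer index of the nearest grid point."""
    return tuple(round((p - o) / spacing) for p, o in zip(position, origin))


def electrostatic_score(ligand_atoms, potential_grid, origin, spacing):
    """Sum charge * potential over ligand atoms.

    ligand_atoms is a list of (position, charge) tuples. potential_grid maps
    integer (i, j, k) indices to a precomputed receptor potential. An atom
    outside the grid raises KeyError in this sketch.
    """
    total = 0.0
    for position, charge in ligand_atoms:
        index = nearest_grid_index(position, origin, spacing)
        total += charge * potential_grid[index]
    return total


# Hypothetical two-point grid with 0.5 spacing and a two-atom ligand.
grid = {(0, 0, 0): -2.0, (2, 0, 0): 1.0}
atoms = [((0.1, 0.0, 0.0), 0.5), ((0.9, 0.1, 0.0), -0.25)]
score = electrostatic_score(atoms, grid, origin=(0.0, 0.0, 0.0), spacing=0.5)
```

Precomputing the receptor potential on a grid means each orientation is scored with one lookup per ligand atom instead of a sum over all receptor atoms. A contact-only score needs neither the charges nor the grid.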

Version 3.5 - circa 1993-1998

Authors: Mike Connolly, Daniel Gschwend, Andy Good, Connie Oshiro, Irwin Kuntz

Version 3.5 added several features:

  • core optimization
  • degeneracy checking
  • chemical matching
  • critical clustering.

Version 3.5.54 - circa 1998 - 2010

Authors: David M. Lorber and Brian K. Shoichet

From 1994-2002, DOCK 3.5 was developed in the group of Brian Shoichet, first at the University of Oregon, then at Northwestern University Medical School, then at UCSF. The major author of the changes was David Lorber. BinQing Wei and other lab members also contributed.

The main innovations were:

  • ligands represented as a hierarchical ensemble of pre-computed conformations
  • ligand charges and desolvation energies computed with AMSOL
  • electrostatic scoring grid computed with Delphi.

DOCK 3.5.54 remained largely unchanged except for minor bugfixes from 2003-2008.

  • Brian Shoichet wrote solvmap.
  • Michael Mysinger wrote code to read gzipped database files.
  • Niu Huang wrote flexible water handling code, based on BinQing Wei's flexible receptor code. This has not been incorporated into the germ line version of DOCK 3.

Version 3.6 - 2010 to 2015

Authors: Michael Mysinger, Michael Carchia, Ryan Coleman, Brian Shoichet

From 2008 to 2010, a number of changes were made and rolled up into DOCK 3.6, which was released in May 2010.

  • Michael Mysinger (MMM) improved the handling of ligand desolvation maps.
  • Mike Carchia sped up the already fast code by 3X to 5X using compiler optimizations, data duplication, and other techniques.
  • Ryan Coleman improved the handling of ligand hierarchies by improving the algorithms in mol2db and dock for sampling ligands. Additional work was put into clash-checking ligands, checking to make sure the ligands do not fall outside of the grid boundaries, and general code cleanup.

There was overlap between 3.6 and 3.7: both were used in 2013, 2014, and 2015.

Version 3.7 - 2013 to 2021

Major author: Ryan Coleman

Paper: Coleman et al.

DOCK 3.7 was the main version in the DOCK 3 series of docking programs for 8 years. It was released in 2013.

The current status of the DOCK 3.X series is that DOCK 3.7 is used in production and DOCK 3.8 will start taking over in 2021. DOCK 3.7 will eventually be retired. At the moment (Jan 2021), both versions are in use in the lab, but DOCK 3.7 is favored; that should reverse by May 2021. DOCK 3.7 works with ZINC15/20. It can read ZINC22 but cannot do the strain calculation.

Version 3.8 - 2021 to present

Major contributors: Jiankun Lyu, Trent Balius, Ben Tingle

DOCK 3.8 is the current development branch in the lab. It builds on DOCK 3.7 and contains four innovations:

  • an internal strain energy calculation is now built into the database ZINC22 and used in the DOCK score
  • many new scripts have been added or extended to simplify the process
  • DOCK 3.8 can be interrupted and restarted, allowing it to be used more efficiently on AWS and shared clusters.
  • DOCK 3.8 can read both ZINC22 and ZINC15/20 databases.

DOCK 4 Series

Version 4.0 - circa 1997 to present

Authors: Todd Ewing, Irwin Kuntz

Version 4.0 was a major rewrite and update of DOCK in the C language. A new matching engine was developed which is more robust, efficient, and easier to use (Ewing, et al. J. Comput. Chem. 1997). Orientational sampling can now be controlled directly by specifying the number of desired orientations. Additional features include chemical scoring, chemical screening, and ligand flexibility.

If anyone knows what 4.0.1 and 4.0.2 are, please contribute. The current version available from the website is 4.0.2.

DOCK 5 and 6 Series

These programs were rewritten, largely in C++.

Version 5.0-5.4 - circa 2000 - 2006

Authors: Demetri Moustakas, P. Therese Lang, Scott Pegg, Scott Brozell, Irwin Kuntz

Version 5 was rewritten in C++ in a modular format, which allows for easy implementation of new scoring functions, sampling methods, and analysis tools (Moustakas, et al., 2006). Additional new features include MPI parallelization, exhaustive orientation searching, improved conformation searching, GB/SA solvation scoring, and post-screening pose clustering (Zou, et al. J. Am. Chem. Soc., 1999). DOCK 5 is entirely superseded by DOCK 6.

Version 6.0 - circa 2006 - 2007

DOCK 6 is an extension of the DOCK 5 code base. It includes the implementation of Hawkins-Cramer-Truhlar GB/SA solvation scoring with salt screening and PB/SA solvation scoring through OpenEye's Zap Library. Additional flexibility has been added to scoring options during minimization. The new code also incorporates DOCK 3.5.54 scoring features like Delphi electrostatics, ligand desolvation, and receptor desolvation. Finally, DOCK 6 introduces new code that allows access to the NAB library of functions such as receptor flexibility, the full AMBER molecular mechanics scoring function with implicit solvent, conjugate gradient minimization, and molecular dynamics simulation capabilities.

Versions 6.1, 6.2, 6.3 were released, but they all had significant liabilities.

Version 6.4 - circa 2010 - 2011

DOCK 6.4 was released in May 2010 and fixed many significant bugs. It also introduced numerous new features. It has since been followed by versions 6.5 through 6.9. The full release notes are on the DOCK website.

The release notes cover the following areas: internal energy, growth tree and statistics, database filter, pre-minimization, restrained minimization, miscellaneous changes, bug fixes, and deprecated features.

Version 6.5 - circa 2011 - 2012

  • new anchor controls
  • new scoring functions
  • AMBER score improvements
  • PB/SA score improvements

Version 6.6 - circa 2012 - 2014

  • new scoring function
  • orienting improvements
  • Hungarian RMSD

Version 6.7 - circa 2014 - 2017

  • new input parameter default values
  • incremental changes

Version 6.8 - circa 2017 - 2019

  • new scoring functions: pharmacophore and descriptor score
  • internal energy scoring function

Version 6.9 - circa 2020 - present

  • supports de novo design ligand building
  • supports library generation
  • supports multi-grid footprint
  • amber_score as secondary score currently deprecated

DOCK 6.9 is under active development in the lab of Robert Rizzo at SUNY Stony Brook with help from investigators at other sites.