http://wiki.docking.org/api.php?action=feedcontributions&user=Rgc&feedformat=atomDISI - User contributions [en]2024-03-29T11:07:30ZUser contributionsMediaWiki 1.39.1http://wiki.docking.org/index.php?title=Pka&diff=7895Pka2014-06-05T10:48:59Z<p>Rgc: </p>
<hr />
<div>'''Experimental''' pKa values<br />
<br />
Please add any useful lists of experimentally determined pKa values you know of to the list:<br />
<br />
http://www.chem.wisc.edu/areas/reich/pkatable/index.htm<br />
<br />
http://pubs.acs.org/doi/suppl/10.1021/ci100019p<br />
<br />
http://evans.harvard.edu/pdf/evans_pKa_table.pdf<br />
<br />
http://drugmet.rilspace.org/wiki/All_pKa_values<br />
<br />
'''Computational''' pKa programs<br />
<br />
http://ibmlc2.chem.uga.edu/sparc/index.cfm - this site can do multiple things, so you have to select the "pKa" button first, then draw the molecule or insert a SMILES string, and finally press "calculate". It also has a database of known values ("Search DB"). <br />
<br />
ChemAxon's (http://www.chemaxon.com/) Marvin/Calculator plugins have a pKa calculation option that looks quite elaborate. You can try it online. Can't say how good it is yet. Academics can apply for a free academic license. The lab copy is /raid3/software/jchem/current/bin/mview; you have to cd to that directory and then run it.<br />
<br />
Note: With Marvin's pKa tool, it is sometimes helpful to change the "min basic pKa" to a higher value like -2 instead of the default -10 so you can see all the protonation states.<br />
<br />
Marvin is now included in the new dockenv, so you can just type mview in your terminal. <br />
<br />
Add your favorites.<br />
<br />
[[Category:Software]]</div>Rgchttp://wiki.docking.org/index.php?title=Multimol2db2.py&diff=5711Multimol2db2.py2014-02-07T21:14:06Z<p>Rgc: Created page with "'''multimol2db2.py''' This script is a utility program that takes as input a .mol2 file that has been protonated and all conformations have been generated with OMEGA (or alte..."</p>
<hr />
<div>'''multimol2db2.py'''<br />
<br />
This script is a utility program that takes as input a .mol2 file that has already been protonated and had all of its conformations generated with OMEGA (or, alternatively, a mol2 file from some other source) and runs AMSOL and mol2db2 on it to make .db2 files for docking.<br />
<br />
multimol2db2.py input.mol2<br />
<br />
It is very important that the beginning of your .mol2 file contains this kind of header:<br />
<br />
@<TRIPOS>MOLECULE<br />
TEMP12345678<br />
70 72 0 0 0<br />
SMALL<br />
NO_CHARGES<br />
<br />
@<TRIPOS>ATOM<br />
1 C1 5.1180 4.5740 2.9690 C.3 1 UNK1 0.0182<br />
2 N1 4.4470 5.0610 4.2130 N.4 1 UNK1 -0.5553<br />
<br />
Otherwise AMSOL and the associated scripts that run it will crash. The most important part is the second line, which needs to be in the form XXXX00000000 (as in TEMP12345678 above).<br />
<br />
Other tips:<br />
<br />
1. Don't have any extra lines other than the MOLECULE, ATOM and BOND records.<br />
<br />
2. The last 3 columns of the atom record (the 1 UNK1 0.0182) are important. Some .mol2 files don't have them, so just add dummy values (1 UNK1 0.0000) to yours.<br />
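<br />
The header and atom-record requirements above can be checked before running the script. The following is only a minimal sketch and is not part of multimol2db2.py: the function names are hypothetical, and it assumes that "XXXX00000000" means four letters followed by eight digits (as in TEMP12345678) and that an atom record missing the last 3 columns has exactly 6 fields.<br />
<br />
```python
import re

# Assumption: "XXXX00000000" means four letters followed by eight digits,
# as in the TEMP12345678 example; the real scripts may accept other names.
NAME_PATTERN = re.compile(r"^[A-Za-z]{4}\d{8}$")


def check_molecule_name(lines):
    """Return True if the line after @<TRIPOS>MOLECULE matches NAME_PATTERN."""
    for i, line in enumerate(lines):
        if line.strip() == "@<TRIPOS>MOLECULE":
            has_name = i + 1 < len(lines)
            return has_name and bool(NAME_PATTERN.match(lines[i + 1].strip()))
    return False  # no MOLECULE record found at all


def pad_atom_line(line):
    """Append dummy '1 UNK1 0.0000' columns to an ATOM record that lacks them."""
    fields = line.split()
    if len(fields) == 6:  # only id, name, x, y, z and atom type are present
        return line.rstrip() + "  1 UNK1 0.0000"
    return line.rstrip()  # anything else is left untouched
```
<br />
For example, pad_atom_line("1 C1 5.1180 4.5740 2.9690 C.3") returns the same record with the dummy columns appended, while a record that already has 9 fields is returned unchanged.<br />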
<br />
The file is in your dockenv/src/mol2db2/ or $DOCK_BASE/src/mol2db2<br />
<br />
A version is also kept in ~/Source/rgc_src/multimol2db2.py<br />
<br />
[[User:Rgc]]</div>Rgchttp://wiki.docking.org/index.php?title=Dock3.7&diff=5614Dock3.72013-10-22T21:11:35Z<p>Rgc: </p>
<hr />
<div>==DOCK3.7==<br />
<br />
DOCK3.7 is a new version of DOCK, with new accessory tools for protein & ligand preparation as well. The website for download will eventually be: http://dock.compbio.ucsf.edu/DOCK3.7/<br />
<br />
The paper citation is [http://www.plosone.org/article/info:doi/10.1371/journal.pone.0075992 Coleman PLOS ONE 2013]<br />
<br />
The citation for flexible docking with DOCK3.7 will be Fischer, Coleman, Fraser & Shoichet 2013; this will be updated upon acceptance.<br />
<br />
''Ligand Preparation''<br />
<br />
Ligand preparation has been modified to use mol2db2 instead of mol2db for database generation. Many other features have also been integrated. To build a set of ligands from SMILES on the cluster, use:<br />
<br />
db2start.e.csh input.smi ref<br />
<br />
Or to build on a standalone machine, use <br />
<br />
db2gen.e.csh input.smi ref<br />
<br />
Note that many programs must be properly installed and available or these scripts will fail. The most troublesome is EPIK. For this reason, among others, Dahlia Weiss has helped get ChemAxon's Marvin cxcalc running in lieu of EPIK. This is probably the preferred way to build molecules. Run it on the cluster with:<br />
<br />
db2start.e.cxcalc.csh input.smi<br />
<br />
The input file here is a two-column file, with one column holding a SMILES string and the other an ID. IDs of any length are valid, but only 16 characters will get carried into the DOCKing phase of the operation; see [[Mol2db2_Format_2]] for more details.<br />
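<br />
Because only 16 characters of each ID are carried forward, long IDs can become indistinguishable after docking. The sketch below is a hypothetical pre-flight check, not one of the DOCK3.7 scripts. It assumes the SMILES is in the first column and the ID in the second, and that the 16 characters kept are the first 16; neither is stated on this page.<br />
<br />
```python
def check_smiles_ids(lines, max_len=16):
    """Report malformed lines, over-long IDs, and IDs that collide once truncated.

    Assumptions (not confirmed by the page): column 1 is the SMILES, column 2
    is the ID, and truncation keeps the first max_len characters of the ID.
    """
    malformed = []   # 1-based line numbers that do not have exactly two columns
    too_long = []    # IDs longer than max_len
    collisions = []  # pairs of distinct IDs sharing the same truncated form
    seen = {}        # truncated ID -> first full ID that produced it
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        fields = line.split()
        if len(fields) != 2:
            malformed.append(number)
            continue
        mol_id = fields[1]
        if len(mol_id) > max_len:
            too_long.append(mol_id)
        short = mol_id[:max_len]
        if short in seen and seen[short] != mol_id:
            collisions.append((seen[short], mol_id))
        seen.setdefault(short, mol_id)
    return malformed, too_long, collisions
```
<br />
Calling it as check_smiles_ids(open("input.smi")) before submitting the build gives a chance to rename IDs that would otherwise be merged.<br />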
<br />
Once the jobs have finished, build dockable name-XXXXXX.db2.gz files by running:<br />
<br />
 db2end-prefix.py name<br />
<br />
''Protein Target Preparation''<br />
<br />
be-blasti is still the preferred method of downloading PDB files and splitting them into rec.pdb and xtal-lig.pdb files. Run it with<br />
<br />
be_blasti.csh filename<br />
<br />
The filename argument should name a file that contains the PDB codes you want to download.<br />
<br />
Once you have a rec.pdb (representing the protein) and an xtal-lig.pdb (representing the ligand or a set of atoms in the binding site of the protein), you can run blastermaster.py, the new version of DOCK Blaster. These file names can be changed as well; see the options. Try running the help to see the extensive options:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -h<br />
<br />
A typical way of running it is:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -v<br />
<br />
-v gives you verbose output, which can be helpful if something goes wrong. If everything is successful, you'll see this at the end of the output:<br />
<br />
copying matching_spheres.sph into dockfiles<br />
copying trim.electrostatics.phi into dockfiles<br />
copying ligand.desolv.hydrogen into dockfiles<br />
copying ligand.desolv.heavy into dockfiles<br />
copying vdw.bmp into dockfiles<br />
copying vdw.vdw into dockfiles<br />
copying vdw.parms.amb.mindock into dockfiles<br />
writing INDOCK file: INDOCK<br />
<br />
Otherwise something went wrong. Notice that an INDOCK file has been written for you, with many defaults set that you may want to change. Also, old INDOCK files are slightly incompatible with the new format, so you should consult the version written for you, or take a closer look at the page on the [[DOCK3.7 INDOCK]] file.<br />
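<br />
Instead of reading the log by eye, the same success condition can be tested by looking for the files named in the output above. This is a minimal sketch under assumptions: the helper name is hypothetical, and it assumes that dockfiles is a subdirectory of the directory where blastermaster.py was run and that INDOCK is written to that same directory.<br />
<br />
```python
import os

# File names copied from the blastermaster.py log lines shown above.
EXPECTED_DOCKFILES = [
    "matching_spheres.sph",
    "trim.electrostatics.phi",
    "ligand.desolv.hydrogen",
    "ligand.desolv.heavy",
    "vdw.bmp",
    "vdw.vdw",
    "vdw.parms.amb.mindock",
]


def missing_outputs(workdir="."):
    """Return the expected output files that are absent under workdir.

    Assumption: dockfiles/ and INDOCK both live directly in workdir.
    """
    missing = [
        name
        for name in EXPECTED_DOCKFILES
        if not os.path.isfile(os.path.join(workdir, "dockfiles", name))
    ]
    if not os.path.isfile(os.path.join(workdir, "INDOCK")):
        missing.append("INDOCK")
    return missing
```
<br />
An empty list from missing_outputs() suggests the preparation finished; any names returned point to the step that failed.<br />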
<br />
''Running DOCK''<br />
<br />
Setting up a shortcut for the dock37tools directory in $DOCK_BASE/src/dock37tools/ is highly recommended, though you don't need it if you don't want it. From here on, assume you have it set as $d37 with a command like this:<br />
<br />
setenv d37 $DOCK_BASE/src/dock37tools/<br />
<br />
In that directory, there are 3 scripts for setting up a dock run. The first script sets up a docking run in which each file is assigned to a separate node, which is fine for quick jobs.<br />
<br />
$d37/setup_db2.csh /full/path/to/db2/files/<br />
<br />
Another script, useful for testing on DUD-E targets, is:<br />
<br />
$d37/setup_db2_own.csh /full/path/to/db2/files/<br />
<br />
The third script, useful for running many db2.gz files, sometimes with a few files grouped together, such as for prospective screening against the lead-like set, is:<br />
<br />
$d37/setup_db2_lots.py desiredDirectoryCount prefixName /full/path/to/db2/files/<br />
<br />
After setting up any of these runs, you can run them with the following:<br />
<br />
$d37/submit.csh<br />
<br />
or $d37/subdock.csh /path/to/dock.csh if you have compiled your own version of DOCK3.7.<br />
<br />
Runs should proceed on the cluster. Problems will show up in the stderr files; further diagnosis can be attempted by looking at the various OUTDOCK files produced. If you've used lots of sampling, expect slower results. If you've asked for hundreds of poses, expect large files. You should not combine prospective screening, hundreds of poses and high sampling.<br />
<br />
If a few jobs crash (this shouldn't happen, but anything is possible) and you need to complete them, run:<br />
<br />
$d37/restart.py -f<br />
<br />
''Analyzing DOCK results''<br />
<br />
The dock37tools in $d37 contain various analysis programs. Once your jobs are done, you can run:<br />
<br />
$d37/extract_all.py<br />
<br />
This may take a while, but it will pull all your results into a single file. If you want to calculate enrichment:<br />
<br />
 $d37/enrich.py -l ligand-file -d decoy-file <br />
<br />
Here the ligand-file and decoy-file are single-column files with the ligand and decoy IDs on individual lines. Plotting is also possible:<br />
<br />
$d37/plots.py -i . -l label --l=ligand-file -d decoy-file <br />
<br />
Common usage is to plot several different runs on a single plot like so:<br />
<br />
$d37/plots.py -i run1 -l label1 -i run2 -l label2 --l=ligand-file -d decoy-file <br />
<br />
If you want to compare the scores from two runs, try:<br />
<br />
$d37/two_run_plot.py run1 run2<br />
<br />
Of course, the plotting scripts must be run on a machine with the proper libraries installed, like sgehead.<br />
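<br />
As a rough illustration of what an enrichment calculation over the ligand and decoy ID files involves, the sketch below computes the area under the ROC curve from a list of IDs ranked best score first. It is not the implementation in enrich.py, whose exact metrics and output are not described on this page; the function name and the input layout are assumptions made for the example.<br />
<br />
```python
def roc_auc(ranked_ids, ligand_ids, decoy_ids):
    """Area under the ROC curve for IDs ordered best score first.

    Only IDs present in ranked_ids are counted; IDs that are neither
    ligands nor decoys are ignored. Illustrative only, not enrich.py.
    """
    ligands, decoys = set(ligand_ids), set(decoy_ids)
    ligands_found = 0
    decoys_found = 0
    area = 0
    for mol_id in ranked_ids:
        if mol_id in ligands:
            ligands_found += 1
        elif mol_id in decoys:
            decoys_found += 1
            # Each decoy adds one column whose height is the number of
            # ligands already ranked above it.
            area += ligands_found
    if ligands_found == 0 or decoys_found == 0:
        raise ValueError("need at least one ranked ligand and one ranked decoy")
    return area / (ligands_found * decoys_found)
```
<br />
A value of 1.0 means every ligand outranks every decoy, 0.5 corresponds to a random ordering, and 0.0 means every decoy outranks every ligand.<br />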
<br />
Another common use is to look at top poses in the ViewDock module of UCSF Chimera or with PyMOL. You can make a mol2 output file that can be read by these programs with the following command:<br />
<br />
$d37/getposes.py<br />
<br />
The defaults on this script are to make a poses.mol2 file with the top 500 poses from the entire run, with a single pose per molecule ID. There are many options which can be seen with the "-h" flag. A more complex example is:<br />
<br />
$d37/getposes.py -z -l 1000 -x 2 -f ligands.txt -o ligands.1000.mol2<br />
<br />
In order, the '-z' flag connects to ZINC for vendor information, the "-l 1000" flag only gets the first 1000 ligands in the file, '-x 2' gets the top 2 poses, the '-f ligands.txt' file designates the ligand file to use and '-o ligands.1000.mol2' designates the output filename.</div>Rgchttp://wiki.docking.org/index.php?title=Chemdraw_figure_preparation&diff=5540Chemdraw figure preparation2013-06-26T18:33:00Z<p>Rgc: Created page with "Chemdraw figures. First, kekulize your SMILES (remove aromatic groups). convert.py --i=prospective_tries.smiles --o=temp.smi --smioKekule Now, copy/paste each SMILES into ..."</p>
<hr />
<div>Chemdraw figures.<br />
<br />
First, kekulize your SMILES (convert aromatic atoms and bonds to explicit alternating single and double bonds).<br />
<br />
convert.py --i=prospective_tries.smiles --o=temp.smi --smioKekule<br />
<br />
Now, copy/paste each SMILES into ChemDraw (or, if ChemDraw doesn't let you, Marvin). If using Marvin, save as a ChemDraw file, reopen it with ChemDraw, copy/paste into the journal template, save again, and save a PDF to put into the paper text as well.<br />
<br />
Fun stuff.</div>Rgchttp://wiki.docking.org/index.php?title=DOCK3.7_INDOCK&diff=5397DOCK3.7 INDOCK2013-03-28T04:42:51Z<p>Rgc: </p>
<hr />
<div>This page describes the updates to the [[DOCK3.7]] INDOCK file, alongside a sample file. blastermaster.py described on the [[Dock3.7]] page writes an INDOCK file for you, which you can modify.<br />
<br />
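The MATCHING parameters in the sample file describe an adaptive search: start at distance_tolerance, widen by distance_step, and stop once match_goal orientations have been found or distance_maximum is reached. The sketch below is one reading of those comments, not DOCK's actual code; count_matches is a hypothetical stand-in for the matching step, and the default values are the ones from the sample file.<br />
<br />
```python
def tolerance_schedule(start=0.05, step=0.05, maximum=0.5):
    """Yield the tolerances tried: start, start + step, ... up to maximum."""
    n = 0
    while True:
        tol = round(start + n * step, 6)  # rounding avoids float drift
        if tol > maximum:
            break
        yield tol
        n += 1


def adaptive_match(count_matches, goal=5000, **schedule):
    """Widen the tolerance until count_matches(tol) reaches goal.

    count_matches is a hypothetical callable returning the number of
    orientations found at a given tolerance. Returns the last tolerance
    tried and the count obtained there.
    """
    tol, found = None, 0
    for tol in tolerance_schedule(**schedule):
        found = count_matches(tol)
        if found >= goal:
            break
    return tol, found
```
<br />
With the defaults, at most ten tolerances (0.05 through 0.5) are tried, which is why a larger match_goal or distance_maximum means more sampling and longer run times per ligand.<br />
<br />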
DOCK 3.7 parameter<br />
##################################################### <br />
# NOTE: split_database_index is reserved to specify a list of files<br />
ligand_atom_file split_database_index #standard for docking large databases<br />
#####################################################<br />
# OUTPUT<br />
output_file_prefix test. #default, but can be changed<br />
#####################################################<br />
# MATCHING<br />
match_method 2 #1 matches up to the distance_tolerance below, ignoring match_goal, step, maximum, etc. <br />
#2 uses the adaptive sampling that attempts to get a number of match_goal orientational samples <br />
distance_tolerance 0.05 #starting distance tolerance<br />
match_goal 5000 #desired number of orientational samples to get before quitting under match_method = 2<br />
distance_step 0.05 #increment from distance_tolerance until max or match_goal is reached<br />
distance_maximum 0.5 #biggest tolerance that will be used to attempt to get match_goal orientational samples<br />
timeout 10.0 #number of seconds before quitting on any given ligand<br />
nodes_maximum 4 #max number of points for which all distances must be within the tolerance. 3 possible, 4 suggested.<br />
nodes_minimum 4 #min number of points for which all distances must be within the tolerance. 4 suggested, 3 possible.<br />
bump_maximum 50.0 #van der Waals score in kcal/mol for any part of the molecule to get before further examination stopped<br />
bump_rigid 50.0 #van der Waals score in kcal/mol for the rigid component of the ligand molecule, if above, discarded<br />
#####################################################<br />
# COLORING<br />
chemical_matching no #default to off, can use chemical matching from DOCK3.6 if desired<br />
case_sensitive no #case sensitivity for chemical matching groups<br />
#####################################################<br />
# SEARCH MODE<br />
atom_minimum 4 #minimum number of atoms in ligand for it to be scored<br />
atom_maximum 100 #maximum number of atoms in ligand for it to be scored<br />
number_save 1 #how many poses to save. Any number of poses can be saved, but disk space is a factor!<br />
molecules_maximum 100000 #how many molecules will be searched before quitting. raise if databases are large.<br />
#####################################################<br />
# SCORING<br />
ligand_desolvation volume #use GB desolvation scoring, other options are full or none<br />
vdw_maximum 1.0e10 #maximum vdw score possible, prevents overflow<br />
electrostatic_scale 1.0 #scaling factors to be applied to scores, likely not to be trifled with<br />
vdw_scale 1.0 #again, scales the entire vdw score.<br />
internal_scale 0.0 #scales an internal focusing term. set this to 0 as this doesn't work at all/isn't implemented<br />
per_atom_scores no #change to yes if per-atom scoring breakdowns desired. note that this doubles output size.<br />
#####################################################<br />
# INPUT FILES / THINGS THAT CHANGE<br />
receptor_sphere_file ../dockfiles/matching_spheres.sph #receptor matching spheres file following age old SPH format<br />
vdw_parameter_file ../dockfiles/vdw.parms.amb.mindock #vdw parameter file.<br />
flexible_receptor no #describing only single receptor file for now<br />
total_receptors 1 <br />
############## grids/data for one receptor<br />
rec_number 1<br />
rec_group 1<br />
rec_group_option 1<br />
solvmap_file ../dockfiles/ligand.desolv.heavy #GB-based desolvation maps <br />
hydrogen_solvmap_file ../dockfiles/ligand.desolv.hydrogen<br />
delphi_file ../dockfiles/trim.electrostatics.phi #electrostatics map, size must be declared with delphi_nsize below<br />
chemgrid_file ../dockfiles/vdw.vdw #vdw grid file, contains vdw scores<br />
bumpmap_file ../dockfiles/vdw.bmp #vdw bump file, only used for header data for chemgrid_file <br />
############## end of INDOCK<br />
delphi_nsize 47 #size of electrostatics grid (cubic). blastermaster.py trims to the minimum size necessary to save memory.</div>Rgchttp://wiki.docking.org/index.php?title=Dock3.7&diff=5394Dock3.72013-03-27T17:13:52Z<p>Rgc: </p>
<hr />
<div>==DOCK3.7==<br />
<br />
DOCK3.7 is a new version of DOCK, with new accessory tools for protein & ligand preparation as well. The website for download will eventually be: http://dock.compbio.ucsf.edu/DOCK3.7/<br />
<br />
The paper citation will be Coleman, Irwin & Shoichet 2013 once accepted we will update this page.<br />
<br />
The citation for flexible docking with DOCK3.7 will be Fischer, Coleman, Fraser & Shoichet 2013, again this will be updated upon acceptance.<br />
<br />
''Ligand Preparation''<br />
<br />
Ligand preparation has been modified to use mol2db2 instead of mol2db for database generation. Many other features have also been integrated. To build a set of ligands from SMILES on the cluster, use:<br />
<br />
db2start.e.csh input.smi ref<br />
<br />
Or to build on a standalone machine, use <br />
<br />
db2gen.e.csh input.smi ref<br />
<br />
Note that many programs must be properly installed and available or this script will fail. The most troublesome is EPIK. For this reason, among others, Dahlia Weiss has helped get Marvin's Chemaxon cxcalc running in lieu of EPIK. This is probably the preferred way to build molecules. Run it on the cluster with:<br />
<br />
db2start.e.cxcalc.csh input.smi<br />
<br />
The format of the input file here is a two column file with one column being a SMILES string and the other column being an ID. Any length IDs are valid, but only 16 characters will get carried into the DOCKing phase of the operation, see [[Mol2db2_Format_2]] for more details.<br />
<br />
Once the jobs have finished, you can run <br />
<br />
db2end-prefix.py name<br />
<br />
To build dockable name-XXXXXX.db2.gz files.<br />
<br />
''Protein Target Preparation''<br />
<br />
be-blasti is still the preferred method of downloading PDB files and splitting them into rec.pdb and xtal-lig.pdb files. Run it with<br />
<br />
be_blasti.csh filename<br />
<br />
The filename should be a file that contains PDB codes you want to download.<br />
<br />
Once you get a rec.pdb (representing the protein) and xtal-lig.pdb (representing the ligand or a set of atoms in the binding site of the protein) (these names can be changed as well, see options), you can run blastermaster.py, the new version of DOCK Blaster. Try running the help to see the extensive options:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -h<br />
<br />
A typical way of running it is to just run it as:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -v<br />
<br />
-v gives you verbose output, which can be helpful if something goes wrong. If everything is successful, you'll see this at the end of the file:<br />
<br />
copying matching_spheres.sph into dockfiles<br />
copying trim.electrostatics.phi into dockfiles<br />
copying ligand.desolv.hydrogen into dockfiles<br />
copying ligand.desolv.heavy into dockfiles<br />
copying vdw.bmp into dockfiles<br />
copying vdw.vdw into dockfiles<br />
copying vdw.parms.amb.mindock into dockfiles<br />
writing INDOCK file: INDOCK<br />
<br />
Otherwise something went wrong. Notice that you have an INDOCK file written for you, with many defaults set that you may want to change. Also, old INDOCK files are slightly incompatible with the new files, so you should consult the changed version written for you, or take a finer look at the page on the [[DOCK3.7 INDOCK]] file.<br />
<br />
''Running DOCK''<br />
<br />
Setting up an alias for the dock37tools directory in $DOCK_BASE/src/dock37tools/ is highly recommended, though you don't need it if you don't want it. Assume you have it set as $d37 from here on out with a command line this<br />
<br />
setenv d37 $DOCK_BASE/src/dock37tools/<br />
<br />
In that directory, there are 3 scripts for setting up a dock run. The first script will send setup a docking run where each file will be relegated to a separate node, fine for quick jobs.<br />
<br />
$d37/setup_db2.csh /full/path/to/db2/files/<br />
<br />
Another script, useful for testing on DUD-E targets, is:<br />
<br />
$d37/setup_db2_own.csh /full/path/to/db2/files/<br />
<br />
A file script, useful for running many db2.gz files, sometimes with a few files grouped together, like for prospective screening against lead-like is:<br />
<br />
$d37/setup_db2_lots.py desiredDirectoryCount prefixName /full/path/to/db2/files/<br />
<br />
After setting up any of these runs, you can run them with the following:<br />
<br />
$d37/submit.csh<br />
<br />
or $d37/subdock.csh /path/to/dock.csh if you have compiled your own version of DOCK3.7.<br />
<br />
Runs should proceed on the cluster. Problems will show up in the stderr files, further diagnosis can be attempted by looking at the various OUTDOCK files produced. If you've used lots of sampling, expect slower results. If you've asked for hundreds of poses, expect large files. You should not combine prospective screening, hundreds of poses and high sampling.<br />
<br />
If a few jobs crash (shouldn't happen but anything is possible) and you need to complete them, run<br />
<br />
$d37/restart.py -f<br />
<br />
''Analyzing DOCK results''<br />
<br />
The dock37tools in $d37 contain various analysis programs. Once your jobs are done, you can run:<br />
<br />
$d37/extract_all.py<br />
<br />
This may take awhile but it will pull all your results into a single file, etc. If you want to calculate enrichment, etc.:<br />
<br />
$d37/enrich.py -l ligand-file -d decoy-file <br />
<br />
Where the ligand-file and decoy-file are single column files with the ligand and decoy IDs on individual lines. Plotting is also possible<br />
<br />
$d37/plots.py -i . -l label --l=ligand-file -d decoy-file <br />
<br />
Common usage is to plot several different runs on a single plot like so:<br />
<br />
$d37/plots.py -i run1 -l label1 -i run2 -l label2 --l=ligand-file -d decoy-file <br />
<br />
If you want to compare the scores from two runs, try:<br />
<br />
$d37/two_run_plot.py run1 run2<br />
<br />
Of course, the plots must be run on a machine with the proper libraries installed, like sgehead.<br />
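<br />
The single-column ligand and decoy ID files used by enrich.py and plots.py above can be produced with any tool. As a minimal sketch, assuming the IDs are already available as Python lists, something like the following would write and re-read such a file. The helper names write_id_file and read_id_file are made up for illustration and are not part of dock37tools.<br />
<br />
```python
def write_id_file(path, ids):
    """Write one ID per line, skipping blank entries and repeated IDs."""
    seen = set()
    with open(path, "w") as handle:
        for mol_id in ids:
            mol_id = mol_id.strip()
            if mol_id and mol_id not in seen:
                seen.add(mol_id)
                handle.write(mol_id + "\n")


def read_id_file(path):
    """Read a single-column ID file back into a list, ignoring blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```
Dropping repeated IDs is a choice made in this sketch only; whether the analysis scripts tolerate duplicates is not stated here.<br />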
<br />
Another common use is to look at top poses in the ViewDock module of UCSF Chimera or with PyMOL. You can make a mol2 output file that can be read by these programs with the following command:<br />
<br />
$d37/getposes.py<br />
<br />
The defaults on this script are to make a poses.mol2 file with the top 500 poses from the entire run, with a single pose per molecule ID. There are many options which can be seen with the "-h" flag. A more complex example is:<br />
<br />
$d37/getposes.py -z -l 1000 -x 2 -f ligands.txt -o ligands.1000.mol2<br />
<br />
In order, the '-z' flag connects to ZINC for vendor information, the "-l 1000" flag only gets the first 1000 ligands in the file, '-x 2' gets the top 2 poses, the '-f ligands.txt' file designates the ligand file to use and '-o ligands.1000.mol2' designates the output filename.</div>Rgchttp://wiki.docking.org/index.php?title=Dock3.7&diff=5393Dock3.72013-03-27T17:12:39Z<p>Rgc: dock 3.7 user manual step 1</p>
<hr />
<div>==DOCK3.7==<br />
<br />
DOCK3.7 is a new version of DOCK, with new accessory tools for protein & ligand preparation as well. The website for download will eventually be: http://dock.compbio.ucsf.edu/DOCK3.7/<br />
<br />
The paper citation will be Coleman, Irwin & Shoichet 2013; we will update this page once it is accepted.<br />
<br />
The citation for flexible docking with DOCK3.7 will be Fischer, Coleman, Fraser & Shoichet 2013; again, this will be updated upon acceptance.<br />
<br />
''Ligand Preparation''<br />
<br />
Ligand preparation has been modified to use mol2db2 instead of mol2db for database generation. Many other features have also been integrated. To build a set of ligands from SMILES on the cluster, use:<br />
<br />
db2start.e.csh input.smi ref<br />
<br />
Or to build on a standalone machine, use <br />
<br />
db2gen.e.csh input.smi ref<br />
<br />
Note that many programs must be properly installed and available or this script will fail. The most troublesome is EPIK. For this reason, among others, Dahlia Weiss has helped get Marvin's Chemaxon cxcalc running in lieu of EPIK. This is probably the preferred way to build molecules. Run it on the cluster with:<br />
<br />
db2start.e.cxcalc.csh input.smi<br />
<br />
The input file here is a two-column file: one column is a SMILES string and the other is an ID. IDs of any length are valid, but only 16 characters are carried into the docking phase; see [[Mol2db2_Format_2]] for more details.<br />
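<br />
Because only 16 characters of each ID survive into docking, it can be worth checking an input file before submitting it. The following is a minimal sketch, not part of the DOCK tools: check_smiles_lines is a hypothetical helper, and it assumes the SMILES string comes first, the ID second, and that the columns are separated by whitespace.<br />
<br />
```python
MAX_ID_LENGTH = 16  # characters carried into docking, per the text above


def check_smiles_lines(lines):
    """Return (records, problems) for lines assumed to hold 'SMILES ID'."""
    records, problems = [], []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 2:
            problems.append("line %d: expected 2 columns, found %d"
                            % (number, len(fields)))
            continue
        smiles, mol_id = fields
        if len(mol_id) > MAX_ID_LENGTH:
            problems.append("line %d: ID %r is longer than %d characters"
                            % (number, mol_id, MAX_ID_LENGTH))
        records.append((smiles, mol_id))
    return records, problems
```
Long IDs are reported but still kept, since the text above says they are valid; only malformed lines are dropped from the returned records.<br />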
<br />
Once the jobs have finished, you can run <br />
<br />
db2end-prefix.py name<br />
<br />
This builds dockable name-XXXXXX.db2.gz files.<br />
<br />
''Protein Target Preparation''<br />
<br />
be-blasti is still the preferred method of downloading PDB files and splitting them into rec.pdb and xtal-lig.pdb files. Run it with<br />
<br />
be_blasti.csh filename<br />
<br />
The filename should be a file that contains PDB codes you want to download.<br />
<br />
Once you have a rec.pdb (representing the protein) and an xtal-lig.pdb (representing the ligand or a set of atoms in the binding site of the protein), you can run blastermaster.py, the new version of DOCK Blaster. These file names can be changed; see the options. Try running the help to see the extensive options:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -h<br />
<br />
A typical invocation is:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -v<br />
<br />
-v gives you verbose output, which can be helpful if something goes wrong. If everything is successful, you'll see this at the end of the file:<br />
<br />
copying matching_spheres.sph into dockfiles<br />
copying trim.electrostatics.phi into dockfiles<br />
copying ligand.desolv.hydrogen into dockfiles<br />
copying ligand.desolv.heavy into dockfiles<br />
copying vdw.bmp into dockfiles<br />
copying vdw.vdw into dockfiles<br />
copying vdw.parms.amb.mindock into dockfiles<br />
writing INDOCK file: INDOCK<br />
<br />
Otherwise something went wrong. Notice that an INDOCK file has been written for you, with many defaults set that you may want to change. Also, old INDOCK files are slightly incompatible with the new ones, so you should consult the version written for you, or take a closer look at the [[DOCK3.7 INDOCK]] page.<br />
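<br />
The success message above also gives a quick scripted check: each listed file should exist in the dockfiles directory afterwards. The following is a minimal sketch under stated assumptions, namely that blastermaster.py was run in run_dir, that dockfiles is a subdirectory of it, and that INDOCK is written directly into run_dir. missing_dockfiles is a hypothetical helper, not part of blastermaster.<br />
<br />
```python
import os

# File names taken from the "copying ... into dockfiles" lines above.
EXPECTED_DOCKFILES = [
    "matching_spheres.sph",
    "trim.electrostatics.phi",
    "ligand.desolv.hydrogen",
    "ligand.desolv.heavy",
    "vdw.bmp",
    "vdw.vdw",
    "vdw.parms.amb.mindock",
]


def missing_dockfiles(run_dir="."):
    """List expected outputs that are absent under run_dir."""
    dock_dir = os.path.join(run_dir, "dockfiles")
    missing = [name for name in EXPECTED_DOCKFILES
               if not os.path.isfile(os.path.join(dock_dir, name))]
    # Assumption: INDOCK sits next to the dockfiles directory.
    if not os.path.isfile(os.path.join(run_dir, "INDOCK")):
        missing.append("INDOCK")
    return missing
```
An empty list suggests the preparation finished; anything else points at the step to look for in the verbose output.<br />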
<br />
"Running DOCK"<br />
<br />
Setting up an alias for the dock37tools directory in $DOCK_BASE/src/dock37tools/ is highly recommended, though not required. From here on, assume it is set as $d37 with a command like this:<br />
<br />
setenv d37 $DOCK_BASE/src/dock37tools/<br />
<br />
In that directory, there are 3 scripts for setting up a dock run. The first script sets up a docking run where each file is assigned to a separate node, which is fine for quick jobs.<br />
<br />
$d37/setup_db2.csh /full/path/to/db2/files/<br />
<br />
Another script, useful for testing on DUD-E targets, is:<br />
<br />
$d37/setup_db2_own.csh /full/path/to/db2/files/<br />
<br />
A final script, useful for running many db2.gz files, sometimes with a few files grouped together, as in prospective screening against lead-like molecules, is:<br />
<br />
$d37/setup_db2_lots.py desiredDirectoryCount prefixName /full/path/to/db2/files/<br />
<br />
After setting up any of these runs, you can run them with the following:<br />
<br />
$d37/submit.csh<br />
<br />
or $d37/subdock.csh /path/to/dock.csh if you have compiled your own version of DOCK3.7.<br />
<br />
Runs should proceed on the cluster. Problems will show up in the stderr files; further diagnosis can be attempted by looking at the various OUTDOCK files produced. If you've used lots of sampling, expect slower results. If you've asked for hundreds of poses, expect large files. You should not combine prospective screening, hundreds of poses and high sampling.<br />
<br />
If a few jobs crash (shouldn't happen but anything is possible) and you need to complete them, run<br />
<br />
$d37/restart.py -f<br />
<br />
"Analyzing DOCK results"<br />
<br />
The dock37tools in $d37 contain various analysis programs. Once your jobs are done, you can run:<br />
<br />
$d37/extract_all.py<br />
<br />
This may take a while, but it will pull all your results into a single file. If you want to calculate enrichment, run:<br />
<br />
$d37/enrich.py -l ligand-file -d decoy-file <br />
<br />
Here ligand-file and decoy-file are single-column files with the ligand and decoy IDs on individual lines. Plotting is also possible:<br />
<br />
$d37/plots.py -i . -l label --l=ligand-file -d decoy-file <br />
<br />
Common usage is to plot several different runs on a single plot like so:<br />
<br />
$d37/plots.py -i run1 -l label1 -i run2 -l label2 --l=ligand-file -d decoy-file <br />
<br />
If you want to compare the scores from two runs, try:<br />
<br />
$d37/two_run_plot.py run1 run2<br />
<br />
Of course, the plots must be run on a machine with the proper libraries installed, like sgehead.<br />
<br />
Another common use is to look at top poses in the ViewDock module of UCSF Chimera or with PyMOL. You can make a mol2 output file that can be read by these programs with the following command:<br />
<br />
$d37/getposes.py<br />
<br />
The defaults on this script are to make a poses.mol2 file with the top 500 poses from the entire run, with a single pose per molecule ID. There are many options which can be seen with the "-h" flag. A more complex example is:<br />
<br />
$d37/getposes.py -z -l 1000 -x 2 -f ligands.txt -o ligands.1000.mol2<br />
<br />
In order, the '-z' flag connects to ZINC for vendor information, the "-l 1000" flag only gets the first 1000 ligands in the file, '-x 2' gets the top 2 poses, the '-f ligands.txt' file designates the ligand file to use and '-o ligands.1000.mol2' designates the output filename.</div>Rgchttp://wiki.docking.org/index.php?title=Dock3.7&diff=5392Dock3.72013-03-27T16:40:10Z<p>Rgc: not done yet, will be DOCK3.7 manual</p>
<hr />
<div>==DOCK3.7==<br />
<br />
DOCK3.7 is a new version of DOCK, with new accessory tools for protein & ligand preparation as well. The website for download will eventually be: http://dock.compbio.ucsf.edu/DOCK3.7/<br />
<br />
The paper citation will be Coleman, Irwin & Shoichet 2013; we will update this page once it is accepted.<br />
<br />
The citation for flexible docking with DOCK3.7 will be Fischer, Coleman, Fraser & Shoichet 2013; again, this will be updated upon acceptance.<br />
<br />
''Ligand Preparation''<br />
<br />
Ligand preparation has been modified to use mol2db2 instead of mol2db for database generation. Many other features have also been integrated. To build a set of ligands from SMILES on the cluster, use:<br />
<br />
db2start.e.csh input.smi ref<br />
<br />
Or to build on a standalone machine, use <br />
<br />
db2gen.e.csh input.smi ref<br />
<br />
Note that many programs must be properly installed and available or this script will fail. The most troublesome is EPIK. For this reason, among others, Dahlia Weiss has helped get Marvin's Chemaxon cxcalc running in lieu of EPIK. This is probably the preferred way to build molecules. Run it on the cluster with:<br />
<br />
db2start.e.cxcalc.csh input.smi<br />
<br />
The input file here is a two-column file: one column is a SMILES string and the other is an ID. IDs of any length are valid, but only 16 characters are carried into the docking phase; see [[Mol2db2_Format_2]] for more details.<br />
<br />
Once the jobs have finished, you can run <br />
<br />
db2end-prefix.py name<br />
<br />
This builds dockable name-XXXXXX.db2.gz files.<br />
<br />
''Protein Target Preparation''<br />
<br />
be-blasti is still the preferred method of downloading PDB files and splitting them into rec.pdb and xtal-lig.pdb files. Run it with<br />
<br />
be_blasti.csh filename<br />
<br />
The filename should be a file that contains PDB codes you want to download.<br />
<br />
Once you have a rec.pdb (representing the protein) and an xtal-lig.pdb (representing the ligand or a set of atoms in the binding site of the protein), you can run blastermaster.py, the new version of DOCK Blaster. These file names can be changed; see the options. Try running the help to see the extensive options:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -h<br />
<br />
A typical invocation is:<br />
<br />
$DOCK_BASE/src/blastermaster_1.0/blastermaster.py -v<br />
<br />
-v gives you verbose output, which can be helpful if something goes wrong. If everything is successful, you'll see this at the end of the file:<br />
<br />
copying matching_spheres.sph into dockfiles<br />
copying trim.electrostatics.phi into dockfiles<br />
copying ligand.desolv.hydrogen into dockfiles<br />
copying ligand.desolv.heavy into dockfiles<br />
copying vdw.bmp into dockfiles<br />
copying vdw.vdw into dockfiles<br />
copying vdw.parms.amb.mindock into dockfiles<br />
writing INDOCK file: INDOCK<br />
<br />
Otherwise something went wrong. Notice that an INDOCK file has been written for you, with many defaults set that you may want to change. Also, old INDOCK files are slightly incompatible with the new ones, so you should consult the version written for you, or take a closer look at the [[DOCK3.7 INDOCK]] page.<br />
<br />
"Running DOCK"<br />
<br />
Setting up an alias for the dock37tools directory in $DOCK_BASE/src/dock37tools/ is highly recommended, though not required. From here on, assume it is set as $d37 with a command like this:<br />
<br />
setenv d37 $DOCK_BASE/src/dock37tools/<br />
<br />
In that directory, there are 3 scripts for setting up a dock run. The first script sets up a docking run where each file is assigned to a separate node, which is fine for quick jobs.<br />
<br />
$d37/setup_db2.csh /full/path/to/db2/files/<br />
<br />
Another script, useful for testing on DUD-E targets, is:<br />
<br />
$d37/setup_db2_own.csh /full/path/to/db2/files/<br />
<br />
A final script, useful for running many db2.gz files, sometimes with a few files grouped together, as in prospective screening against lead-like molecules, is:<br />
<br />
$d37/setup_db2_lots.py desiredDirectoryCount prefixName /full/path/to/db2/files/<br />
<br />
After setting up any of these runs, you can run them with the following:<br />
<br />
$d37/submit.csh<br />
<br />
or $d37/subdock.csh /path/to/dock.csh if you have compiled your own version of DOCK3.7.<br />
<br />
<br />
<br />
<br />
<br />
"Analyzing DOCK results"</div>Rgchttp://wiki.docking.org/index.php?title=DOCK3.7_INDOCK&diff=5382DOCK3.7 INDOCK2013-03-27T15:03:56Z<p>Rgc: </p>
<hr />
<div>This page describes the updates to the [[DOCK3.7]] INDOCK file, alongside a sample file. blastermaster.py described on the [[DOCK3.7]] page writes an INDOCK file for you, which you can modify.<br />
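<br />
Since the generated file is plain text with a keyword, a value and an optional # comment on each line (as in the sample below), it is easy to inspect from a script before editing it. The following is a minimal sketch and not the parser DOCK itself uses: parse_indock is a hypothetical helper that assumes whitespace-separated fields, skips the "DOCK 3.7 parameter" title line explicitly, and keeps only the last value if a keyword is repeated.<br />
<br />
```python
def parse_indock(text):
    """Map each INDOCK keyword to its value string, ignoring # comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing or whole-line comments
        if not line or line.startswith("DOCK 3.7"):
            continue  # blank line, comment-only line, or the title line
        fields = line.split(None, 1)
        if len(fields) == 2:
            params[fields[0]] = fields[1].strip()
    return params
```
For example, parse_indock(open("INDOCK").read()).get("match_goal") would return the configured match goal as a string. Values are deliberately left as strings, since the sample mixes integers, floats, yes/no flags and file paths.<br />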
<br />
DOCK 3.7 parameter<br />
##################################################### <br />
# NOTE: split_database_index is reserved to specify a list of files<br />
ligand_atom_file split_database_index #standard for docking large databases<br />
#####################################################<br />
# OUTPUT<br />
output_file_prefix test. #default, but can be changed<br />
#####################################################<br />
# MATCHING<br />
match_method 2 #1 matches up to the distance_tolerance below, ignoring match_goal, step, maximum, etc. <br />
#2 uses the adaptive sampling that attempts to get a number of match_goal orientational samples <br />
distance_tolerance 0.05 #starting distance tolerance<br />
match_goal 5000 #desired number of orientational samples to get before quitting under match_method = 2<br />
distance_step 0.05 #increment from distance_tolerance until max or match_goal is reached<br />
distance_maximum 0.5 #biggest tolerance that will be used to attempt to get match_goal orientational samples<br />
timeout 10.0 #number of seconds before quitting on any given ligand<br />
nodes_maximum 4 #max number of points for which all distances must be within the tolerance. 3 possible, 4 suggested.<br />
nodes_minimum 4 #min number of points for which all distances must be within the tolerance. 4 suggested, 3 possible.<br />
bump_maximum 50.0 #van der Waals score in kcal/mol for any part of the molecule to get before further examination stopped<br />
bump_rigid 50.0 #van der Waals score in kcal/mol for the rigid component of the ligand molecule, if above, discarded<br />
#####################################################<br />
# COLORING<br />
chemical_matching no #default to off, can use chemical matching from DOCK3.6 if desired<br />
case_sensitive no #case sensitivity for chemical matching groups<br />
#####################################################<br />
# SEARCH MODE<br />
atom_minimum 4 #minimum number of atoms in ligand for it to be scored<br />
atom_maximum 100 #maximum number of atoms in ligand for it to be scored<br />
number_save 1 #how many poses to save. Any number of poses can be saved, but disk space is a factor!<br />
molecules_maximum 100000 #how many molecules will be searched before quitting. raise if databases are large.<br />
#####################################################<br />
# SCORING<br />
ligand_desolvation volume #use GB desolvation scoring, other options are full or none<br />
vdw_maximum 1.0e10 #maximum vdw score possible, prevents overflow<br />
electrostatic_scale 1.0 #scaling factors to be applied to scores, likely not to be trifled with<br />
vdw_scale 1.0 #again, scales the entire vdw score.<br />
internal_scale 0.0 #scales an internal focusing term. set this to 0 as this doesn't work at all/isn't implemented<br />
per_atom_scores no #change to yes if per-atom scoring breakdowns desired. note that this doubles output size.<br />
#####################################################<br />
# INPUT FILES / THINGS THAT CHANGE<br />
receptor_sphere_file ../dockfiles/matching_spheres.sph #receptor matching spheres file following age old SPH format<br />
vdw_parameter_file ../dockfiles/vdw.parms.amb.mindock #vdw parameter file.<br />
flexible_receptor no #describing only single receptor file for now<br />
total_receptors 1 <br />
############## grids/data for one receptor<br />
rec_number 1<br />
rec_group 1<br />
rec_group_option 1<br />
solvmap_file ../dockfiles/ligand.desolv.heavy #GB-based desolvation maps <br />
hydrogen_solvmap_file ../dockfiles/ligand.desolv.hydrogen<br />
delphi_file ../dockfiles/trim.electrostatics.phi #electrostatics map, size must be declared with delphi_nsize below<br />
chemgrid_file ../dockfiles/vdw.vdw #vdw grid file, contains vdw scores<br />
bumpmap_file ../dockfiles/vdw.bmp #vdw bump file, only used for header data for chemgrid_file <br />
############## end of INDOCK<br />
delphi_nsize 47 #size of electrostatics grid (cubic). blastermaster.py trims to the minimum size necessary to save memory.</div>Rgchttp://wiki.docking.org/index.php?title=DOCK3.7_INDOCK&diff=5381DOCK3.7 INDOCK2013-03-26T21:55:56Z<p>Rgc: DOCK3.7 indock file documentation.</p>
<hr />
<div>This page describes the updates to the [[DOCK3.7]] INDOCK file, alongside a sample file. blastermaster.py described on the [[DOCK3.7]] page writes an INDOCK file for you, which you can modify.<br />
<br />
DOCK 3.7 parameter<br />
##################################################### <br />
# NOTE: split_database_index is reserved to specify a list of files<br />
ligand_atom_file split_database_index #standard for docking large databases<br />
#####################################################<br />
# OUTPUT<br />
output_file_prefix test. #default, but can be changed<br />
#####################################################<br />
# MATCHING<br />
match_method 2 #1 matches up to the distance_tolerance below, ignoring match_goal, step, maximum, etc. <br />
#2 uses the adaptive sampling that attempts to get a number of match_goal orientational samples <br />
distance_tolerance 0.05 #starting distance tolerance<br />
match_goal 5000 #desired number of orientational samples to get before quitting under match_method = 2<br />
distance_step 0.05 #increment from distance_tolerance until max or match_goal is reached<br />
distance_maximum 0.5 #biggest tolerance that will be used to attempt to get match_goal orientational samples<br />
timeout 10.0 #number of seconds before quitting on any given ligand<br />
nodes_maximum 4 #max number of points for which all distances must be within the tolerance. 3 possible, 4 suggested.<br />
nodes_minimum 4 #min number of points for which all distances must be within the tolerance. 4 suggested, 3 possible.<br />
bump_maximum 50.0 #van der Waals score in kcal/mol for any part of the molecule to get before further examination stopped<br />
bump_rigid 50.0 #van der Waals score in kcal/mol for the rigid component of the ligand molecule, if above, discarded<br />
#####################################################<br />
# COLORING<br />
chemical_matching no #default to off, can use chemical matching from DOCK3.6 if desired<br />
case_sensitive no #case sensitivity for chemical matching groups<br />
#####################################################<br />
# SEARCH MODE<br />
atom_minimum 4 #minimum number of atoms in ligand for it to be scored<br />
atom_maximum 100 #maximum number of atoms in ligand for it to be scored<br />
number_save 1 #how many poses to save. Any number of poses can be saved, but disk space is a factor!<br />
molecules_maximum 10000 #how many molecules will be searched before quitting. raise if databases are large.<br />
#####################################################<br />
# SCORING<br />
ligand_desolvation volume #use GB desolvation scoring, other options are full or none<br />
vdw_maximum 1.0e10 #maximum vdw score possible, prevents overflow<br />
electrostatic_scale 1.0 #scaling factors to be applied to scores, likely not to be trifled with<br />
vdw_scale 1.0 #again, scales the entire vdw score.<br />
internal_scale 0.0 #scales an internal focusing term. set this to 0 as this doesn't work at all/isn't implemented<br />
per_atom_scores no #change to yes if per-atom scoring breakdowns desired. note that this doubles output size.<br />
#####################################################<br />
# INPUT FILES / THINGS THAT CHANGE<br />
receptor_sphere_file ../dockfiles/matching_spheres.sph #receptor matching spheres file following age old SPH format<br />
vdw_parameter_file ../dockfiles/vdw.parms.amb.mindock #vdw parameter file.<br />
flexible_receptor no #describing only single receptor file for now<br />
total_receptors 1 <br />
############## grids/data for one receptor<br />
rec_number 1<br />
rec_group 1<br />
rec_group_option 1<br />
solvmap_file ../dockfiles/ligand.desolv.heavy #GB-based desolvation maps <br />
hydrogen_solvmap_file ../dockfiles/ligand.desolv.hydrogen<br />
delphi_file ../dockfiles/trim.electrostatics.phi #electrostatics map, size must be declared with delphi_nsize below<br />
chemgrid_file ../dockfiles/vdw.vdw #vdw grid file, contains vdw scores<br />
bumpmap_file ../dockfiles/vdw.bmp #vdw bump file, only used for header data for chemgrid_file <br />
############## end of INDOCK<br />
delphi_nsize 47 #size of electrostatics grid (cubic). blastermaster.py trims to the minimum size necessary to save memory.</div>Rgchttp://wiki.docking.org/index.php?title=GPCR_Waiver_Wire&diff=5337GPCR Waiver Wire2013-01-18T17:17:44Z<p>Rgc: taking myself off the list, since it is ignored anyway.</p>
<hr />
<div>When new projects come into the lab, the person at the top of the order gets first choice to either take or pass on the project. The project moves down the order until it is claimed. Once a person claims a project, they are moved to the bottom of the order. Ideally, this ensures a fair way of determining who gets new projects and gives those who don't speak up as much a fair chance of getting a great new project. Passing on a project does not affect a person's position in the order, since not all projects are suitable for everyone.<br />
<br />
'''Brian reserves the right to change the order at any time for any reason, e.g. passing on too many projects, new member in lab getting bumped up, etc...'''<br />
<br />
== Updated guidelines: ==<br />
a. People get first refusal for projects in the order they are listed. If they choose a project and are placed on it, they go to the bottom of the list.<br />
<br />
b. Occasionally projects will come along where Brian feels someone has particular expertise or history. Brian reserves the right to over-rule the waiver list order in these circumstances.<br />
<br />
c. On the GPCR projects, teams have worked out very well. There must be a more senior partner and a more junior partner in these collaborations, if only because only one co-first author can be listed first. To encourage collaboration, anyone who volunteers to be the more junior partner in a project will not lose their place on the waiver list. Please discuss with the senior partner first to ensure compatibility.<br />
<br />
----<br />
<br />
== Project waiver order: ==<br />
<br />
<br />
*Magdalena Korczynska<br />
*John J. Irwin<br />
*Matt Merski<br />
*Sarah Barelier<br />
*Joel Karpiak<br />
*Henry Lin<br />
*Nir London<br />
*Trent Balius<br />
*Dahlia Weiss<br />
*Hao Fan<br />
*Oliv Eidam</div>Rgchttp://wiki.docking.org/index.php?title=Blaster_Issues&diff=5283Blaster Issues2012-10-30T23:37:06Z<p>Rgc: problem report from magdalena, low priority but things we should be aware of</p>
<hr />
<div>This page describes current problems with DOCK Blaster. <br />
<br />
= Pose Fidelity = <br />
1. 1abc fails with 5A rmsd. Due to XXX<br />
2. xxx<br />
<br />
= Enrichment = <br />
1. <br />
<br />
= Big ligands =<br />
1. this is too big<br />
<br />
<br />
= Auto prep script = <br />
works as far as I know on most targets, with the following exceptions. <br />
-irons are now protonated<br />
-asps are now sometimes protonated<br />
<br />
<br />
= Other failures = <br />
<br />
<br />
<br />
-- Francesco . Jul 10, 2008<br />
<br />
Back to [[DOCK Blaster]]<br />
[[Category:DOCK Blaster]]</div>Rgchttp://wiki.docking.org/index.php?title=User:Rgc&diff=5214User:Rgc2012-10-12T17:25:33Z<p>Rgc: </p>
<hr />
<div>Ryan Coleman. Postdoc in the Shoichet Lab.[[http://rgc.name]]</div>Rgchttp://wiki.docking.org/index.php?title=Cluster_Usage&diff=434Cluster Usage2012-09-18T19:20:09Z<p>Rgc: </p>
<hr />
<div>Information on how to use the Shoichet Lab Cluster. <br />
Note this information is only relevant if you have ssh access to the cluster.<br />
<br />
To request a shell on a cluster node, type <br />
qrsh <br />
Or for a shell on a specific set of nodes (say any from the node-5-* or node-6-* sets):<br />
qrsh -q all.q -l hostname="node-[56]*" -now no<br />
<br />
The -now no parameter is necessary if the cluster is otherwise full and you'll have to wait to get a job.<br />
<br />
If you're interested in writing your own cluster scripts, the nice way of doing this is to use the following lines:<br />
<br />
#$ -q all.q<br />
<br />
This line selects the correct queue; all.q is suitable for basically any job.<br />
<br />
#$ -t 1-500<br />
<br />
This line (change 500 to the number of tasks you need) should be used together with $SGE_TASK_ID to write array job scripts instead of scripts that run only single jobs.<br />
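<br />
Inside an array job, each task sees its own number in the SGE_TASK_ID environment variable, which is what lets one script cover many inputs. As a minimal sketch, assuming the job calls a Python program and that the list of inputs is known up front, the task could pick its own input like this. input_for_task is a hypothetical helper, not an existing lab script.<br />
<br />
```python
import os


def input_for_task(inputs, environ=None):
    """Return the entry of inputs that belongs to this array task.

    Assumes the job was submitted with a range starting at 1 (e.g. -t 1-500),
    so task N maps to inputs[N - 1].
    """
    environ = os.environ if environ is None else environ
    raw_id = environ.get("SGE_TASK_ID", "")
    if not raw_id.isdigit():
        raise RuntimeError("SGE_TASK_ID is not a number; not an array task?")
    task_id = int(raw_id)
    if not 1 <= task_id <= len(inputs):
        raise IndexError("task %d has no matching input" % task_id)
    return inputs[task_id - 1]
```
Passing environ explicitly is only there to make the function easy to try outside the cluster; in a real job it would read the process environment.<br />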
<br />
Back to [[Portal:Lab]]<br />
<br />
<br />
[[Category:Internal]]</div>Rgchttp://wiki.docking.org/index.php?title=Mol2db2_Format_2&diff=3781Mol2db2 Format 22012-07-11T19:35:47Z<p>Rgc: </p>
<hr />
<div>This page is a wishlist for features that would be nice for a new version of the flexibase file format to support. mol2db2 format features that are actually implemented so far are marked [x]<br />
<br />
= New Features =<br />
== implemented ==<br />
*Real Atom Types and Bond Information [x]<br />
*Way to determine which mix-and-match conformations have clashes (and avoid trying them) [x]<br />
*A place to store an internal energy for each possible conformation [x]<br />
*Terminal hydrogen rotations?? [x]<br />
*support for clusters of conformations [x]<br />
*arbitrary information to be written into output mol2 file (5th and above M lines) [x]<br />
<br />
== wished ==<br />
*Per-conformation per-atom partial charge & solvation information to support internal energies<br />
*Aliphatic ring movements?<br />
*group tagging (needed for covalent docking) and basic set of covalent groups<br />
*specified rigid component override (and better rules for finding non-ring rigid components)<br />
*per molecule pKa<br />
*valence for each atom<br />
<br />
= File Format =<br />
==current plan for the file format ==<br />
*T type information (implicitly assumed)<br />
*M molecule (4 lines req'd, after that they are optional, 24 lines max)<br />
*A atoms<br />
*B bond<br />
*X xyz <br />
*R rigid xyz for matching (can actually be any xyzs) <br />
*C conformation<br />
*S sets<br />
*D clusters<br />
*E end of molecule<br />
<br />
T ## namexxxx (implicitly assumed to be the standard 7)<br />
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters<br />
M charge polar_solv apolar_solv total_solv surface_area<br />
M smiles<br />
M longname<br />
[M arbitrary information preserved for writing out]<br />
A stuff about each atom, 1 per line <br />
B stuff about each bond, 1 per line<br />
X coordnum atomnum confnum x y z <br />
R rigidnum color x y z<br />
C confnum coordstart coordend<br />
S setnum #lines #confs_total broken hydrogens omega_energy<br />
S setnum linenum #confs confs [until full column]<br />
D clusternum setstart setend matchstart matchend #additionalmatching<br />
D matchnum color x y z<br />
E <br />
<br />
Given the line types above, the columns used by each are laid out below, followed by format statements for Python and Fortran. If speed/size becomes an issue this might get replaced with a binary file format.<br />
<br />
notes: 17 children groups/group per line in current scheme.<br />
9 children confs/group per line.<br />
9 children confs/conf per line.<br />
8 confs/set per line.<br />
groups/confs with no children are written out.<br />
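<br />
The limit of 8 confs/set per line means that a set with more conformations spans several S lines after its header line. The following is a minimal sketch of that chunking, using the S-line format statements listed further down this page. It is not the mol2db2 code: set_lines is a hypothetical helper, and two details are assumptions, namely that line numbers start at 1 and that a short final line simply carries fewer conformation columns.<br />
<br />
```python
CONFS_PER_LINE = 8  # "8 confs/set per line" from the notes above


def set_lines(setnum, confs, broken=0, hydrogens=0, energy=0.0):
    """Return the S header line plus the S continuation lines for one set."""
    chunks = [confs[i:i + CONFS_PER_LINE]
              for i in range(0, len(confs), CONFS_PER_LINE)]
    # Header: setnum, #lines, #confs_total, broken, hydrogens, omega_energy
    lines = ["S %6d %6d %3d %1d %1d %+11.3f\n"
             % (setnum, len(chunks), len(confs), broken, hydrogens, energy)]
    for lineno, chunk in enumerate(chunks, start=1):
        # Assumption: emit only as many %6d columns as this line has confs.
        fmt = "S %6d %6d %1d" + " %6d" * len(chunk) + "\n"
        lines.append(fmt % ((setnum, lineno, len(chunk)) + tuple(chunk)))
    return lines
```
A set of 10 conformations therefore yields one header line and two continuation lines holding 8 and 2 conformations.<br />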
<br />
on the atom line, dt is dock type and co is color.<br />
<br />
1 2 3 4 5 6 7<br />
01234567890123456789012345678901234567890123456789012345678901234567890123456789<br />
T ## typename<br />
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU<br />
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]<br />
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
B NUM ATO ATO TY<br />
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
C CONFNO COORDSTAR COORDENDX<br />
S SETIDX #LINES #CO C H +ENERGY.XXX<br />
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS<br />
D CLUSID STASET ENDSET MST MEN ADD<br />
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
E<br />
<br />
The following type lines are assumed by DOCK unless overridden:<br />
T 1 positive<br />
T 2 negative<br />
T 3 acceptor<br />
T 4 donor<br />
T 5 ester_o<br />
T 6 amide_o<br />
T 7 neutral<br />
<br />
The following are the Python format statements for each line:<br />
T %2d %8s\n<br />
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n<br />
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
M %77s\n<br />
M %77s\n<br />
M %77s\n<br />
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
B %3d %3d %3d %-2s\n<br />
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n<br />
R %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
C %6d %9d %9d\n<br />
S %6d %6d %3d %1d %1d %+11.3f\n<br />
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n <br />
D %6d %6d %6d %3d %3d %3d\n<br />
D %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
E\n<br />
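<br />
The format statements above can be used directly with Python's % operator, and the fixed widths mean a line can be read back by column position. As a minimal sketch, here is a round trip for the X (coordinate) line. The helper names are made up for illustration, and the slice positions are derived from the X format statement and the column ruler earlier on this page rather than taken from the mol2db2 source.<br />
<br />
```python
X_FORMAT = "X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n"


def format_x_line(coordnum, atomnum, confnum, x, y, z):
    """Build one X line: coordinate number, atom, conformation, then x y z."""
    return X_FORMAT % (coordnum, atomnum, confnum, x, y, z)


def parse_x_line(line):
    """Read an X line back using fixed column positions (0-based slices)."""
    return (
        int(line[2:11]),     # coordnum, 9 wide
        int(line[12:15]),    # atomnum, 3 wide
        int(line[16:22]),    # confnum, 6 wide
        float(line[23:32]),  # x, 9 wide
        float(line[33:42]),  # y, 9 wide
        float(line[43:52]),  # z, 9 wide
    )
```
Slicing by position rather than splitting on whitespace keeps the reader correct even if two wide numbers ever fill their fields completely.<br />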
<br />
The following are the fortran77 format statements<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
1000 format(2x,i2,1x,a8)<br />
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters<br />
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)<br />
!M smiles or longname<br />
2200 format(2x,a77)<br />
!A stuff about each atom, 1 per line<br />
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,<br />
& f10.3,1x,f10.3,1x,f9.3)<br />
!B stuff about each bond, 1 per line<br />
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)<br />
!X atomnum confnum x y z<br />
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)<br />
!R rigidnum color x y z<br />
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)<br />
!C confnum #startcoord #endcoord<br />
7000 format(2x,i6,1x,i9,1x,i9)<br />
!S setnum #lines #confs_total broken hydrogens omega_energy<br />
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)<br />
!S setnum linenum #confs confs [until full column]<br />
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,<br />
& 1x,i6,1x,i6,1x,i6,1x,i6)<br />
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN<br />
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)<br />
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
!re-use 6000<br />
!E<br />
!E does not get a format line<br />
<br />
The following are Fortran95 format statements:<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000<br />
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters<br />
character (len=*), parameter :: DB2M1 =<br />
& '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
character (len=*), parameter :: DB2M2 =<br />
& '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100<br />
!M smiles/longname/arbitrary<br />
character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200<br />
!A stuff about each atom, 1 per line<br />
character (len=*), parameter :: DB2ATOM =<br />
& '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x,<br />
& f10.3,x,f10.3,x,f9.3)' !3000<br />
!B stuff about each bond, 1 per line<br />
character (len=*), parameter :: DB2BOND =<br />
& '(2x,i3,x,i3,x,i3,x,a2)' !4000<br />
!X coordnumx atomnum confnum x y z<br />
character (len=*), parameter :: DB2COORD =<br />
& '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000<br />
!R rigidnum color x y z<br />
character (len=*), parameter :: DB2RIGID =<br />
& '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000<br />
!C confnum coordstart coordend<br />
character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000<br />
!S setnum #lines #confs_total broken hydrogens omega_energy <br />
character (len=*), parameter :: DB2SET1 =<br />
& '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000<br />
!S setnum linenum #confs confs [until full column]<br />
character (len=*), parameter :: DB2SET2 =<br />
& '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6,<br />
& 1x,i6,x,i6,x,i6,x,i6)' !8100<br />
!D CLUSID STASET ENDSET ADD(itional matching spheres count) MST(art) MEN(d)<br />
character (len=*), parameter :: DB2CLUSTER =<br />
& '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000<br />
!D NUM CO x y z<br />
!reuse DB2RIGID<br />
!E<br />
!E does not get a format line <br />
<br />
[[Category:Wishlists]]</div>Rgchttp://wiki.docking.org/index.php?title=Mol2db2_Format_2&diff=3779Mol2db2 Format 22012-07-09T18:09:25Z<p>Rgc: another wishlist item</p>
<hr />
<div>This page is a wishlist for features that would be nice for a new version of the flexibase file format to support. mol2db2 format features that are actually implemented so far are marked [x]<br />
<br />
*Real Atom Types and Bond Information [x]<br />
*Way to determine which mix-and-match conformations have clashes (and avoid trying them) [x]<br />
*A place to store an internal energy for each possible conformation [x]<br />
*Terminal hydrogen rotations?? [x]<br />
*Per-conformation per-atom partial charge & solvation information to support internal energies<br />
*Aliphatic ring movements?<br />
*support for clusters of conformations [x]<br />
*group tagging (needed for covalent docking) and basic set of covalent groups<br />
*specified rigid component override (and better rules for finding non-ring rigid components)<br />
*per molecule pKa<br />
*valence for each atom<br />
*arbitrary information to be written into output mol2 file (5th and above M lines) [x]<br />
<br />
the following represents the current plan for the file format<br />
*T type information (implicitly assumed)<br />
*M molecule (4 lines req'd, after that they are optional, 24 lines max)<br />
*A atoms<br />
*B bond<br />
*X xyz <br />
*R rigid xyz for matching (can actually be any xyzs) <br />
*C conformation<br />
*S sets<br />
*D clusters<br />
*E end of molecule<br />
<br />
T ## namexxxx (implicitly assumed to be the standard 7)<br />
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters<br />
M charge polar_solv apolar_solv total_solv surface_area<br />
M smiles<br />
M longname<br />
[M arbitrary information preserved for writing out]<br />
A stuff about each atom, 1 per line <br />
B stuff about each bond, 1 per line<br />
X coordnum atomnum confnum x y z <br />
R rigidnum color x y z<br />
C confnum coordstart coordend<br />
S setnum #lines #confs_total broken hydrogens omega_energy<br />
S setnum linenum #confs confs [until full column]<br />
D clusternum setstart setend matchstart matchend #additionalmatching<br />
D matchnum color x y z<br />
E <br />
<br />
With the above descriptions, here is a description of the columns that are used. Format statements for python/fortran will also appear at some point. If speed/size becomes an issue this might get replaced with a binary file format.<br />
<br />
notes: 17 children groups/group per line in current scheme.<br />
9 children confs/group per line.<br />
9 children confs/conf per line.<br />
8 confs/set per line.<br />
groups/confs with no children are written out.<br />
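<br />
The "8 confs/set per line" note can be illustrated with a short sketch that splits the conformation list of one set into S continuation lines. The format follows the second S line in the Python format list on this page; the function name and the choice of 1-based line numbering are assumptions, not something this page specifies.<br />

```python
CONFS_PER_LINE = 8  # "8 confs/set per line" from the notes above


def set_conf_lines(setnum, confs):
    """Split the conformation numbers of one set into S lines of up to 8 each."""
    lines = []
    for start in range(0, len(confs), CONFS_PER_LINE):
        chunk = confs[start:start + CONFS_PER_LINE]
        linenum = start // CONFS_PER_LINE + 1  # assumed to start at 1
        fields = "".join(" %6d" % conf for conf in chunk)
        lines.append("S %6d %6d %1d%s\n" % (setnum, linenum, len(chunk), fields))
    return lines


# Eleven conformations need two lines: eight on the first, three on the second.
lines = set_conf_lines(1, list(range(1, 12)))
for line in lines:
    print(line, end="")
```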
<br />
on the atom line, dt is dock type and co is color.<br />
<br />
1 2 3 4 5 6 7<br />
01234567890123456789012345678901234567890123456789012345678901234567890123456789<br />
T ## typename<br />
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU<br />
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]<br />
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
B NUM ATO ATO TY<br />
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
C CONFNO COORDSTAR COORDENDX<br />
S SETIDX #LINES #CO C H +ENERGY.XXX<br />
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS<br />
D CLUSID STASET ENDSET MST MEN ADD<br />
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
E<br />
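<br />
The column ruler above suggests that each record can be read by slicing fixed character positions. The following is a sketch for the X line only, with slice boundaries taken from the ruler (COORDNUMX in columns 2-10, ATO in 12-14, CONFNU in 16-21, then three 9-character coordinates). The function name and the dictionary keys are illustrative assumptions.<br />

```python
def parse_coord_line(line):
    """Parse one fixed-width X line into numbers, using the ruler's columns."""
    if not line.startswith("X "):
        raise ValueError("not an X line: %r" % line)
    return {
        "coordnum": int(line[2:11]),
        "atomnum": int(line[12:15]),
        "confnum": int(line[16:22]),
        "xyz": (float(line[23:32]), float(line[33:42]), float(line[43:52])),
    }


# Build a sample line with the X format from this page so the widths are exact.
example = "X %9d %3d %6d %+9.4f %+9.4f %+9.4f" % (1, 1, 1, -1.204, 1.354, 0.0)
record = parse_coord_line(example)
print(record)
```

Slicing by column rather than splitting on whitespace keeps working when a wide number fills its whole field and touches its neighbour.<br />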
<br />
the type lines following are assumed by dock unless overridden:<br />
T 1 positive<br />
T 2 negative<br />
T 3 acceptor<br />
T 4 donor<br />
T 5 ester_o<br />
T 6 amide_o<br />
T 7 neutral<br />
<br />
the following are the format statements for python for each line<br />
T %2d %8s\n<br />
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n<br />
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
M %77s\n<br />
M %77s\n<br />
M %77s\n<br />
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
B %3d %3d %3d %-2s\n<br />
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n<br />
R %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
C %6d %9d %9d\n<br />
S %6d %6d %3d %1d %1d %+11.3f\n<br />
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n <br />
D %6d %6d %6d %3d %3d %3d\n<br />
D %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
E\n<br />
<br />
The following are the fortran77 format statements<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
1000 format(2x,i2,1x,a8)<br />
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters<br />
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)<br />
!M smiles or longname<br />
2200 format(2x,a77)<br />
!A stuff about each atom, 1 per line<br />
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,<br />
& f10.3,1x,f10.3,1x,f9.3)<br />
!B stuff about each bond, 1 per line<br />
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)<br />
!X coordnum atomnum confnum x y z<br />
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)<br />
!R rigidnum color x y z<br />
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)<br />
!C confnum #startcoord #endcoord<br />
7000 format(2x,i6,1x,i9,1x,i9)<br />
!S setnum #lines #confs_total broken hydrogens omega_energy<br />
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)<br />
!S setnum linenum #confs confs [until full column]<br />
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,<br />
& 1x,i6,1x,i6,1x,i6,1x,i6)<br />
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN<br />
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)<br />
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
!re-use 6000<br />
!E<br />
!E does not get a format line<br />
<br />
The following are Fortran95 format statements:<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000<br />
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters<br />
character (len=*), parameter :: DB2M1 =<br />
& '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
character (len=*), parameter :: DB2M2 =<br />
& '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100<br />
!M smiles/longname/arbitrary<br />
character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200<br />
!A stuff about each atom, 1 per line<br />
character (len=*), parameter :: DB2ATOM =<br />
& '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x,<br />
& f10.3,x,f10.3,x,f9.3)' !3000<br />
!B stuff about each bond, 1 per line<br />
character (len=*), parameter :: DB2BOND =<br />
& '(2x,i3,x,i3,x,i3,x,a2)' !4000<br />
!X coordnumx atomnum confnum x y z<br />
character (len=*), parameter :: DB2COORD =<br />
& '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000<br />
!R rigidnum color x y z<br />
character (len=*), parameter :: DB2RIGID =<br />
& '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000<br />
!C confnum coordstart coordend<br />
character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000<br />
!S setnum #lines #confs_total broken hydrogens omega_energy <br />
character (len=*), parameter :: DB2SET1 =<br />
& '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000<br />
!S setnum linenum #confs confs [until full column]<br />
character (len=*), parameter :: DB2SET2 =<br />
& '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6,<br />
& 1x,i6,x,i6,x,i6,x,i6)' !8100<br />
!D CLUSID STASET ENDSET ADD(itional matching spheres count) MST(art) MEN(d)<br />
character (len=*), parameter :: DB2CLUSTER =<br />
& '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000<br />
!D NUM CO x y z<br />
!reuse DB2RIGID<br />
!E<br />
!E does not get a format line <br />
<br />
[[Category:Wishlists]]</div>Rgchttp://wiki.docking.org/index.php?title=Hit_picking_party&diff=3179Hit picking party2012-07-07T05:40:51Z<p>Rgc: added Nir's PAINS link</p>
<hr />
<div>Before we buy compounds, we have a hit picking party.<br />
<br />
** this document is still in progress.<br />
<br />
<br />
= Before the party =<br />
The investigator who performed the virtual screen looks critically at the results, selecting perhaps a dozen or two interesting compounds from among the top 500. Several copies of the top 500 list are printed, sometimes with additional supporting documentation, and distributed to the participants. Check [[CSD]] data for compound pose information. The [[Pka|pKa]] of each molecule can be checked. Think about [[reactive groups]] for molecules. Think about PAINS http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter<br />
<br />
= During the party = <br />
We take a good look at the site, and any crystallographic ligands or known experimental ligands. Think about enthalpic and entropic contributions to the binding. Think about possible water structure. Think of receptor flexibility, especially His, Gln, Asn, Ser, Tyr, Thr. Look for charged residues. Look at the surface colored by charge to look for greasy patches. <br />
<br />
Look at one ligand at a time, and rate each compound (e.g. 1, 2, or 3 checkmarks). People speak out, making favorable and unfavorable comments about each ligand. Broken molecules are rapidly ignored.<br />
<br />
Watch out for very high or very low calculated LogP. Watch for weird conformations, weird protonation states, and weird tautomeric forms. ZINC is imperfect; use your judgement. Watch out for super floppy molecules. Watch out for buried polarity or missed opportunities. Hydroxide is the worst. If hydroxide ain't happy, ain't nobody happy.<br />
<br />
= After the party = <br />
The investigator purchases compounds as discussed and has them tested experimentally. Ideally, a report of the fate of each compound can be shared with other members of the group within a few weeks. We try to test at as high a concentration as the assay will allow. We like to order 10 mg so that we can test for purity and identity, and also repeat the experiment.<br />
<br />
= Epilogue = <br />
Sometimes we go through several iterations. If no hits arise, we go back to the drawing board. We try to incorporate all available information, both in the modeling, and in the judgement of the ligands. Watch out for aggregators, and be sure to control for these during the experiments. [http://advisor.docking.org/]<br />
<br />
<br />
-- John Irwin<br />
<br />
<br />
[[Category:Tutorials]]</div>Rgchttp://wiki.docking.org/index.php?title=Visualizing_delphi&diff=4655Visualizing delphi2012-03-26T17:05:06Z<p>Rgc: </p>
<hr />
<div>Tutorial on visualizing [[Delphi]] and understanding electrostatic scoring in [[DOCK 3]].<br />
<br />
# Open the pdb and phi files in PyMOL<br />
<br />
In a normal DOCK run, this can be done by typing the following in the PyMOL command window:<br />
<br />
cd ~/directory/<br />
load rec.pdb<br />
load grids/rec+sph.phi<br />
<br />
[[Image:pdb.png|thumb|none]]<br />
<br />
# go to the apbs tools plugin under the plugin menu<br />
<br />
[[Image:Apbs_tools.png|thumb|none]]<br />
<br />
# go to the visualization tab<br />
<br />
[[Image:Visualization.png|thumb|none]]<br />
<br />
# display the potentials at a certain kT value<br />
<br />
For each kind of surface (Positive or Negative) you can set the value at which the isosurface will be drawn. The following example is for 10 and -10:<br />
<br />
[[Image:Tenminusten.png|thumb|none]]<br />
<br />
This means that anywhere on this surface, the potential energy of a charged atom can be calculated as 10*charge in kT (multiply by roughly 0.6 to convert to kcal/mol). Of course, this is relatively meaningless over the whole protein; what really matters is the region near the binding site where ligands will be docked.<br />
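<br />
The arithmetic can be sketched in a few lines of Python. The 0.6 factor is the rough kT to kcal/mol conversion quoted above; the function name and the example charge of -1 are illustrative assumptions.<br />

```python
KT_TO_KCAL_PER_MOL = 0.6  # rough room-temperature conversion quoted above


def electrostatic_energy_kcal(potential_kt, charge):
    """Energy of a point charge sitting on an isosurface of the given potential."""
    return potential_kt * charge * KT_TO_KCAL_PER_MOL


# An atom with charge -1 sitting on the +10 kT isosurface.
energy = electrostatic_energy_kcal(10.0, -1.0)
print("%.1f kcal/mol" % energy)
```

With these inputs the estimate is about -6 kcal/mol, a favourable contribution from that one atom.<br />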
<br />
After loading the crystal structure ligand and zooming on it and changing the potentials to 50 and -50 you get the following:<br />
<br />
[[Image:50minus50ligand.png|thumb|none]]<br />
<br />
Looking at the binding site, you can see that one part of it has a very strong potential, so the scoring function's desire to place correctly charged ligand atoms there is very strong.<br />
<br />
[[Category: Tutorials]]<br />
[[Category:DOCK]]<br />
[[Category:DOCK:Theory]]</div>Rgchttp://wiki.docking.org/index.php?title=Flexibase_Format&diff=2844Flexibase Format2012-02-28T18:06:29Z<p>Rgc: making documentation match reality</p>
<hr />
<div>[[Image:mol2db.gif|left|]] Hierarchy Generator<br />
<br />
<br />
<br />
Input: 1. multi-conformer mol2 file, 2. solvation file, and 3. inhier parameters<br />
<br />
Output: 4. database in hierarchy format, 5. molecule summary (stdout)<br />
<br />
<br />
<br />
Important notes: <br />
<br />
* The hierarchy generator does not know if hydrogens have been rotated. Turning on torque_hydrogens when hydrogens have already been rotated will result in duplicate structures and inaccurate counting of conformations.<br />
* The hierarchy dock code will not read a database with hierarchy spacing. <br />
<br />
<br />
<br />
1. Multi-conformer mol2 file<br />
<br />
This file is a standard Tripos mol2 file. Multiple conformations of the same molecule must have the same MFCD number. The file must be under the UNIX file size limit of 2GB<br />
<br />
<br />
<br />
2. Solvation file<br />
<br />
This file contains abbreviated output from AMSOL. Each molecule record begins with a line containing the identification number followed by the total number of atoms and the formal charge. An important note is that the identification number here must match the identification number in the mol2 file! The next numbers are the total polar solvation energy, the total solvent accessible surface, total apolar solvation energy and total solvation energy. After this header, there is a line of data for each atom. The first number is the partial atomic charge, followed by the polar solvation energy, the solvent accessible surface area, apolar solvation energy and finally the total solvation energy of the atom.<br />
<br />
<br />
<br />
3. Hierarchy generator input parameters (inhier)<br />
<br />
The hierarchy generator is called by typing the path to the executable followed by the name of the input parameter file (can be anything, by default inhier). Logical keywords (procedures) can be yes/no or true/false.<br />
<br />
protein Is the input file a list of protein side chains or small molecules? Protein side chains do not allow for solvation values and preserve the input residue and atom names. If this option is true, solvation correction should be false; comment out the solvation_table line.<br />
<br />
equalize_charges This option adjusts formal charge and equalizes charges on equivalent groups. Set this value to No, as charges are corrected outside of the hierarchy generator. Code for this routine was modified from mol2db from the dock suite<br />
<br />
solvation-correction Should the hierarchy generator look for solvation data? If yes, molecules with solvation data will have their charges replaced with those from the solvation data table and have solvation data added. Molecules not listed in the solvation data table will retain their original charges, have zeros for the solvation numbers and have a -3 for the sixth value of the branch header line. The molecules with the -3 will be skipped by dock.<br />
<br />
color_atoms Should the generator put atom colors on each atom? No reason to turn this option off. Code for this routine was copied from mol2db from the dock suite.<br />
<br />
output_color_table Should the database header be printed? When generating a small database (few hundred molecules) I leave this on. For large databases that will be joined together I leave it off. If this is left off, it needs to be added post database generation to the final joined files.<br />
<br />
translate_coordinates Should the coordinates be translated to the origin. Mol2db translated everything to the origin to keep the coordinates small (database spacing). This should no longer be a problem. The ACD can be generated with this set to No.<br />
<br />
hierarchy_spacing The hierarchy spacing option is only designed for visual inspection of the hierarchy. Dock will not read files with hierarchy spacing. For database generation, set this value to No.<br />
<br />
torque_hydrogens If the input conformations do not have multiple conformations for the hydrogens, this option will generate multiple conformations. Groups rotated include =NH, -SH, and -OH. They are rotated in 180°, 30°, and 30° increments respectively. If the -SH or OH are connected to an aromatic system they are rotated in 60° increments.<br />
<br />
mol2_file_list The hierarchy generator can process multiple files (gzipped or not). I recommend commenting this line out and calling the generator for each file from a shell script. This option is not compatible with mol2_file.<br />
<br />
mol2_file This is the multi-conformer mol2 file used in database generation. This option is not compatible with mol2_file_list<br />
<br />
db_file This is the output file to which the generator writes the hierarchy.<br />
<br />
solvation_table The file from which the generator reads solvation and charge data.<br />
<br />
color_table This keyword marks the beginning of the color table. The color table follows the syntax rules for the mol2db (dock 3.5) color table. The end of the color table is marked by the keyword default_color and the value neutral<br />
<br />
Here are some additional notes on the color table, note that the atom names are sybyl atom names http://tripos.com/mol2/atom_types.html :<br />
<br />
rules are last match counts. so even though all Ns are positive, the later<br />
rule N.ar matches acceptor so N.ar is an acceptor<br />
first rules are beginning of sybyl atom text -> type<br />
later rules are Atom NotBondedTo Atom -> type like<br />
O. -1 N.2 -> negative<br />
other rules are Atom BondsAwayFrom Atom -> type like<br />
C.2 1 N. -> positive or<br />
O.2 2 N.3 -> amide_o<br />
again rules are read in order and the last matching rule is the one used<br />
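<br />
The "last match counts" behaviour can be sketched as follows. This covers only the simple prefix rules (beginning of the sybyl atom type -> color), not the NotBondedTo or BondsAwayFrom rules, and it is an illustration rather than the hierarchy generator's actual code. The two rules are the ones mentioned in the notes above; the function name is hypothetical.<br />

```python
def assign_color(atom_type, prefix_rules, default="neutral"):
    """Return the color of the last prefix rule that matches the atom type."""
    color = default
    for prefix, rule_color in prefix_rules:
        if atom_type.startswith(prefix):
            color = rule_color  # keep overwriting so the last match wins
    return color


# Rules are listed in file order: the general N rule first, N.ar later.
rules = [("N", "positive"), ("N.ar", "acceptor")]

aromatic_n = assign_color("N.ar", rules)
charged_n = assign_color("N.4", rules)
carbon = assign_color("C.3", rules)
print(aromatic_n, charged_n, carbon)
```

N.ar matches both rules, and because the N.ar rule comes later it ends up as an acceptor, while an atom type that matches no rule falls back to the default color.<br />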
<br />
<br />
4. Hierarchy database format<br />
<br />
The hierarchy database format is based on the dock 3.5 database format. The first lines include a header (line #1) and describe the colors used in the database (line #2-8). For the purposes of this discussion, the lines have been numbered (#1, #2, etc.) and the hierarchy has been indented to help distinguish the levels.<br />
<br />
<br />
<br />
Family header: Line #9 is designed for future use. The word Family is followed by a chemical family number. This is hard coded to 1 (first number). The second number is the number of molecules in this family (matches the number of occurrences of line #10), the third is the number of branches that get attached to each molecule. The last number is the number of branches in the family. Line #10 has the first 50 characters of the molecule name and the last letter and number of the identification code (e.g., MFCD, SPEC, ZINC). After a space there can be up to 10 branches listed (10i2). The rigid fragment is considered branch one so the example molecule has branches two and three. By listing which branches make up which molecules, I can later rebuild each molecule and recombine side chains. Any branch in the first position can be combined with any branch in the second position, etc.<br />
<br />
<br />
<br />
Branches: Lines #11, #36, and #44 are branch identifiers. The first number in the first line of a branch lists how many coordinates are listed for the branch. The second number is the number of atoms in the branch (single conformer). This is followed by the number of heavy atoms and number of hydrogens in the branch. Next is the sum of the polar solvation energy for the entire molecule (not just the branch). Next is the number one, or if the molecule lacks solvation energy, -3. A value of -3 causes dock to skip the molecule. Next is the apolar component of desolvation, again for the entire molecule. The number zero was to denote the number of explicit conformations, but is now an open field for future use. The last number of the branch identifier line is the number of conformations for that branch (including recombination within the branch). Multiplying all of these conformation counts together for a given molecule results in the number of conformations for the molecule.<br />
<br />
<br />
<br />
Atom information: The atom information is divided into two parts. First is all atom information except for coordinates, followed by multiple sets of coordinates. Both parts of the atom information start with the hierarchy level.<br />
<br />
* Hierarchy level: The rigid fragment is 9. All groups attached to the rigid fragment are numbered in the 10's; groups attached to those are numbered in the 20's. The tens place (and, if need be, the hundreds place) denotes distance from the rigid fragment. The ones place differentiates independent groups within the branch. Branches from the rigid fragment can be numbered either 19, 18, etc., or all 19. Originally each needed to be different, but as the code evolved this was no longer required.<br />
* Atom information: This block of information (line #12-23, #37, and #45-52) contains information we consider to be the same for all conformations. It also describes all of the parts (number of atoms each group) required to complete the branch. After the hierarchy information, the van der Waals type as described in the Dock 3.5 manual is listed. Next the partial atomic charge multiplied by 10000. This is historical; the multiplication should probably be removed at some point. At the end of the charge is a column that is set to zero. This value is the flag (see dock 2.5). Currently the hierarchy generator and dock do not use flagging. Next is the color, corresponding to the color table at the top of the database file. The last two columns are the polar and apolar partial atomic desolvation values.<br />
* Atom coordinates: The first number is the hierarchy level followed by the xyz coordinates for the atom (multiplied by 1000). Lines #38-43 list six different sets of coordinates for the hydrogen described in line #37. Lines #53-55 are the one set of coordinates for the two carbons and hydrogen in group 18. For each position of group 18 (lines #53-55, #69-71, #85-87, and #101-103) there are three positions for group 29 (Cl and 2 H) and two positions for group 28, the carboxylic acid. Since groups 29 and 28 can move independently, they have different group numbers. For each of the four positions for group 18 there are six downstream combinations of 29 and 28. This leads to the reported 24 positions for the branch. Any of these positions can be combined with any of the six torsions of the hydrogen, leading to 144 conformations for the molecule. Note that the number of different sets of coordinates is not encoded anywhere; you have to count the lines. When there is one atom and then 6 different sets of coordinates (like lines #38-43), you know there are 6 conformations for that atom. When there are 3 atoms and 9 lines of coordinates (like lines #56-64), you know there are 3 conformations for that set of atoms. Note also that no information about the tree is explicitly encoded: if you are reading from group 49, for instance, and then read a line of group 29, you have to move back up the tree of conformations to that level. The output order of the tree of conformations is 'infix'.<br />
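<br />
The conformation counting described above can be checked with a small sketch. The numbers come from the example molecule (4 positions of group 18, 3 of group 29, 2 of group 28, 6 hydrogen torsions, and 1 conformation for the rigid branch); the function name is hypothetical and the code is not part of the hierarchy generator.<br />

```python
from math import prod


def branch_conformations(parent_positions, child_positions):
    """Positions of a group times all combinations of its independent children."""
    return parent_positions * prod(child_positions)


# Group 18 has 4 positions; groups 29 and 28 move independently below it.
flexible_branch = branch_conformations(4, [3, 2])
hydrogen_branch = 6  # six torsions of the rotated hydrogen
rigid_branch = 1     # the rigid fragment has a single conformation

molecule_conformations = prod([rigid_branch, hydrogen_branch, flexible_branch])
print(flexible_branch, molecule_conformations)
```

The two results match the last numbers of branch lines #44 and #36 multiplied together: 24 for the flexible branch and 144 for the whole molecule.<br />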
<br />
<br />
5. Molecule summary:<br />
<br />
D MFCD rigid flex I_ats I_confs O_ats O_confs O_hconfs<br />
<br />
D00000000 12 9 120 12 82 24 144<br />
<br />
MFCD: Id number from the mol2 file (format MFCD12345678 → D12345678)<br />
<br />
Rigid: Number of atoms in the same position in all conformations (12)<br />
<br />
Flex: Total number of atoms in the molecule minus the rigid atoms (9)<br />
<br />
I_ats: Input atoms -- the number of atoms required to represent the molecule in ensemble format. Rigid atoms plus the number of conformations times number of flexible atoms (120)<br />
<br />
I_confs: Number of conformations read in for the molecule (12)<br />
<br />
O_ats: Number of atoms written to the hierarchy -- usually less than I_ats unless lots of added hydrogen coordinates. (82)<br />
<br />
O_confs: Number of conformations, including recombination. This number does not include hydrogen conformations added by the hierarchy generator. This number will frequently be larger than I_confs (recombination), but can also be less than I_confs if the input conformations are degenerate. (24)<br />
<br />
O_hconfs: Number of conformations including hydrogens rotated by the hierarchy generator. This is the number of conformations dock will see. (144)<br />
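<br />
As a sketch, the summary line can be split into named fields and the I_ats definition checked against it. The field names mirror the column header above; the parser itself is an assumption and presumes whitespace-separated columns in that order.<br />

```python
from collections import namedtuple

Summary = namedtuple(
    "Summary", "mfcd rigid flex i_ats i_confs o_ats o_confs o_hconfs"
)


def parse_summary(line):
    """Split one molecule summary line into an id and seven integer counts."""
    fields = line.split()
    return Summary(fields[0], *(int(value) for value in fields[1:8]))


row = parse_summary("D00000000 12 9 120 12 82 24 144")

# I_ats is defined as rigid atoms plus conformations times flexible atoms.
expected_input_atoms = row.rigid + row.i_confs * row.flex
print(row.i_ats, expected_input_atoms)
```

For the example molecule both numbers are 120 (12 + 12 * 9), which agrees with the I_ats column.<br />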
<br />
<br />
<br />
Error messages<br />
<br />
no common atoms D00000000<br />
<br />
For the specified molecule id there were no atoms with common coordinates. Nothing is written to the database file<br />
<br />
Macrocycle error on D0000000<br />
<br />
The recursion routine used in generating the hierarchy has problems with macrocycles that are not the rigid fragment. Nothing is written to the database file.<br />
<br />
<br />
Here is the example file for the molecule shown above. The line numbers are shown at left, the column spacing is not accurate due to formatting changes.<br />
<code><pre> <br />
# 1 DOCK 5.1 ligand_atoms<br />
# 2 positive (1)<br />
# 3 negative (2)<br />
# 4 acceptor (3)<br />
# 5 donor (4)<br />
# 6 ester_o (5)<br />
# 7 amide_o (6)<br />
# 8 neutral (7)<br />
# 9 Family 1 1 2 2<br />
# 10 Molecule_to_describe_the_hierarchy_format D00000000 2 3<br />
# 11 12 12 8 4 -49.0700 1 -1.59 0 1<br />
# 12 9 1 1380 7 1.570 0.160<br />
# 13 9 1 -2410 7 -1.530 0.220<br />
# 14 9 1 -970 7 -0.510 0.230<br />
# 15 9 1 -1370 7 -2.080 0.260<br />
# 16 9 1 -380 7 -1.610 -0.070<br />
# 17 9 1 -1350 7 -4.130 0.220<br />
# 18 9 5 -560 7 -0.640 0.010<br />
# 19 912 -4710 4 -6.840 -1.350<br />
# 20 9 7 1190 7 1.100 -0.030<br />
# 21 9 7 1190 7 1.210 -0.030<br />
# 22 9 7 1180 7 1.790 -0.030<br />
# 23 9 7 1270 7 1.260 -0.030<br />
# 24 9 -1204 1354 0<br />
# 25 9 0 659 0<br />
# 26 9 0 -732 0<br />
# 27 9 -1204 -1427 0<br />
# 28 9 -2408 -732 0<br />
# 29 9 -2408 658 0<br />
# 30 9 -3866 -1528 -49<br />
# 31 9 -1237 2718 6<br />
# 32 9 919 1190 0<br />
# 33 9 919 -1263 0<br />
# 34 9 -1204 -2489 0<br />
# 35 9 -3327 1189 0<br />
# 36 6 1 0 1 -49.0700 1 -1.59 0 6<br />
# 37 19 6 3880 7 2.390 -0.620<br />
# 38 19 -328 3071 -166<br />
# 39 19 -937 3059 -874<br />
# 40 19 -1854 3036 -699<br />
# 41 19 -2161 3025 181<br />
# 42 19 -1552 3036 889<br />
# 43 19 -635 3059 714<br />
# 44 64 8 5 3 -49.0700 1 -1.59 0 24<br />
# 45 18 5 -100 7 0.290 0.460<br />
# 46 18 1 4520 7 14.160 0.650<br />
# 47 18 7 820 7 2.350 -0.010<br />
# 48 2916 -2080 7 -1.680 -0.080<br />
# 49 29 7 940 7 2.110 -0.020<br />
# 50 29 7 1210 7 1.780 -0.010<br />
# 51 2811 -6070 2 -26.310 0.240<br />
# 52 2811 -7570 2 -33.750 -1.760<br />
# 53 18 -3882 -3141 -97<br />
# 54 18 -4926 -1167 1143<br />
# 55 18 -4160 -1125 -1029<br />
# 56 29 -3131 -4036 1247<br />
# 57 29 -4935 -3454 -148<br />
# 58 29 -3261 -3392 -970<br />
# 59 29 -3084 -3955 -1465<br />
# 60 29 -3396 -3499 823<br />
# 61 29 -4946 -3397 -213<br />
# 62 29 -5457 -3970 -149<br />
# 63 29 -3325 -3445 -995<br />
# 64 29 -3448 -3445 867<br />
# 65 28 -5017 -86 1728<br />
# 66 28 -5869 -2094 1551<br />
# 67 28 -5397 -53 1380<br />
# 68 28 -5422 -2160 1970<br />
# 69 18 -5233 -672 -47<br />
# 70 18 -4072 -2582 -1283<br />
# 71 18 -3736 -2021 926<br />
# 72 29 -5511 482 -1375<br />
# 73 29 -6066 -1390 -59<br />
# 74 29 -5154 -39 849<br />
# 75 29 -5558 401 1337<br />
# 76 29 -5225 -44 -950<br />
# 77 29 -6027 -1431 7<br />
# 78 29 -6782 -1549 -100<br />
# 79 29 -5234 -64 870<br />
# 80 29 -5205 -115 -995<br />
# 81 28 -3182 -3226 -1841<br />
# 82 28 -5339 -2863 -1765<br />
# 83 28 -3377 -3574 -1512<br />
# 84 28 -5132 -2438 -2162<br />
# 85 18 -4582 -1948 1335<br />
# 86 18 -5060 -798 -895<br />
# 87 18 -3466 -2424 -546<br />
# 88 29 -5016 -663 2488<br />
# 89 29 -5510 -2477 1073<br />
# 90 29 -3821 -2541 1864<br />
# 91 29 -3714 -3043 2439<br />
# 92 29 -4786 -1021 1890<br />
# 93 29 -5463 -2526 1019<br />
# 94 29 -6143 -2801 1255<br />
# 95 29 -3882 -2601 1878<br />
# 96 29 -4821 -989 1817<br />
# 97 28 -4908 -64 -1874<br />
# 98 28 -6387 -1016 -571<br />
# 99 28 -5028 -515 -2095<br />
#100 28 -6262 -483 -287<br />
#101 18 -4533 -1865 -1479<br />
#102 18 -3939 -2951 755<br />
#103 18 -4429 -722 443<br />
#104 29 -3627 -2892 -2616<br />
#105 29 -5492 -2366 -1280<br />
#106 29 -4594 -890 -1985<br />
#107 29 -4929 -512 -2567<br />
#108 29 -3835 -2522 -2017<br />
#109 29 -5510 -2302 -1225<br />
#110 29 -6096 -2719 -1504<br />
#111 29 -4677 -908 -2003<br />
#112 29 -3831 -2571 -1946<br />
#113 28 -3290 -3248 1761<br />
#114 28 -4821 -3940 357<br />
#115 28 -3745 -3111 1962<br />
#116 28 -4291 -4115 96<br />
</pre></code><br />
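<br />
The coordinate lines in the example can be decoded with a short sketch like the one below. It assumes the fields are separated by whitespace (which real files may not guarantee, since the column spacing shown here is not accurate) and uses the factor of 1000 and the tens-place rule described in the atom information section. The function names are hypothetical.<br />

```python
COORD_SCALE = 1000.0  # coordinates are stored multiplied by 1000


def decode_coordinate_line(text):
    """Return (hierarchy level, (x, y, z)) for one coordinate line."""
    level, x, y, z = (int(field) for field in text.split())
    return level, (x / COORD_SCALE, y / COORD_SCALE, z / COORD_SCALE)


def distance_from_rigid(level):
    """Tens place of the hierarchy level: 9 -> 0, 19 -> 1, 29 -> 2."""
    return level // 10


# Line #24 of the example, without the leading line-number column.
level, xyz = decode_coordinate_line("9 -1204 1354 0")
print(level, xyz, distance_from_rigid(level))
```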
<br />
[[Category:Articles needing style editing]]<br />
[[Category:Formats]]</div>Rgchttp://wiki.docking.org/index.php?title=Mol2db_Format_2&diff=3783Mol2db Format 22012-02-14T18:51:23Z<p>Rgc: Mol2db Format 2 moved to Mol2db2 Format 2</p>
<hr />
<div>#REDIRECT [[Mol2db2 Format 2]]</div>Rgchttp://wiki.docking.org/index.php?title=Mol2db2_Format_2&diff=3778Mol2db2 Format 22012-02-14T18:51:23Z<p>Rgc: Mol2db Format 2 moved to Mol2db2 Format 2</p>
<hr />
<div>This page is a wishlist for features that would be nice for a new version of the flexibase file format to support. Features already implemented in the mol2db2 format are marked [x].<br />
<br />
*Real Atom Types and Bond Information [x]<br />
*Way to determine which mix-and-match conformations have clashes (and avoid trying them) [x]<br />
*A place to store an internal energy for each possible conformation [x]<br />
*Terminal hydrogen rotations?? [x]<br />
*Per-conformation per-atom partial charge & solvation information to support internal energies<br />
*Aliphatic ring movements?<br />
*support for clusters of conformations [x]<br />
*group tagging (needed for covalent docking) and basic set of covalent groups<br />
*specified rigid component override (and better rules for finding non-ring rigid components)<br />
*per molecule pKa<br />
*arbitrary information to be written into output mol2 file (5th and above M lines) [x]<br />
<br />
The following is the current plan for the file format:<br />
*T type information (implicitly assumed)<br />
*M molecule (4 lines req'd, after that they are optional, 24 lines max)<br />
*A atoms<br />
*B bond<br />
*X xyz <br />
*R rigid xyz for matching (can actually be any xyzs) <br />
*C conformation<br />
*S sets<br />
*D clusters<br />
*E end of molecule<br />
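Because every line starts with a one-letter record type and E closes a molecule, a reader can group lines per molecule before interpreting any columns. The following is a minimal Python sketch of that idea; the function name split_molecules and the dictionary layout are illustrative assumptions, not part of any existing DOCK code.<br />

```python
from collections import defaultdict


def split_molecules(lines):
    """Yield one dict per molecule, mapping record letter -> list of lines.

    Hypothetical helper: relies only on the leading record letter and on
    'E' marking the end of a molecule, as listed above.
    """
    records = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        letter = line[0]
        if letter == "E":
            yield dict(records)
            records = defaultdict(list)
        else:
            records[letter].append(line)
```

Each yielded dictionary can then be passed to per-record parsers for the M, A, B, X, R, C, S and D lines.<br />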
<br />
T ## namexxxx (implicitly assumed to be the standard 7)<br />
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters<br />
M charge polar_solv apolar_solv total_solv surface_area<br />
M smiles<br />
M longname<br />
[M arbitrary information preserved for writing out]<br />
A stuff about each atom, 1 per line <br />
B stuff about each bond, 1 per line<br />
X coordnum atomnum confnum x y z <br />
R rigidnum color x y z<br />
C confnum coordstart coordend<br />
S setnum #lines #confs_total broken hydrogens omega_energy<br />
S setnum linenum #confs confs [until full column]<br />
D clusternum setstart setend matchstart matchend #additionalmatching<br />
D matchnum color x y z<br />
E <br />
<br />
Given the record descriptions above, the following shows the columns used by each record. Format statements for Python and Fortran follow below. If speed or size becomes an issue, this might be replaced with a binary file format.<br />
<br />
notes: 17 children groups/group per line in current scheme.<br />
9 children confs/group per line.<br />
9 children confs/conf per line.<br />
8 confs/set per line.<br />
groups/confs with no children are written out.<br />
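As a rough illustration of the "8 confs/set per line" rule, the sketch below splits the conformations of one set across S lines, using the S-line layouts from the Python format statements on this page. The function name format_set is hypothetical, and two details are assumptions rather than something stated here: that #lines counts only the continuation lines, and that linenum starts at 1.<br />

```python
# Layout of the first S line of a set, copied from the Python format
# statements on this page.
S_HEADER_FMT = "S %6d %6d %3d %1d %1d %+11.3f\n"
CONFS_PER_LINE = 8  # "8 confs/set per line"


def format_set(setnum, confs, broken, hydrogens, energy):
    """Return the S lines for one set as a list of strings.

    Assumptions (not confirmed by the page): #lines counts only the
    continuation lines, and linenum is 1-based.
    """
    chunks = [confs[i:i + CONFS_PER_LINE]
              for i in range(0, len(confs), CONFS_PER_LINE)]
    lines = [S_HEADER_FMT % (setnum, len(chunks), len(confs),
                             broken, hydrogens, energy)]
    for lineno, chunk in enumerate(chunks, start=1):
        prefix = "S %6d %6d %1d" % (setnum, lineno, len(chunk))
        body = "".join(" %6d" % conf for conf in chunk)
        lines.append(prefix + body + "\n")
    return lines
```

A set with 10 conformations would therefore produce one header line plus two continuation lines, the first holding 8 conformation numbers and the second holding 2.<br />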
<br />
On the atom line, dt is the dock type and co is the color.<br />
<br />
1 2 3 4 5 6 7<br />
01234567890123456789012345678901234567890123456789012345678901234567890123456789<br />
T ## typename<br />
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU<br />
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]<br />
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
B NUM ATO ATO TY<br />
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
C CONFNO COORDSTAR COORDENDX<br />
S SETIDX #LINES #CO C H +ENERGY.XXX<br />
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS<br />
D CLUSID STASET ENDSET MST MEN ADD<br />
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
E<br />
<br />
The following type lines are assumed by DOCK unless overridden:<br />
T 1 positive<br />
T 2 negative<br />
T 3 acceptor<br />
T 4 donor<br />
T 5 ester_o<br />
T 6 amide_o<br />
T 7 neutral<br />
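One way to model "assumed unless overridden" is to start from the seven default types and let any T lines in the file replace entries. This is only a sketch in Python; the names DEFAULT_TYPES and read_types are made up for illustration and do not come from the DOCK source.<br />

```python
# The seven default color types listed above.
DEFAULT_TYPES = {
    1: "positive",
    2: "negative",
    3: "acceptor",
    4: "donor",
    5: "ester_o",
    6: "amide_o",
    7: "neutral",
}


def read_types(lines):
    """Return the type table, letting any T lines override the defaults."""
    types = dict(DEFAULT_TYPES)
    for line in lines:
        if line.startswith("T "):
            fields = line.split()
            types[int(fields[1])] = fields[2]
    return types
```

A file with no T lines yields the default table unchanged.<br />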
<br />
The following are the Python format statements for each line:<br />
T %2d %8s\n<br />
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n<br />
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
M %77s\n<br />
M %77s\n<br />
M %77s\n<br />
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
B %3d %3d %3d %-2s\n<br />
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n<br />
R %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
C %6d %9d %9d\n<br />
S %6d %6d %3d %1d %1d %+11.3f\n<br />
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n <br />
D %6d %6d %6d %3d %3d %3d\n<br />
D %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
E\n<br />
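To show how these format statements would be used, here is a small Python sketch that writes X and C lines. The format strings are copied from the list above; the function names format_coord and format_conf are hypothetical.<br />

```python
# Format strings for the X (coordinate) and C (conformation) lines,
# copied from the Python format statements above.
X_FMT = "X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n"
C_FMT = "C %6d %9d %9d\n"


def format_coord(coordnum, atomnum, confnum, x, y, z):
    """Return one fixed-width X line."""
    return X_FMT % (coordnum, atomnum, confnum, x, y, z)


def format_conf(confnum, coordstart, coordend):
    """Return one fixed-width C line."""
    return C_FMT % (confnum, coordstart, coordend)
```

Because every field has a fixed width, the resulting lines line up with the column ruler shown earlier on this page, as long as the values fit in their fields.<br />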
<br />
The following are the fortran77 format statements<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
1000 format(2x,i2,1x,a8)<br />
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters<br />
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)<br />
!M smiles or longname<br />
2200 format(2x,a77)<br />
!A stuff about each atom, 1 per line<br />
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,<br />
& f10.3,1x,f10.3,1x,f9.3)<br />
!B stuff about each bond, 1 per line<br />
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)<br />
!X atomnum confnum x y z<br />
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)<br />
!R rigidnum color x y z<br />
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)<br />
!C confnum #startcoord #endcoord<br />
7000 format(2x,i6,1x,i9,1x,i9)<br />
!S setnum #lines #confs_total broken hydrogens omega_energy<br />
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)<br />
!S setnum linenum #confs confs [until full column]<br />
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,<br />
& 1x,i6,1x,i6,1x,i6,1x,i6)<br />
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN<br />
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)<br />
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
!re-use 6000<br />
!E<br />
!E does not get a format line<br />
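The same field widths can be used to read a line back. As an illustration only, the Python sketch below parses an A line by slicing columns according to the widths in format 3000 (a 2-column record tag, then fields separated by one skipped column). The field names in ATOM_FIELDS and the function name parse_atom_line are assumptions chosen for readability.<br />

```python
# (name, width, converter) for each field of the A line, following the
# widths in format 3000: i3, a4, a5, i2, i2, f9.4, f10.3, f10.3, f10.3, f9.3.
ATOM_FIELDS = [
    ("num", 3, int),
    ("name", 4, str.strip),
    ("type", 5, str.strip),
    ("dock_type", 2, int),
    ("color", 2, int),
    ("charge", 9, float),
    ("polar_solv", 10, float),
    ("apolar_solv", 10, float),
    ("total_solv", 10, float),
    ("surface_area", 9, float),
]


def parse_atom_line(line):
    """Parse one fixed-width A line into a dict (hypothetical helper)."""
    pos = 2  # skip the record letter and the following space (2x)
    atom = {}
    for name, width, convert in ATOM_FIELDS:
        atom[name] = convert(line[pos:pos + width])
        pos += width + 1  # field width plus the 1x separator
    return atom
```

Slicing by column rather than splitting on whitespace keeps the parser correct even when a text field such as the atom name is blank.<br />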
<br />
The following are Fortran95 format statements:<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000<br />
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters<br />
character (len=*), parameter :: DB2M1 =<br />
& '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
character (len=*), parameter :: DB2M2 =<br />
& '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100<br />
!M smiles/longname/arbitrary<br />
character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200<br />
!A stuff about each atom, 1 per line<br />
character (len=*), parameter :: DB2ATOM =<br />
& '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x,<br />
& f10.3,x,f10.3,x,f9.3)' !3000<br />
!B stuff about each bond, 1 per line<br />
character (len=*), parameter :: DB2BOND =<br />
& '(2x,i3,x,i3,x,i3,x,a2)' !4000<br />
!X coordnumx atomnum confnum x y z<br />
character (len=*), parameter :: DB2COORD =<br />
& '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000<br />
!R rigidnum color x y z<br />
character (len=*), parameter :: DB2RIGID =<br />
& '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000<br />
!C confnum coordstart coordend<br />
character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000<br />
!S setnum #lines #confs_total broken hydrogens omega_energy <br />
character (len=*), parameter :: DB2SET1 =<br />
& '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000<br />
!S setnum linenum #confs confs [until full column]<br />
character (len=*), parameter :: DB2SET2 =<br />
& '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6,<br />
& 1x,i6,x,i6,x,i6,x,i6)' !8100<br />
!D CLUSID STASET ENDSET ADD(itional matching spheres count) MST(art) MEN(d)<br />
character (len=*), parameter :: DB2CLUSTER =<br />
& '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000<br />
!D NUM CO x y z<br />
!reuse DB2RIGID<br />
!E<br />
!E does not get a format line <br />
<br />
[[Category:Wishlists]]</div>Rgchttp://wiki.docking.org/index.php?title=Qnifft_DOCK_3.6_conversion&diff=4211Qnifft DOCK 3.6 conversion2012-02-13T21:08:52Z<p>Rgc: </p>
<hr />
<div>Qnifft is a new option for use instead of DelPhi with DOCK 3.6. It is a Poisson-Boltzmann solver from Kim Sharp [http://crystal.med.upenn.edu/software.html]. It has been integrated into the [[DOCK Blaster]] and [[DOCK 3.6]] toolchain. For now, if you make use of it, please cite:<br />
<br />
Sharp, K. A. 1995. Polyelectrolyte electrostatics: Salt dependence, entropic and enthalpic contributions to free energy in the nonlinear Poisson-Boltzmann model. Biopolymers 36:227-243. [http://dx.doi.org/10.1002/bip.360360210 10.1002/bip.360360210]<br />
<br />
and <br />
<br />
Gallagher, K., and K. A. Sharp. 1998. Electrostatic Contributions to Heat Capacity Changes of DNA-Ligand Binding. Biophys. J. 75:769-776. [http://dx.doi.org/10.1016/S0006-3495(98)77566-6 10.1016/S0006-3495(98)77566-6]<br />
<br />
<br />
== Using the new code ==<br />
<br />
A compiled qnifft binary is in $DOCK_BASE/bin/Linux/qnifft22_193_pgf_32<br />
<br />
Running qnifft requires setting your $DELDIR environment variable to $DOCK_BASE/src/qnifft<br />
<br />
The default way to run qnifft is to copy the qnifft.parm file from $DELDIR and run it by calling<br />
qnifft qnifft.parm<br />
<br />
If you're using DOCK Blaster, you can make the new electrostatic grids by typing:<br />
<br />
make grids/rec+sph.qnifft.phi<br />
<br />
or if you want to use the full DOCK Blaster toolchain you can type<br />
<br />
make autonew<br />
<br />
Once you have the new phimap, edit your INDOCK to point to it instead of the old phimap (usually rec+sph.phi). You also have to use the new DOCK executable located in $DOCK_BASE/bin/Linux/dock.csh. The best way to do this is to use the following command instead of $mud/submit.csh:<br />
<br />
$mud/subdock.csh $DOCK_BASE/bin/Linux/dock.csh<br />
<br />
This should produce compatible OUTDOCK & test.eel1.gz files.<br />
<br />
== Recompiling DOCK 3.6 to use the new Qnifft grids ==<br />
<br />
If you're using a different version of DOCK 3.6 and want to change it to be compatible with Qnifft-produced grids, you only have to change one line. In max.h change <br />
<br />
parameter (nsize=179)<br />
<br />
to <br />
<br />
parameter (nsize=193)<br />
<br />
Then you have to run<br />
<br />
cd i386 ; make clean ; make ; make SIZE=32<br />
<br />
This produces new binaries for use with these grids.<br />
<br />
== INDOCK file ==<br />
<br />
delphi_file ../../grids/rec+sph.qnifft.phi<br />
delphi_nsize 193<br />
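If many INDOCK files need this change, the edit can be scripted. The Python sketch below replaces or appends keyword lines in INDOCK text; it assumes each INDOCK line has the form "keyword value" separated by whitespace, and the function name set_indock_params and the 30-character keyword column are illustrative choices, not something DOCK is known to require.<br />

```python
def set_indock_params(text, params):
    """Return INDOCK text with the given keywords set to new values.

    Existing lines whose first word matches a keyword are rewritten;
    keywords that are not present are appended at the end.
    Assumes a simple whitespace-separated "keyword value" layout.
    """
    remaining = dict(params)
    out = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] in remaining:
            out.append("%-30s%s" % (fields[0], remaining.pop(fields[0])))
        else:
            out.append(line)
    for key, value in remaining.items():
        out.append("%-30s%s" % (key, value))
    return "\n".join(out) + "\n"
```

For the change described here, the call would pass delphi_file set to ../../grids/rec+sph.qnifft.phi and delphi_nsize set to 193.<br />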
<br />
== Backwards compatibility with the old grids ==<br />
<br />
Edit your INDOCK file, add a delphi_nsize parameter and set it to 179. This allows use of old DelPhi grids instead of the 193-sized Qnifft grids.<br />
<br />
[[Category:DOCK]]</div>Rgchttp://wiki.docking.org/index.php?title=Enm_explorer&diff=1488Enm explorer2012-02-13T21:07:31Z<p>Rgc: link to paper</p>
<hr />
<div>==Overview==<br />
<br />
3KENM was developed by Qingyi Yang. Please use this reference:<br />
<br />
Yang, Q., Sharp, K. A. 2008. Building alternate protein structures using the elastic network model. Proteins, 74:682-700. [http://onlinelibrary.wiley.com/doi/10.1002/prot.22184/full]<br />
<br />
The idea is to deform a protein structure along its normal modes while preserving physically reliable bonds and angles, as well as preserving secondary structure. <br />
<br />
==Usage==<br />
<br />
At the moment only mode "1" has been tested. All other modes of running the code are likely broken.<br />
Input should be a pdb '''with backbone atoms only'''.<br />
<br />
Run:<br />
<br />
~dahlia/software/3kenm/enm_explorer_drw/enm_explorer 1<br />
<br />
Enter .pdb file to deform:<br />
myfile_backbone.pdb<br />
<br />
Enter maximum rms deviation cutoff:<br />
1.0<br />
<br />
Enter maximum change in bond length (recommended 0.1-0.4):<br />
0.4<br />
<br />
<br />
<br />
WARNING: '''The rms cutoff must be a float!'''<br />
<br />
The larger the protein is, the more your individual bond lengths will be distorted. You will have to play with this parameter. If the output is only '''1 frame''', you need to increase this number. <br />
<br />
The output is a crd file. You will now need to convert to a series of PDB files (credit Ryan Coleman). <br />
<br />
Use: <br />
~rgc/Source/sharp_src/pdbcrd2pdbs.py pdbName crdName outputPdbPrefix</div>Rgchttp://wiki.docking.org/index.php?title=Db2multipdb.py&diff=1303Db2multipdb.py2012-02-08T16:56:48Z<p>Rgc: </p>
<hr />
<div>db2multipdb.py is a small python script used to decode [[Flexibase Format|Flexibase]] .db files to multipdb files (that can be read by any viewer) and do some simple checking on the .db file.<br />
<br />
Usage: db2multipdb.py [options] file.db [more db files]<br />
<br />
Convert .db files to multiple pdb files, check for errors<br />
<br />
Options:<br />
-h, --help show this help message and exit<br />
-v, --verbose lots of debugging output<br />
-n, --nopdb don't write pdb files, just do broken checking<br />
<br />
The script is located at <br />
$DOCK_BASE/scripts/db2multipdb.py<br />
or alternatively<br />
~rgc/Source/bks_src/db2multipdb.py<br />
If you don't have python2.6 in your path you'll have to put it there or run the program like this<br />
/usr/arch/bin/python2.6 ~rgc/Source/bks_src/db2multipdb.py<br />
<br />
Verbose output (-v flag) is not typically needed but available.<br />
Not writing pdb files (-n flag) is a useful option if you don't need the pdb files and just want to do the broken checking. Each separate .db entry generates the following output to stdout:<br />
<br />
P00000008 being processed now 1<br />
P00000008 1 errors of each type: 542 0 0 0 no errors: 817 total models 1359<br />
roughly interpreted as:<br />
zincid^^^ #times zincid seen^^^^ a^^ b c d ^^^^^^^^^^ #without errors #total<br />
<br />
where a,b,c,d type errors are defined as <br />
<br />
a is atoms closer than 0.95 angstroms<br />
b is oxygen atoms closer than 2.0 angstroms<br />
c is heavy atoms closer than 1.07 angstroms<br />
d is no other atoms within 2.2 angstroms<br />
<br />
Note that for a given conformation, only one error of any type is reported. Type d (critical) errors take precedence over type a, b, and c errors: if there is a type d error, it is the one reported. The number of errors of each type plus the number of models without errors always equals the total number of models.<br />
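The summary line can be checked mechanically. The following Python sketch parses a line in the format shown above and verifies that the four error counts plus the error-free count equal the total; the names parse_summary and counts_add_up are hypothetical and are not part of db2multipdb.py.<br />

```python
import re

# Matches the per-entry summary line printed to stdout, e.g.
# "P00000008 1 errors of each type: 542 0 0 0 no errors: 817 total models 1359"
SUMMARY_RE = re.compile(
    r"^(\S+)\s+(\d+)\s+errors of each type:\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+no errors:\s+(\d+)\s+total models\s+(\d+)"
)


def parse_summary(line):
    """Return a dict for a summary line, or None for any other line."""
    match = SUMMARY_RE.match(line)
    if match is None:
        return None
    seen, a, b, c, d, clean, total = (int(g) for g in match.groups()[1:])
    return {
        "zincid": match.group(1),
        "seen": seen,
        "errors": {"a": a, "b": b, "c": c, "d": d},
        "no_errors": clean,
        "total": total,
    }


def counts_add_up(summary):
    """Check that error counts plus error-free models equal the total."""
    return sum(summary["errors"].values()) + summary["no_errors"] == summary["total"]
```

For the sample line above, 542 + 0 + 0 + 0 + 817 = 1359, so the check passes.<br />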
<br />
Exactly which atoms have these errors can be seen with the -v option. Errors of the first 3 types are expected due to the 'mix-and-match' conformations generated by separate flexible branches being recombined and overlapping. Errors of type d should <i>not</i> occur but have been known to occur previously.<br />
<br />
If pdb output is not suppressed, files will be written with names like P00000008.001.pdb, where the first 9 characters are the ZINCID read from the db file, followed by a unique counter (since a .db file can contain multiple .db entries for one ZINCID). Each pdb file is a normal pdb file, with each MODEL as one unique conformation produced. These files can be quite large, and writing them to disk takes much longer than anything else the code does. If you load one of these pdb files in PyMOL, for instance, and hit the 'play' button, it will go through the entire set. Other post-processing or conversion is also possible.<br />
<br />
Questions? contact [[User:Rgc|Ryan Coleman]]<br />
[[Category:Software]][[Category:Tutorials]]</div>Rgchttp://wiki.docking.org/index.php?title=Db2multipdb.py&diff=1302Db2multipdb.py2012-02-08T16:56:39Z<p>Rgc: </p>
<hr />
<div>db2multipdb.py is a small python script used to decode [[Flexibase Format|Flexibase]] .db files to multipdb files (that can be read by any viewer) and do some simple checking on the .db file.<br />
<br />
Usage: db2multipdb.py [options] file.db [more db files]<br />
<br />
Convert .db files to multiple pdb files, check for errors<br />
<br />
Options:<br />
-h, --help show this help message and exit<br />
-v, --verbose lots of debugging output<br />
-n, --nopdb don't write pdb files, just do broken checking<br />
<br />
The script is located at <br />
$DOCK_BASE/scripts/db2multipdb.py<br />
or alternatively<br />
~rgc/Source/bks_src/db2multipdb.py<br />
If you don't have python2.6 in your path you'll have to put it there or run the program like this<br />
/usr/arch/bin/python2.6 ~rgc/Source/bks_src/db2multipdb.py<br />
<br />
Verbose output (-v flag) is not typically needed but available.<br />
Not writing pdb files (-n flag) is a useful option if you don't need the pdb files and just want to do the broken checking. Each separate .db entry generates the following output to stdout:<br />
<br />
P00000008 being processed now 1<br />
P00000008 1 errors of each type: 542 0 0 0 no errors: 817 total models 1359<br />
roughly interpreted as:<br />
zincid^^^ #times zincid seen^^^^ a^^ b c d ^^^^^^^^^^ #without errors #total<br />
<br />
where a,b,c,d type errors are defined as <br />
<br />
a is atoms closer than 0.95 angstroms<br />
b is oxygen atoms closer than 2.0 angstroms<br />
c is heavy atoms closer than 1.07 angstroms<br />
d is no other atoms within 2.2 angstroms<br />
<br />
Note that for a given conformation, only one error of any type is reported. Type d (critical) errors take precedence over type a, b, and c errors: if there is a type d error, it is the one reported. The number of errors of each type plus the number of models without errors always equals the total number of models.<br />
<br />
Exactly which atoms have these errors can be seen with the -v option. Errors of the first 3 types are expected due to the 'mix-and-match' conformations generated by separate flexible branches being recombined and overlapping. Errors of type d should <i>not</i> occur but have been known to occur previously.<br />
<br />
If pdb output is not suppressed, files will be written with names like P00000008.001.pdb, where the first 9 characters are the ZINCID read from the db file, followed by a unique counter (since a .db file can contain multiple .db entries for one ZINCID). Each pdb file is a normal pdb file, with each MODEL as one unique conformation produced. These files can be quite large, and writing them to disk takes much longer than anything else the code does. If you load one of these pdb files in PyMOL, for instance, and hit the 'play' button, it will go through the entire set. Other post-processing or conversion is also possible.<br />
<br />
Questions? contact [[User:Rgc|Ryan Coleman]]<br />
[[Category:Software]][[Category:Tutorials]]</div>Rgchttp://wiki.docking.org/index.php?title=Multimol2db.py&diff=3802Multimol2db.py2012-02-08T01:00:23Z<p>Rgc: added some tips to get this to work from Nir</p>
<hr />
<div>'''multimol2db.py'''<br />
<br />
This script is a utility program that takes as input a .mol2 file that has already been protonated and had all of its conformations generated with OMEGA (or, alternatively, a mol2 file from some other source) and runs AMSOL & mol2db on it to make .db files for docking.<br />
<br />
multimol2db.py input.mol2<br />
<br />
It is very important that the beginning of your .mol2 file contains this kind of header:<br />
<br />
@<TRIPOS>MOLECULE<br />
TEMP12345678<br />
70 72 0 0 0<br />
SMALL<br />
NO_CHARGES<br />
<br />
@<TRIPOS>ATOM<br />
1 C1 5.1180 4.5740 2.9690 C.3 1 UNK1 0.0182<br />
2 N1 4.4470 5.0610 4.2130 N.4 1 UNK1 -0.5553<br />
<br />
Otherwise AMSOL and the associated scripts that run it will crash. The most important part is the second line, which needs to be in the form XXXX00000000.<br />
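A quick pre-flight check can catch badly named molecules before AMSOL is run. The Python sketch below scans a .mol2 file's lines and reports names that do not fit the pattern; it reads XXXX00000000 as "four letters followed by eight digits", which is an assumption about the intended form, and the name check_mol2_names is hypothetical.<br />

```python
import re

# Assumed reading of the XXXX00000000 form: four letters, then eight digits.
CODE_RE = re.compile(r"^[A-Za-z]{4}\d{8}$")


def check_mol2_names(lines):
    """Return the molecule names that do not match the expected form.

    The name is taken to be the line directly after each
    @<TRIPOS>MOLECULE record, as in the header shown above.
    """
    bad = []
    for i, line in enumerate(lines):
        if line.strip() == "@<TRIPOS>MOLECULE" and i + 1 < len(lines):
            name = lines[i + 1].strip()
            if not CODE_RE.match(name):
                bad.append(name)
    return bad
```

An empty result means every molecule name in the file fits the assumed pattern.<br />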
<br />
Other tips:<br />
<br />
1. Don't have any extra lines except the MOLECULE, ATOM and BOND records<br />
<br />
2. The last 3 columns of the atom record are important (the 1 UNK1 0.0182). Some .mol2 files don't have them, so just add dummy values (1 UNK1 0.0000) to yours.<br />
<br />
The file is in your dockenv/scripts or $DOCK_BASE/scripts/<br />
<br />
A version is also kept in ~/Source/bks_src/multimol2db.py<br />
<br />
[[User:Rgc]]</div>Rgchttp://wiki.docking.org/index.php?title=Dock_Ligand_Clustering&diff=1381Dock Ligand Clustering2012-02-06T23:34:16Z<p>Rgc: </p>
<hr />
<div>DOCK 3.6 Ligand Clustering<br />
<br />
* branch of [[DOCK_3.6]]<br />
* Contributed by Niu Huang's lab<br />
* Currently in the niu-nibs-clustering branch, ready for use<br />
* Replacement for single mode<br />
* Example INDOCK file parameters, add to your current INDOCK file<br />
<br />
#<br />
################################################################################<br />
# POSE_CLUSTERING<br />
#<br />
pose_clustering yes<br />
# "yes/no" General switch of pose clustering (default: no)<br />
pose_clustering_detail no<br />
# "yes/no" Switch of print out detail information (default: no)<br />
pose_clustering_method 0<br />
# 0---------Write all the poses<br />
# 1---------Clustering method based on KGS penalty function<br />
# 2---------Clustering method based on ratio of RANK and RMSD<br />
pose_clustering_cutoff 1.1<br />
# pre-determined variable for clustering cutoff<br />
# In method 0, the cutoff is unused.<br />
# In method 1, the cutoff is unused.<br />
# In method 2, the cutoff is Nc.<br />
################################################################################<br />
# cutoff in pose_clustering for saving resource<br />
#pose_clustering_RMSD_cutoff<br />
#pose_clustering_energy_cutoff no<br />
pose_clus_inputno_cutoff 0<br />
# number: input number cutoff; 0: no; -1: rotatable bond based (NA)<br />
# number of poses to be clustered according to energy rank<br />
# for method 0, it is also used.<br />
pose_clus_outputno_cutoff 0<br />
# number: output number cutoff; 0: no<br />
# number of clustered poses being written out according to energy rank<br />
# for method 0, it is also used.<br />
################################################################################<br />
<br />
Currently installed in<br />
<br />
~xyz/dockenv/bin/Linux/ligandclustering/<br />
<br />
Bonus: A version compiled with nsize=193 for use with Qnifft [[Qnifft_DOCK_3.6_conversion]] is in <br />
<br />
~xyz/dockenv/bin/Linux/ligandclustering-qnifft/<br />
<br />
Or where ~xyz/dockenv is your $DOCK_BASE<br />
<br />
[[Category:DOCK]]</div>Rgchttp://wiki.docking.org/index.php?title=Dock_Ligand_Clustering&diff=1380Dock Ligand Clustering2012-02-06T23:34:02Z<p>Rgc: </p>
<hr />
<div>DOCK 3.6 Ligand Clustering<br />
<br />
* branch of [[DOCK_3.6]]<br />
* Contributed by Niu Huang's lab<br />
* Currently in the niu-nibs-clustering branch, ready for use<br />
* Replacement for single mode<br />
* Example INDOCK file parameters, add to your current INDOCK file<br />
<br />
#<br />
################################################################################<br />
# POSE_CLUSTERING<br />
#<br />
pose_clustering yes<br />
# "yes/no" General switch of pose clustering (default: no)<br />
pose_clustering_detail no<br />
# "yes/no" Switch of print out detail information (default: no)<br />
pose_clustering_method 0<br />
# 0---------Write all the poses<br />
# 1---------Clustering method based on KGS penalty function<br />
# 2---------Clustering method based on ratio of RANK and RMSD<br />
pose_clustering_cutoff 1.1<br />
# pre-determined variable for clustering cutoff<br />
# In method 0, the cutoff is unused.<br />
# In method 1, the cutoff is unused.<br />
# In method 2, the cutoff is Nc.<br />
################################################################################<br />
# cutoff in pose_clustering for saving resource<br />
#pose_clustering_RMSD_cutoff<br />
#pose_clustering_energy_cutoff no<br />
pose_clus_inputno_cutoff 0<br />
# number: input number cutoff; 0: no; -1: rotatable bond based (NA)<br />
# number of poses to be clustered according to energy rank<br />
# for method 0, it is also used.<br />
pose_clus_outputno_cutoff 0<br />
# number: output number cutoff; 0: no<br />
# number of clustered poses being written out according to energy rank<br />
# for method 0, it is also used.<br />
################################################################################<br />
<br />
Currently installed in<br />
<br />
~xyz/dockenv/bin/Linux/ligandclustering/<br />
<br />
Bonus: A version compiled with nsize=193 for use with Qnifft [[http://wiki.bkslab.org/index.php/Qnifft_DOCK_3.6_conversion]] is in <br />
<br />
~xyz/dockenv/bin/Linux/ligandclustering-qnifft/<br />
<br />
Or where ~xyz/dockenv is your $DOCK_BASE<br />
<br />
[[Category:DOCK]]</div>Rgchttp://wiki.docking.org/index.php?title=Multimol2db.py&diff=3801Multimol2db.py2012-02-06T18:51:22Z<p>Rgc: explaining how henry broke everything</p>
<hr />
<div>'''multimol2db.py'''<br />
<br />
This script is a utility program that takes as input a .mol2 file that has already been protonated and had all of its conformations generated with OMEGA (or, alternatively, a mol2 file from some other source) and runs AMSOL & mol2db on it to make .db files for docking.<br />
<br />
multimol2db.py input.mol2<br />
<br />
It is very important that the beginning of your .mol2 file contains this kind of header:<br />
<br />
@<TRIPOS>MOLECULE<br />
TEMP12345678<br />
70 72 0 0 0<br />
SMALL<br />
NO_CHARGES<br />
<br />
@<TRIPOS>ATOM<br />
<br />
Otherwise AMSOL and the associated scripts that run it will crash. The most important part is the second line, which needs to be in the form XXXX00000000.<br />
<br />
The file is in your dockenv/scripts or $DOCK_BASE/scripts/<br />
<br />
A version is also kept in ~/Source/bks_src/multimol2db.py<br />
<br />
[[User:Rgc]]</div>Rgchttp://wiki.docking.org/index.php?title=Multimol2db.py&diff=3800Multimol2db.py2012-02-06T18:50:07Z<p>Rgc: </p>
<hr />
<div>'''multimol2db.py'''<br />
<br />
This script is a utility program that takes as input a .mol2 file that has already been protonated and had all of its conformations generated with OMEGA (or, alternatively, a mol2 file from some other source) and runs AMSOL & mol2db on it to make .db files for docking.<br />
<br />
multimol2db.py input.mol2<br />
<br />
It is very important that the beginning of your .mol2 file contains this kind of header:<br />
<br />
<br />
<br />
The file is in your dockenv/scripts or $DOCK_BASE/scripts/<br />
<br />
A version is also kept in ~/Source/bks_src/multimol2db.py<br />
<br />
[[User:Rgc]]</div>Rgchttp://wiki.docking.org/index.php?title=Multimol2db.py&diff=3799Multimol2db.py2012-02-03T22:27:58Z<p>Rgc: </p>
<hr />
<div>'''multimol2db.py'''<br />
<br />
This script is a utility program that takes as input a .mol2 file that has already been protonated and had all of its conformations generated with OMEGA (or, alternatively, a mol2 file from some other source) and runs AMSOL & mol2db on it to make .db files for docking.<br />
<br />
multimol2db.py input.mol2<br />
<br />
The file is in your dockenv/scripts or $DOCK_BASE/scripts/<br />
<br />
A version is also kept in ~/Source/bks_src/multimol2db.py<br />
<br />
[[User:Rgc]]</div>Rgchttp://wiki.docking.org/index.php?title=Multimol2db.py&diff=3798Multimol2db.py2012-02-03T22:26:39Z<p>Rgc: </p>
<hr />
<div>'''multimol2db.py'''<br />
<br />
This script runs</div>Rgchttp://wiki.docking.org/index.php?title=Dockenv_Scripts&diff=1414Dockenv Scripts2012-02-03T22:26:14Z<p>Rgc: /* Utilities */</p>
<hr />
<div>= AMSOL = <br />
<br />
= DOCK Blaster Pipeline = <br />
<br />
= DOCK Expert = <br />
* scoreopt2 - score molecule in xpdb against grids (DOCK3.5! mods?)<br />
<br />
= WINC Pipeline = <br />
<br />
= ZINC Pipeline = <br />
<br />
= ZINC Curation = <br />
<br />
= Conversion tools = <br />
* getsdf - extract by code a subset of sdf file<br />
* getxpdb - subselect molecules based on molecule code<br />
<br />
= Utilities = <br />
* abut - put two files next to each other (source?)<br />
* molcharge_pdb - net molecule charges from xpdb format (e.g. .eel1)<br />
* mkpdbfmt - make pdb from xpdb (.eel1) file for viewing by midas<br />
* mkxpdb - make xpdb from (.eel1?) format file (purpose?)<br />
* get - get a molecule from an sdf file, make mol2 using omega<br />
* molcharge - original DL version - net charge on mols in mol2 format file<br />
* molcharge2 - rounding to two decimals<br />
* molcharge3 - rounding, uses PREFIX env variable<br />
* pc2unix - remove trailing \r from windows (uses tab2space, ctrl_m)<br />
* rename.csh - rename files en masse - modifiable script<br />
* amb2xpdb - convert amber to xpdb format - used during dock prep / Makefile<br />
* sph2pdb - convert spheres to PDB format<br />
* [[splitdb.py]] - takes input .db or .db.gz files and splits them into smaller files respecting hierarchy boundaries<br />
* [[multimol2db.py]] - takes an input .mol2 file and runs amsol & mol2db on it, but not omega. The input must also be protonated correctly.<br />
<br />
= Historical interest only = <br />
* get-green-sph.csh - wbq script to extract atoms from midas<br />
<br />
<br />
cod2smi.pl<br />
extractcode.csh<br />
fprint<br />
fprint2<br />
fprint3<br />
dl_fp<br />
fprint_all.csh<br />
reprocess.csh<br />
<br />
legacy - inherited scripts that ought to be adapted, reformulated or abandoned<br />
awk-scripts - bits of code for testing the database / atomtyp.txt<br />
awk.ctof - reformat mol2 to pdb; superseded by convsyb<br />
awk.tosph - reformat PDB to sph; superseded by pdb2sph<br />
bestscores - extract best scores from OUTDOCK file<br />
charge.spl - Sybyl based method to add H and compute charges<br />
superseded by omega (and amsol)<br />
chargrank - analysis - analyse eel1 output and compute charges (correct?)<br />
countcharge - analysis - count charges (correct?) superceded by docksum<br />
clu2pdb - convert cluster(?) format to pdb (usage?)<br />
conf_gen.spl - old sybyl script to generate conformations. not used.<br />
conversmi.csh - convert smiles to SLN - old script<br />
count_res - count residues in a PDB file (not used)<br />
fitatom.com - A McLachlan program to manipulate protein PDB files<br />
fitatom.loop - variant on the above<br />
getsdf.csh - extract sdf file and use filter during atomtyp.txt testing<br />
h-bond - make hb.coord and hb.1 from test.eel1 for midas<br />
hierstat - dockable database statistics<br />
midas2pdb - convert midas output to normal PDB<br />
mkc1 - coordinate file reformatting script bks 1996<br />
mklvlfmt - reformat file. (purpose?)<br />
mol2sln.com - convert mol2 to sln strings in sybyl<br />
extractcode.csh - get code, energy from eel1 file<br />
various old versions of fmtSDF.pl<br />
various old versions of getnext<br />
<br />
devtools - developer tools, not used for docking. not distributed<br />
hier_check - basic paranoia check on integrity of dockable database<br />
run on STDOUT from mol2db<br />
num - postpend line number to each line<br />
make_surf - command line to make a surface. usage of dms<br />
extract_random.pl - pick molecules randomly from list<br />
getseq.pl - get sequence from PDB file<br />
lclint - source checker - Linux - 3rd party<br />
passwd.pl - hash-encrypted password creator<br />
snap.csh - take picture using indycam and put on web page<br />
splitdb.csh - script to split a database into chunks<br />
sumtime.pl - analyze output logs from database creation to give<br />
summary statistics for database production<br />
superdbmake.csh - script to subselect database from dockable database<br />
s_superdbmake.csh - database subselection script<br />
three21.pl - convert 3 letter codes to 1 letter aa codes for seq<br />
alignment<br />
<br />
<br />
scripts - non-executable files<br />
colortest.txt<br />
midock - used by midas to set up colors<br />
tab2space and ctrl_m should go here<br />
Makefile - site preparation (original by DL)<br />
add_h.spl - used by Makefile to add H to amber-format PDB<br />
dock.con - condor script <br />
filt.params - STDIN to filt.exe to find residues for input to ms/dms<br />
grids - templates for grid directory. instantiated by Makefile<br />
header.db - database color header<br />
inhier - small_mol_db_prep - input parameter template for mol2db<br />
inhier_prot - protein db prep - inhier template for protein-protein docking<br />
lig - templates for lig directory. instantiated by Makefile<br />
search - templates for search directory. instantiated by Makefile<br />
sph - templates for sph directory. instantiated by Makefile<br />
testing - templates for testing directory. instantiated by Makefile<br />
<br />
<br />
<br />
etc - scripts<br />
=============<br />
<br />
setup - set up the environment <br />
-----<br />
login - set up paths to scripts<br />
<br />
dbprep - general database preparation<br />
------<br />
usort1tok.pl - unique sort based on first token (e.g. smiles files)<br />
fmtSDF.pl - convert ISIS sdf into openeye-ready sdf<br />
makedb.com - convert AMSOL output into a database one tranche at a time<br />
splitmol.pl - split up solv files<br />
rename.csh - rename files - customize as required<br />
<br />
<br />
[[Category:DOCK]]<br />
[[Category:ZINC]]<br />
[[Category:Internal]]</div>Rgchttp://wiki.docking.org/index.php?title=Flexibase_Format&diff=2843Flexibase Format2012-01-19T19:26:12Z<p>Rgc: </p>
<hr />
<div>[[Image:mol2db.gif|left|]] Hierarchy Generator<br />
<br />
<br />
<br />
Input: 1. multi-conformer mol2 file, 2. solvation file, and 3. inhier parameters<br />
<br />
Output: 4. database in hierarchy format, 5. molecule summary (stdout)<br />
<br />
<br />
<br />
Important notes: <br />
<br />
* The hierarchy generator does not know if hydrogens have been rotated. Turning on torque_hydrogens when hydrogens have already been rotated will result in duplicate structures and inaccurate counting of conformations.<br />
* The hierarchy dock code will not read a database with hierarchy spacing. <br />
<br />
<br />
<br />
1. Multi-conformer mol2 file<br />
<br />
This file is a standard Tripos mol2 file. Multiple conformations of the same molecule must have the same MFCD number. The file must be under the UNIX file size limit of 2GB.<br />
<br />
<br />
<br />
2. Solvation file<br />
<br />
This file contains abbreviated output from AMSOL. Each molecule record begins with a line containing the identification number followed by the total number of atoms and the formal charge. The identification number here must match the identification number in the mol2 file! The next numbers are the total polar solvation energy, the total solvent accessible surface, the total apolar solvation energy and the total solvation energy. After this header, there is a line of data for each atom. The first number is the partial atomic charge, followed by the polar solvation energy, the solvent accessible surface area, the apolar solvation energy and finally the total solvation energy of the atom.<br />
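<br />
The record layout described above can be sketched as a small reader. This is only an illustration: it assumes whitespace-separated fields in the order given in the text, and the class names, the function name and the sample values are all invented here, not taken from AMSOL or from the hierarchy generator.<br />

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtomSolvation:
    charge: float    # partial atomic charge
    polar: float     # polar solvation energy
    surface: float   # solvent accessible surface area
    apolar: float    # apolar solvation energy
    total: float     # total solvation energy of the atom


@dataclass
class MoleculeSolvation:
    mol_id: str
    n_atoms: int
    formal_charge: float
    polar: float
    surface: float
    apolar: float
    total: float
    atoms: List[AtomSolvation] = field(default_factory=list)


def parse_solvation(text: str) -> List[MoleculeSolvation]:
    """Read header + per-atom lines; assumes whitespace-separated fields."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    records = []
    i = 0
    while i < len(rows):
        head = rows[i]
        mol = MoleculeSolvation(head[0], int(head[1]), *map(float, head[2:7]))
        # The header says how many per-atom lines follow.
        for row in rows[i + 1 : i + 1 + mol.n_atoms]:
            mol.atoms.append(AtomSolvation(*map(float, row[:5])))
        records.append(mol)
        i += 1 + mol.n_atoms
    return records


# Invented two-atom record, for illustration only.
SAMPLE = """\
D00000001 2 0 -1.50 40.0 0.30 -1.20
0.10 -0.50 20.0 0.10 -0.40
-0.10 -1.00 20.0 0.20 -0.80
"""

records = parse_solvation(SAMPLE)
print(records[0].mol_id, len(records[0].atoms))
```

A reader like this makes it easy to check that every identification number in the solvation file also appears in the mol2 file before running the generator.<br />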
<br />
<br />
<br />
3. Hierarchy generator input parameters (inhier)<br />
<br />
The hierarchy generator is called by typing the path to the executable followed by the name of the input parameter file (can be anything, by default inhier). Logical keywords (procedures) can be yes/no or true/false.<br />
<br />
protein Is the input file a list of protein side chains or small molecules? Protein side chains do not allow for solvation values and preserve the input residue and atom names. If this option is true, solvation correction should be false; comment out the solvation_table line.<br />
<br />
equalize_charges This option adjusts formal charge and equalizes charges on equivalent groups. Set this value to No, as charges are corrected outside of the hierarchy generator. Code for this routine was modified from mol2db from the dock suite.<br />
<br />
solvation-correction Should the hierarchy generator look for solvation data? If yes, molecules with solvation data will have their charges replaced with those from the solvation data table and have solvation data added. Molecules not listed in the solvation data table will retain their original charges, have zeros for the solvation numbers and have a -3 for the sixth value of the branch header line. The molecules with the -3 will be skipped by dock.<br />
<br />
color_atoms Should the generator put atom colors on each atom? There is no reason to turn this option off. Code for this routine was copied from mol2db from the dock suite.<br />
<br />
output_color_table Should the database header be printed? When generating a small database (few hundred molecules) I leave this on. For large databases that will be joined together I leave it off. If this is left off, it needs to be added post database generation to the final joined files.<br />
<br />
translate_coordinates Should the coordinates be translated to the origin? Mol2db translated everything to the origin to keep the coordinates small (database spacing). This should no longer be a problem. The ACD can be generated with this set to No.<br />
<br />
hierarchy_spacing The hierarchy spacing option is only designed for visual inspection of the hierarchy. Dock will not read files with hierarchy spacing. For database generation, set this value to No.<br />
<br />
torque_hydrogens If the input conformations do not have multiple conformations for the hydrogens, this option will generate multiple conformations. Groups rotated include =NH, -SH, and -OH. They are rotated in 180°, 30°, and 30° increments respectively. If the -SH or OH are connected to an aromatic system they are rotated in 60° increments.<br />
<br />
mol2_file_list The hierarchy generator can process multiple files (gzipped or not). I recommend commenting this line out and calling the generator for each file from a shell script. This option is not compatible with mol2_file.<br />
<br />
mol2_file This is the multi-conformer mol2 file used in database generation. This option is not compatible with mol2_file_list.<br />
<br />
db_file This is the output file to which the generator writes the hierarchy.<br />
<br />
solvation_table The file from which the generator reads solvation and charge data.<br />
<br />
color_table This keyword marks the beginning of the color table. The color table follows the syntax rules for the mol2db (dock 3.5) color table. The end of the color table is marked by the keyword default_color and the value neutral.<br />
<br />
Here are some additional notes on the color table, note that the atom names are sybyl atom names http://tripos.com/mol2/atom_types.html :<br />
<br />
rules are last match counts. so even though all Ns are positive, the later<br />
rule N.ar matches acceptor so N.ar is an acceptor<br />
first rules are beginning of sybyl atom text -> type<br />
later rules are Atom NotBondedTo Atom -> type like<br />
O. -1 N.2 -> negative<br />
other rules are Atom BondsAwayFrom Atom -> type like<br />
C.2 1 N. -> positive or<br />
O.2 2 N.3 -> amide_o<br />
again rules are read in order and the last matching rule is the one used<br />
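<br />
The "last match counts" behaviour can be illustrated with a minimal sketch. It covers only the first kind of rule (beginning of the sybyl atom text -> type); the function name and the two-rule table are invented for this example and are not the generator's actual code or color table.<br />

```python
from typing import List, Tuple


def color_for(atom_type: str,
              rules: List[Tuple[str, str]],
              default: str = "neutral") -> str:
    """Return the color of the LAST rule whose prefix matches atom_type."""
    color = default
    for prefix, rule_color in rules:
        if atom_type.startswith(prefix):
            color = rule_color  # a later match overwrites an earlier one
    return color


# Hypothetical rule table: all N are positive, then N.ar is an acceptor.
RULES = [("N", "positive"), ("N.ar", "acceptor")]

for atom_type in ("N.3", "N.ar", "C.3"):
    print(atom_type, color_for(atom_type, RULES))
```

Because every rule is tried in order and nothing stops at the first hit, general rules belong at the top of the table and exceptions below them.<br />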
<br />
<br />
4. Hierarchy database format<br />
<br />
The hierarchy database format is based on the dock 3.5 database format. The first lines include a header (line #1) and describe the colors used in the database (line #2-8). For the purposes of this discussion, the lines have been numbered (#1, #2, etc.) and the hierarchy has been indented to help distinguish the levels.<br />
<br />
<br />
<br />
Family header: Line #9 is designed for future use. The word Family is followed by a chemical family number. This is hard coded to 1 (first number), but will be changed in the future to group molecules into families. The second number is the number of molecules in this family (matches the number of occurrences of line #10), the third is the number of branches that get attached to each molecule. The last number is the number of branches in the family. Line #10 has the first 50 characters of the molecule name and the last letter and number of the identification code (e.g., MFCD → D, SPEC → C). After a space there can be up to 10 branches listed (10i2). The rigid fragment is considered branch one, so the example molecule has branches two and three. By listing which branches make up which molecules, I can later rebuild each molecule and recombine side chains. Any branch in the first position can be combined with any branch in the second position, etc.<br />
<br />
<br />
<br />
Branches: Lines #11, #36, and #44 are branch identifiers. The first number in the first line of a branch lists how many coordinates are listed for the branch. The second number is the number of atoms in the branch (single conformer). This is followed by the number of heavy atoms and number of hydrogens in the branch. Next is the sum of the polar solvation energy for the entire molecule (not just the branch). Next is the number one, or if the molecule lacks solvation energy, -3. A value of -3 causes dock to skip the molecule. Next is the apolar component of desolvation, again for the entire molecule. The number zero was to denote the number of explicit conformations, but is now an open field for future use. The last number of the branch identifier line is the number of conformations for that branch (including recombination within the branch). Multiplying all of these conformation counts together for a given molecule results in the number of conformations for the molecule.<br />
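<br />
As a worked example of the last point, the three branch identifier lines of the example molecule (lines #11, #36 and #44) can be split into fields and their conformation counts multiplied. This is a sketch that assumes whitespace-separated fields; the real file uses fixed columns, and the field names below are labels chosen for this example.<br />

```python
from math import prod
from typing import NamedTuple


class BranchHeader(NamedTuple):
    n_coord_lines: int       # coordinate lines listed for the branch
    n_atoms: int             # atoms in the branch (single conformer)
    n_heavy: int
    n_hydrogens: int
    polar_solvation: float   # whole molecule
    solvation_flag: int      # 1, or -3 when solvation data is missing
    apolar_solvation: float  # whole molecule
    unused: int              # open field, currently 0
    n_confs: int             # conformations of this branch


def parse_branch_header(line: str) -> BranchHeader:
    f = line.split()
    return BranchHeader(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                        float(f[4]), int(f[5]), float(f[6]),
                        int(f[7]), int(f[8]))


# Branch identifier lines #11, #36 and #44 from the example file.
HEADERS = [parse_branch_header(s) for s in (
    "12 12 8 4 -49.0700 1 -1.59 0 1",
    "6 1 0 1 -49.0700 1 -1.59 0 6",
    "64 8 5 3 -49.0700 1 -1.59 0 24",
)]

total_confs = prod(h.n_confs for h in HEADERS)
skipped_by_dock = any(h.solvation_flag == -3 for h in HEADERS)
print(total_confs, skipped_by_dock)
```

The product 1 x 6 x 24 reproduces the 144 conformations reported as O_hconfs in the molecule summary.<br />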
<br />
<br />
<br />
Atom information: The atom information is divided into two parts. First is all atom information except for coordinates, followed by multiple sets of coordinates. Both parts of the atom information start with the hierarchy level.<br />
<br />
* Hierarchy level: The rigid fragment is 9. All groups attached to the rigid fragment are numbered in the 10's; groups attached to those are numbered in the 20's. The tens place (and, if need be, the hundreds place) denotes distance from the rigid fragment. The ones place differentiates independent groups within the branch. Branches from the rigid fragment can be numbered either 19, 18, etc., or all 19. Originally each needed to be different, but as the code evolved this was no longer required.<br />
* Atom information: This block of information (lines #12-23, #37, and #45-52) contains information we consider to be the same for all conformations. It also describes all of the parts (number of atoms in each group) required to complete the branch. After the hierarchy information, the van der Waals type as described in the Dock 3.5 manual is listed. Next is the partial atomic charge multiplied by 10000. This is historical; the multiplication should probably be removed at some point. At the end of the charge is a column that is set to zero. This value is the flag (see dock 2.5). Currently the hierarchy generator and dock do not use flagging. Next is the color, corresponding to the color table at the top of the database file. The last two columns are the polar and apolar partial atomic desolvation values.<br />
* Atom coordinates: The first number is the hierarchy level, followed by the xyz coordinates for the atom (multiplied by 1000). Lines #38-43 list six different sets of coordinates for the hydrogen described in line #37. Lines #53-55 are one set of coordinates for the two carbons and hydrogen in group 18. For each position of group 18 (lines #53-55, #69-71, #85-87, and #101-103) there are three positions for group 29 (Cl and 2 H) and two positions for group 28, the carboxylic acid. Since groups 29 and 28 can move independently, they have different group numbers. For each of the four positions of group 18 there are six downstream combinations of 29 and 28. This leads to the reported 24 positions for the branch. Any of these positions can be combined with any of the six torsions of the hydrogen, leading to 144 conformations for the molecule. Note that the number of coordinate sets is not encoded anywhere; you have to count the lines. When there is one atom followed by 6 sets of coordinates (like lines #38-43), you know there are 6 conformations for that atom. When there are 3 atoms and 9 lines of coordinates (like lines #56-64), you know there are 3 conformations for that set of atoms. Note also that no information about the tree is explicitly encoded: if you are reading from group 49, for instance, and then read a line of group 29, you have to move back up the tree of conformations to that level. The output order of the tree of conformations is 'infix'.<br />
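<br />
The line counting described above can be sketched as follows. The helper names are invented for this illustration, and the sketch assumes that consecutive coordinate lines with the same hierarchy level always belong to the same group, which holds for the example file but would not hold if sibling branches were all numbered 19.<br />

```python
from collections import Counter
from itertools import groupby
from typing import List, Tuple


def decode_level(level: int) -> Tuple[int, int]:
    """Split a hierarchy level into (distance from rigid fragment, group digit)."""
    return divmod(level, 10)


def coordinate_sets(atom_levels: List[int],
                    coord_levels: List[int]) -> List[Tuple[int, int]]:
    """For each run of coordinate lines, return (level, number of coordinate sets)."""
    atoms_per_group = Counter(atom_levels)
    result = []
    for level, run in groupby(coord_levels):
        n_lines = len(list(run))
        # e.g. 9 lines for a 3-atom group means 3 positions of that group
        result.append((level, n_lines // atoms_per_group[level]))
    return result


# Levels of atom lines #45-52 and coordinate lines #53-68 of the example file.
atom_levels = [18, 18, 18, 29, 29, 29, 28, 28]
coord_levels = [18] * 3 + [29] * 9 + [28] * 4

print(decode_level(9), decode_level(29))
print(coordinate_sets(atom_levels, coord_levels))
print(coordinate_sets([19], [19] * 6))  # atom line #37, coordinate lines #38-43
```

For one position of group 18 this gives 3 positions of group 29 and 2 of group 28, that is 6 combinations; four positions of group 18 then give the 24 conformations in the branch header.<br />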
<br />
<br />
5. Molecule summary:<br />
<br />
D MFCD rigid flex I_ats I_confs O_ats O_confs O_hconfs<br />
<br />
D00000000 12 9 120 12 82 24 144<br />
<br />
MFCD: Id number from the mol2 file (format MFCD12345678 → D12345678)<br />
<br />
Rigid: Number of atoms in the same position in all conformations (12)<br />
<br />
Flex: Total number of atoms in the molecule minus the rigid atoms (9)<br />
<br />
I_ats: Input atoms -- the number of atoms required to represent the molecule in ensemble format. Rigid atoms plus the number of conformations times number of flexible atoms (120)<br />
<br />
I_confs: Number of conformations read in for the molecule (12)<br />
<br />
O_ats: Number of atoms written to the hierarchy -- usually less than I_ats unless many hydrogen coordinates were added. (82)<br />
<br />
O_confs: Number of conformations, including recombination. This number does not include hydrogen conformations added by the hierarchy generator. This number will frequently be larger than I_confs (recombination), but can also be less than I_confs if the input conformations are degenerate. (24)<br />
<br />
O_hconfs: Number of conformations including hydrogens rotated by the hierarchy generator. This is the number of conformations dock will see. (144)<br />
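<br />
The relationships between the summary columns can be checked with a short sketch. The parsing assumes whitespace-separated fields in the column order of the header line shown above; the function name is invented for this example.<br />

```python
COLUMNS = ["rigid", "flex", "I_ats", "I_confs", "O_ats", "O_confs", "O_hconfs"]


def parse_summary(line: str) -> dict:
    """Turn one molecule summary line into a dict keyed by column name."""
    fields = line.split()
    summary = {"id": fields[0]}
    summary.update({name: int(value) for name, value in zip(COLUMNS, fields[1:])})
    return summary


s = parse_summary("D00000000 12 9 120 12 82 24 144")

# I_ats = rigid atoms + input conformations * flexible atoms
expected_i_ats = s["rigid"] + s["I_confs"] * s["flex"]
# Ratio of conformations with and without generator-rotated hydrogens
hydrogen_factor = s["O_hconfs"] // s["O_confs"]

print(expected_i_ats, hydrogen_factor)
```

For the example molecule this gives 12 + 12 x 9 = 120 input atoms, and the factor of 6 between O_confs and O_hconfs matches the six hydrogen positions of the first flexible branch.<br />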
<br />
<br />
<br />
Error messages<br />
<br />
no common atoms D00000000<br />
<br />
For the specified molecule id there were no atoms with common coordinates. Nothing is written to the database file.<br />
<br />
Macrocycle error on D0000000<br />
<br />
The recursion routine used in generating the hierarchy has problems with macrocycles that are not the rigid fragment. Nothing is written to the database file.<br />
<br />
<br />
Here is the example file for the molecule shown above. The line numbers are shown at left; the column spacing is not accurate due to formatting changes.<br />
<code><pre> <br />
# 1 DOCK 5.1 ligand_atoms<br />
# 2 positive (1)<br />
# 3 negative (2)<br />
# 4 acceptor (3)<br />
# 5 donor (4)<br />
# 6 ester_o (5)<br />
# 7 amide_o (6)<br />
# 8 neutral (7)<br />
# 9 Family 1 1 2 2<br />
# 10 Molecule_to_describe_the_hierarchy_format D00000000 2 3<br />
# 11 12 12 8 4 -49.0700 1 -1.59 0 1<br />
# 12 9 1 1380 7 1.570 0.160<br />
# 13 9 1 -2410 7 -1.530 0.220<br />
# 14 9 1 -970 7 -0.510 0.230<br />
# 15 9 1 -1370 7 -2.080 0.260<br />
# 16 9 1 -380 7 -1.610 -0.070<br />
# 17 9 1 -1350 7 -4.130 0.220<br />
# 18 9 5 -560 7 -0.640 0.010<br />
# 19 912 -4710 4 -6.840 -1.350<br />
# 20 9 7 1190 7 1.100 -0.030<br />
# 21 9 7 1190 7 1.210 -0.030<br />
# 22 9 7 1180 7 1.790 -0.030<br />
# 23 9 7 1270 7 1.260 -0.030<br />
# 24 9 -1204 1354 0<br />
# 25 9 0 659 0<br />
# 26 9 0 -732 0<br />
# 27 9 -1204 -1427 0<br />
# 28 9 -2408 -732 0<br />
# 29 9 -2408 658 0<br />
# 30 9 -3866 -1528 -49<br />
# 31 9 -1237 2718 6<br />
# 32 9 919 1190 0<br />
# 33 9 919 -1263 0<br />
# 34 9 -1204 -2489 0<br />
# 35 9 -3327 1189 0<br />
# 36 6 1 0 1 -49.0700 1 -1.59 0 6<br />
# 37 19 6 3880 7 2.390 -0.620<br />
# 38 19 -328 3071 -166<br />
# 39 19 -937 3059 -874<br />
# 40 19 -1854 3036 -699<br />
# 41 19 -2161 3025 181<br />
# 42 19 -1552 3036 889<br />
# 43 19 -635 3059 714<br />
# 44 64 8 5 3 -49.0700 1 -1.59 0 24<br />
# 45 18 5 -100 7 0.290 0.460<br />
# 46 18 1 4520 7 14.160 0.650<br />
# 47 18 7 820 7 2.350 -0.010<br />
# 48 2916 -2080 7 -1.680 -0.080<br />
# 49 29 7 940 7 2.110 -0.020<br />
# 50 29 7 1210 7 1.780 -0.010<br />
# 51 2811 -6070 2 -26.310 0.240<br />
# 52 2811 -7570 2 -33.750 -1.760<br />
# 53 18 -3882 -3141 -97<br />
# 54 18 -4926 -1167 1143<br />
# 55 18 -4160 -1125 -1029<br />
# 56 29 -3131 -4036 1247<br />
# 57 29 -4935 -3454 -148<br />
# 58 29 -3261 -3392 -970<br />
# 59 29 -3084 -3955 -1465<br />
# 60 29 -3396 -3499 823<br />
# 61 29 -4946 -3397 -213<br />
# 62 29 -5457 -3970 -149<br />
# 63 29 -3325 -3445 -995<br />
# 64 29 -3448 -3445 867<br />
# 65 28 -5017 -86 1728<br />
# 66 28 -5869 -2094 1551<br />
# 67 28 -5397 -53 1380<br />
# 68 28 -5422 -2160 1970<br />
# 69 18 -5233 -672 -47<br />
# 70 18 -4072 -2582 -1283<br />
# 71 18 -3736 -2021 926<br />
# 72 29 -5511 482 -1375<br />
# 73 29 -6066 -1390 -59<br />
# 74 29 -5154 -39 849<br />
# 75 29 -5558 401 1337<br />
# 76 29 -5225 -44 -950<br />
# 77 29 -6027 -1431 7<br />
# 78 29 -6782 -1549 -100<br />
# 79 29 -5234 -64 870<br />
# 80 29 -5205 -115 -995<br />
# 81 28 -3182 -3226 -1841<br />
# 82 28 -5339 -2863 -1765<br />
# 83 28 -3377 -3574 -1512<br />
# 84 28 -5132 -2438 -2162<br />
# 85 18 -4582 -1948 1335<br />
# 86 18 -5060 -798 -895<br />
# 87 18 -3466 -2424 -546<br />
# 88 29 -5016 -663 2488<br />
# 89 29 -5510 -2477 1073<br />
# 90 29 -3821 -2541 1864<br />
# 91 29 -3714 -3043 2439<br />
# 92 29 -4786 -1021 1890<br />
# 93 29 -5463 -2526 1019<br />
# 94 29 -6143 -2801 1255<br />
# 95 29 -3882 -2601 1878<br />
# 96 29 -4821 -989 1817<br />
# 97 28 -4908 -64 -1874<br />
# 98 28 -6387 -1016 -571<br />
# 99 28 -5028 -515 -2095<br />
#100 28 -6262 -483 -287<br />
#101 18 -4533 -1865 -1479<br />
#102 18 -3939 -2951 755<br />
#103 18 -4429 -722 443<br />
#104 29 -3627 -2892 -2616<br />
#105 29 -5492 -2366 -1280<br />
#106 29 -4594 -890 -1985<br />
#107 29 -4929 -512 -2567<br />
#108 29 -3835 -2522 -2017<br />
#109 29 -5510 -2302 -1225<br />
#110 29 -6096 -2719 -1504<br />
#111 29 -4677 -908 -2003<br />
#112 29 -3831 -2571 -1946<br />
#113 28 -3290 -3248 1761<br />
#114 28 -4821 -3940 357<br />
#115 28 -3745 -3111 1962<br />
#116 28 -4291 -4115 96<br />
</pre></code><br />
<br />
[[Category:Articles needing style editing]]<br />
[[Category:Formats]]</div>Rgchttp://wiki.docking.org/index.php?title=Flexibase_Format&diff=2842Flexibase Format2012-01-19T19:25:49Z<p>Rgc: adding more info on colors</p>
<hr />
<div>[[Image:mol2db.gif|left|]] Hierarchy Generator<br />
<br />
<br />
<br />
Input: 1. multi-conformer mol2 file, 2. solvation file, and 3. inhier parameters<br />
<br />
Output: 4. database in hierarchy format, 5. molecule summary (stdout)<br />
<br />
<br />
<br />
Important notes: <br />
<br />
* The hierarchy generator does not know if hydrogens have been rotated. Turning on torque_hydrogens when hydrogens have already been rotated will result in duplicate structures and inaccurate counting of conformations.<br />
* The hierarchy dock code will not read a database with hierarchy spacing. <br />
<br />
<br />
<br />
1. Multi-conformer mol2 file<br />
<br />
This file is a standard Tripos mol2 file. Multiple conformations of the same molecule must have the same MFCD number. The file must be under the UNIX file size limit of 2GB.<br />
<br />
<br />
<br />
2. Solvation file<br />
<br />
This file contains abbreviated output from AMSOL. Each molecule record begins with a line containing the identification number followed by the total number of atoms and the formal charge. The identification number here must match the identification number in the mol2 file! The next numbers are the total polar solvation energy, the total solvent accessible surface, the total apolar solvation energy and the total solvation energy. After this header, there is a line of data for each atom. The first number is the partial atomic charge, followed by the polar solvation energy, the solvent accessible surface area, the apolar solvation energy and finally the total solvation energy of the atom.<br />
<br />
<br />
<br />
3. Hierarchy generator input parameters (inhier)<br />
<br />
The hierarchy generator is called by typing the path to the executable followed by the name of the input parameter file (can be anything, by default inhier). Logical keywords (procedures) can be yes/no or true/false.<br />
<br />
protein Is the input file a list of protein side chains or small molecules? Protein side chains do not allow for solvation values and preserve the input residue and atom names. If this option is true, solvation correction should be false; comment out the solvation_table line.<br />
<br />
equalize_charges This option adjusts formal charge and equalizes charges on equivalent groups. Set this value to No, as charges are corrected outside of the hierarchy generator. Code for this routine was modified from mol2db from the dock suite.<br />
<br />
solvation-correction Should the hierarchy generator look for solvation data? If yes, molecules with solvation data will have their charges replaced with those from the solvation data table and have solvation data added. Molecules not listed in the solvation data table will retain their original charges, have zeros for the solvation numbers and have a -3 for the sixth value of the branch header line. The molecules with the -3 will be skipped by dock.<br />
<br />
color_atoms Should the generator put atom colors on each atom? There is no reason to turn this option off. Code for this routine was copied from mol2db from the dock suite.<br />
<br />
output_color_table Should the database header be printed? When generating a small database (few hundred molecules) I leave this on. For large databases that will be joined together I leave it off. If this is left off, it needs to be added post database generation to the final joined files.<br />
<br />
translate_coordinates Should the coordinates be translated to the origin? Mol2db translated everything to the origin to keep the coordinates small (database spacing). This should no longer be a problem. The ACD can be generated with this set to No.<br />
<br />
hierarchy_spacing The hierarchy spacing option is only designed for visual inspection of the hierarchy. Dock will not read files with hierarchy spacing. For database generation, set this value to No.<br />
<br />
torque_hydrogens If the input conformations do not have multiple conformations for the hydrogens, this option will generate multiple conformations. Groups rotated include =NH, -SH, and -OH. They are rotated in 180°, 30°, and 30° increments respectively. If the -SH or OH are connected to an aromatic system they are rotated in 60° increments.<br />
<br />
mol2_file_list The hierarchy generator can process multiple files (gzipped or not). I recommend commenting this line out and calling the generator for each file from a shell script. This option is not compatible with mol2_file.<br />
<br />
mol2_file This is the multi-conformer mol2 file used in database generation. This option is not compatible with mol2_file_list.<br />
<br />
db_file This is the output file to which the generator writes the hierarchy.<br />
<br />
solvation_table The file from which the generator reads solvation and charge data.<br />
<br />
color_table This keyword marks the beginning of the color table. The color table follows the syntax rules for the mol2db (dock 3.5) color table. The end of the color table is marked by the keyword default_color and the value neutral.<br />
<br />
Here are some additional notes on the color table, note that the atom names are sybyl atom names http://tripos.com/mol2/atom_types.html :<br />
<br />
rules are last match counts. so even though all Ns are positive, the later<br />
rule N.ar matches acceptor so N.ar is an acceptor<br />
first rules are beginning of sybyl atom text -> type<br />
later rules are Atom NotBondedTo Atom -> type like<br />
O. -1 N.2 -> negative<br />
other rules are Atom BondsAwayFrom Atom -> type like<br />
C.2 1 N. -> positive or<br />
O.2 2 N.3 -> amide_o<br />
again rules are read in order and the last matching rule is the one used<br />
<br />
<br />
4. Hierarchy database format<br />
<br />
The hierarchy database format is based on the dock 3.5 database format. The first lines include a header (line #1) and describe the colors used in the database (line #2-8). For the purposes of this discussion, the lines have been numbered (#1, #2, etc.) and the hierarchy has been indented to help distinguish the levels.<br />
<br />
<br />
<br />
Family header: Line #9 is designed for future use. The word Family is followed by a chemical family number. This is hard coded to 1 (first number), but will be changed in the future to group molecules into families. The second number is the number of molecules in this family (matches the number of occurrences of line #10), the third is the number of branches that get attached to each molecule. The last number is the number of branches in the family. Line #10 has the first 50 characters of the molecule name and the last letter and number of the identification code (e.g., MFCD → D, SPEC → C). After a space there can be up to 10 branches listed (10i2). The rigid fragment is considered branch one, so the example molecule has branches two and three. By listing which branches make up which molecules, I can later rebuild each molecule and recombine side chains. Any branch in the first position can be combined with any branch in the second position, etc.<br />
<br />
<br />
<br />
Branches: Lines #11, #36, and #44 are branch identifiers. The first number in the first line of a branch lists how many coordinates are listed for the branch. The second number is the number of atoms in the branch (single conformer). This is followed by the number of heavy atoms and number of hydrogens in the branch. Next is the sum of the polar solvation energy for the entire molecule (not just the branch). Next is the number one, or if the molecule lacks solvation energy, -3. A value of -3 causes dock to skip the molecule. Next is the apolar component of desolvation, again for the entire molecule. The number zero was to denote the number of explicit conformations, but is now an open field for future use. The last number of the branch identifier line is the number of conformations for that branch (including recombination within the branch). Multiplying all of these conformation counts together for a given molecule results in the number of conformations for the molecule.<br />
<br />
<br />
<br />
Atom information: The atom information is divided into two parts. First is all atom information except for coordinates, followed by multiple sets of coordinates. Both parts of the atom information start with the hierarchy level.<br />
<br />
* Hierarchy level: The rigid fragment is 9. All groups attached to the rigid fragment are numbered in the 10's; groups attached to those are numbered in the 20's. The tens place (and, if need be, the hundreds place) denotes distance from the rigid fragment. The ones place differentiates independent groups within the branch. Branches from the rigid fragment can be numbered either 19, 18, etc., or all 19. Originally each needed to be different, but as the code evolved this was no longer required.<br />
* Atom information: This block of information (lines #12-23, #37, and #45-52) contains information we consider to be the same for all conformations. It also describes all of the parts (number of atoms in each group) required to complete the branch. After the hierarchy information, the van der Waals type as described in the Dock 3.5 manual is listed. Next is the partial atomic charge multiplied by 10000. This is historical; the multiplication should probably be removed at some point. At the end of the charge is a column that is set to zero. This value is the flag (see dock 2.5). Currently the hierarchy generator and dock do not use flagging. Next is the color, corresponding to the color table at the top of the database file. The last two columns are the polar and apolar partial atomic desolvation values.<br />
* Atom coordinates: The first number is the hierarchy level, followed by the xyz coordinates for the atom (multiplied by 1000). Lines #38-43 list six different sets of coordinates for the hydrogen described in line #37. Lines #53-55 are one set of coordinates for the two carbons and hydrogen in group 18. For each position of group 18 (lines #53-55, #69-71, #85-87, and #101-103) there are three positions for group 29 (Cl and 2 H) and two positions for group 28, the carboxylic acid. Since groups 29 and 28 can move independently, they have different group numbers. For each of the four positions of group 18 there are six downstream combinations of 29 and 28. This leads to the reported 24 positions for the branch. Any of these positions can be combined with any of the six torsions of the hydrogen, leading to 144 conformations for the molecule. Note that the number of coordinate sets is not encoded anywhere; you have to count the lines. When there is one atom followed by 6 sets of coordinates (like lines #38-43), you know there are 6 conformations for that atom. When there are 3 atoms and 9 lines of coordinates (like lines #56-64), you know there are 3 conformations for that set of atoms. Note also that no information about the tree is explicitly encoded: if you are reading from group 49, for instance, and then read a line of group 29, you have to move back up the tree of conformations to that level. The output order of the tree of conformations is 'infix'.<br />
<br />
<br />
5. Molecule summary:<br />
<br />
D MFCD rigid flex I_ats I_confs O_ats O_confs O_hconfs<br />
<br />
D00000000 12 9 120 12 82 24 144<br />
<br />
MFCD: Id number from the mol2 file (format MFCD12345678 → D12345678)<br />
<br />
Rigid: Number of atoms in the same position in all conformations (12)<br />
<br />
Flex: Total number of atoms in the molecule minus the rigid atoms (9)<br />
<br />
I_ats: Input atoms -- the number of atoms required to represent the molecule in ensemble format. Rigid atoms plus the number of conformations times number of flexible atoms (120)<br />
<br />
I_confs: Number of conformations read in for the molecule (12)<br />
<br />
O_ats: Number of atoms written to the hierarchy -- usually less than I_ats unless many hydrogen coordinates were added. (82)<br />
<br />
O_confs: Number of conformations, including recombination. This number does not include hydrogen conformations added by the hierarchy generator. This number will frequently be larger than I_confs (recombination), but can also be less than I_confs if the input conformations are degenerate. (24)<br />
<br />
O_hconfs: Number of conformations including hydrogens rotated by the hierarchy generator. This is the number of conformations dock will see. (144)<br />
<br />
<br />
<br />
Error messages<br />
<br />
no common atoms D00000000<br />
<br />
For the specified molecule id there were no atoms with common coordinates. Nothing is written to the database file.<br />
<br />
Macrocycle error on D0000000<br />
<br />
The recursion routine used in generating the hierarchy has problems with macrocycles that are not the rigid fragment. Nothing is written to the database file.<br />
<br />
<br />
Here is the example file for the molecule shown above. The line numbers are shown at left; the column spacing is not accurate due to formatting changes.<br />
<code><pre> <br />
# 1 DOCK 5.1 ligand_atoms<br />
# 2 positive (1)<br />
# 3 negative (2)<br />
# 4 acceptor (3)<br />
# 5 donor (4)<br />
# 6 ester_o (5)<br />
# 7 amide_o (6)<br />
# 8 neutral (7)<br />
# 9 Family 1 1 2 2<br />
# 10 Molecule_to_describe_the_hierarchy_format D00000000 2 3<br />
# 11 12 12 8 4 -49.0700 1 -1.59 0 1<br />
# 12 9 1 1380 7 1.570 0.160<br />
# 13 9 1 -2410 7 -1.530 0.220<br />
# 14 9 1 -970 7 -0.510 0.230<br />
# 15 9 1 -1370 7 -2.080 0.260<br />
# 16 9 1 -380 7 -1.610 -0.070<br />
# 17 9 1 -1350 7 -4.130 0.220<br />
# 18 9 5 -560 7 -0.640 0.010<br />
# 19 912 -4710 4 -6.840 -1.350<br />
# 20 9 7 1190 7 1.100 -0.030<br />
# 21 9 7 1190 7 1.210 -0.030<br />
# 22 9 7 1180 7 1.790 -0.030<br />
# 23 9 7 1270 7 1.260 -0.030<br />
# 24 9 -1204 1354 0<br />
# 25 9 0 659 0<br />
# 26 9 0 -732 0<br />
# 27 9 -1204 -1427 0<br />
# 28 9 -2408 -732 0<br />
# 29 9 -2408 658 0<br />
# 30 9 -3866 -1528 -49<br />
# 31 9 -1237 2718 6<br />
# 32 9 919 1190 0<br />
# 33 9 919 -1263 0<br />
# 34 9 -1204 -2489 0<br />
# 35 9 -3327 1189 0<br />
# 36 6 1 0 1 -49.0700 1 -1.59 0 6<br />
# 37 19 6 3880 7 2.390 -0.620<br />
# 38 19 -328 3071 -166<br />
# 39 19 -937 3059 -874<br />
# 40 19 -1854 3036 -699<br />
# 41 19 -2161 3025 181<br />
# 42 19 -1552 3036 889<br />
# 43 19 -635 3059 714<br />
# 44 64 8 5 3 -49.0700 1 -1.59 0 24<br />
# 45 18 5 -100 7 0.290 0.460<br />
# 46 18 1 4520 7 14.160 0.650<br />
# 47 18 7 820 7 2.350 -0.010<br />
# 48 2916 -2080 7 -1.680 -0.080<br />
# 49 29 7 940 7 2.110 -0.020<br />
# 50 29 7 1210 7 1.780 -0.010<br />
# 51 2811 -6070 2 -26.310 0.240<br />
# 52 2811 -7570 2 -33.750 -1.760<br />
# 53 18 -3882 -3141 -97<br />
# 54 18 -4926 -1167 1143<br />
# 55 18 -4160 -1125 -1029<br />
# 56 29 -3131 -4036 1247<br />
# 57 29 -4935 -3454 -148<br />
# 58 29 -3261 -3392 -970<br />
# 59 29 -3084 -3955 -1465<br />
# 60 29 -3396 -3499 823<br />
# 61 29 -4946 -3397 -213<br />
# 62 29 -5457 -3970 -149<br />
# 63 29 -3325 -3445 -995<br />
# 64 29 -3448 -3445 867<br />
# 65 28 -5017 -86 1728<br />
# 66 28 -5869 -2094 1551<br />
# 67 28 -5397 -53 1380<br />
# 68 28 -5422 -2160 1970<br />
# 69 18 -5233 -672 -47<br />
# 70 18 -4072 -2582 -1283<br />
# 71 18 -3736 -2021 926<br />
# 72 29 -5511 482 -1375<br />
# 73 29 -6066 -1390 -59<br />
# 74 29 -5154 -39 849<br />
# 75 29 -5558 401 1337<br />
# 76 29 -5225 -44 -950<br />
# 77 29 -6027 -1431 7<br />
# 78 29 -6782 -1549 -100<br />
# 79 29 -5234 -64 870<br />
# 80 29 -5205 -115 -995<br />
# 81 28 -3182 -3226 -1841<br />
# 82 28 -5339 -2863 -1765<br />
# 83 28 -3377 -3574 -1512<br />
# 84 28 -5132 -2438 -2162<br />
# 85 18 -4582 -1948 1335<br />
# 86 18 -5060 -798 -895<br />
# 87 18 -3466 -2424 -546<br />
# 88 29 -5016 -663 2488<br />
# 89 29 -5510 -2477 1073<br />
# 90 29 -3821 -2541 1864<br />
# 91 29 -3714 -3043 2439<br />
# 92 29 -4786 -1021 1890<br />
# 93 29 -5463 -2526 1019<br />
# 94 29 -6143 -2801 1255<br />
# 95 29 -3882 -2601 1878<br />
# 96 29 -4821 -989 1817<br />
# 97 28 -4908 -64 -1874<br />
# 98 28 -6387 -1016 -571<br />
# 99 28 -5028 -515 -2095<br />
#100 28 -6262 -483 -287<br />
#101 18 -4533 -1865 -1479<br />
#102 18 -3939 -2951 755<br />
#103 18 -4429 -722 443<br />
#104 29 -3627 -2892 -2616<br />
#105 29 -5492 -2366 -1280<br />
#106 29 -4594 -890 -1985<br />
#107 29 -4929 -512 -2567<br />
#108 29 -3835 -2522 -2017<br />
#109 29 -5510 -2302 -1225<br />
#110 29 -6096 -2719 -1504<br />
#111 29 -4677 -908 -2003<br />
#112 29 -3831 -2571 -1946<br />
#113 28 -3290 -3248 1761<br />
#114 28 -4821 -3940 357<br />
#115 28 -3745 -3111 1962<br />
#116 28 -4291 -4115 96<br />
</pre></code><br />
<br />
[[Category:Articles needing style editing]]<br />
[[Category:Formats]]</div>Rgchttp://wiki.docking.org/index.php?title=Mol2db2_Format_2&diff=3777Mol2db2 Format 22012-01-17T18:41:25Z<p>Rgc: adding f95 statements</p>
<hr />
<div>This page is a wishlist for features that would be nice for a new version of the flexibase file format to support. mol2db2 format features that are actually implemented so far are marked [x]<br />
<br />
*Real Atom Types and Bond Information [x]<br />
*Way to determine which mix-and-match conformations have clashes (and avoid trying them) [x]<br />
*A place to store an internal energy for each possible conformation [x]<br />
*Terminal hydrogen rotations?? [x]<br />
*Per-conformation per-atom partial charge & solvation information to support internal energies<br />
*Aliphatic ring movements?<br />
*support for clusters of conformations [x]<br />
*group tagging (needed for covalent docking) and basic set of covalent groups<br />
*specified rigid component override (and better rules for finding non-ring rigid components)<br />
*per molecule pKa<br />
*arbitrary information to be written into output mol2 file (5th and above M lines) [x]<br />
<br />
the following represents the current plan for the file format<br />
*T type information (implicitly assumed)<br />
*M molecule (4 lines req'd, after that they are optional, 24 lines max)<br />
*A atoms<br />
*B bond<br />
*X xyz <br />
*R rigid xyz for matching (can actually be any xyzs) <br />
*C conformation<br />
*S sets<br />
*D clusters<br />
*E end of molecule<br />
<br />
T ## namexxxx (implicitly assumed to be the standard 7)<br />
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters<br />
M charge polar_solv apolar_solv total_solv surface_area<br />
M smiles<br />
M longname<br />
[M arbitrary information preserved for writing out]<br />
A stuff about each atom, 1 per line <br />
B stuff about each bond, 1 per line<br />
X coordnum atomnum confnum x y z <br />
R rigidnum color x y z<br />
C confnum coordstart coordend<br />
S setnum #lines #confs_total broken hydrogens omega_energy<br />
S setnum linenum #confs confs [until full column]<br />
D clusternum setstart setend matchstart matchend #additionalmatching<br />
D matchnum color x y z<br />
E <br />
<br />
The column layout for each line type is shown below, followed by format statements for Python and Fortran. If speed or size becomes an issue, this might be replaced with a binary file format.<br />
<br />
notes: 17 children groups/group per line in current scheme.<br />
9 children confs/group per line.<br />
9 children confs/conf per line.<br />
8 confs/set per line.<br />
groups/confs with no children are written out.<br />
<br />
on the atom line, dt is dock type and co is color.<br />
<br />
1 2 3 4 5 6 7<br />
01234567890123456789012345678901234567890123456789012345678901234567890123456789<br />
T ## typename<br />
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU<br />
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]<br />
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
B NUM ATO ATO TY<br />
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
C CONFNO COORDSTAR COORDENDX<br />
S SETIDX #LINES #CO C H +ENERGY.XXX<br />
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS<br />
D CLUSID STASET ENDSET MST MEN ADD<br />
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
E<br />
<br />
the following type lines are assumed by dock unless overridden:<br />
T 1 positive<br />
T 2 negative<br />
T 3 acceptor<br />
T 4 donor<br />
T 5 ester_o<br />
T 6 amide_o<br />
T 7 neutral<br />
<br />
the following are the format statements for python for each line<br />
T %2d %8s\n<br />
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n<br />
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
M %77s\n<br />
M %77s\n<br />
M %77s\n<br />
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
B %3d %3d %3d %-2s\n<br />
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n<br />
R %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
C %6d %9d %9d\n<br />
S %6d %6d %3d %1d %1d %+11.3f\n<br />
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n <br />
D %6d %6d %6d %3d %3d %3d\n<br />
D %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
E\n<br />
<br />
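As a rough illustration of how the Python format statements above might be used, the sketch below writes one X line and one C line and reads the X line back by column position. The function names are hypothetical and not part of mol2db2; only the two format strings are taken from the list above.<br />

```python
# Format strings copied from the list above (X and C lines).
X_FORMAT = "X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n"
C_FORMAT = "C %6d %9d %9d\n"


def format_x_line(coordnum, atomnum, confnum, x, y, z):
    """Return one fixed-width X (coordinate) line. Hypothetical helper."""
    return X_FORMAT % (coordnum, atomnum, confnum, x, y, z)


def format_c_line(confnum, coordstart, coordend):
    """Return one fixed-width C (conformation) line. Hypothetical helper."""
    return C_FORMAT % (confnum, coordstart, coordend)


def parse_x_line(line):
    """Split an X line into typed fields using the column widths.

    The slices follow X_FORMAT: a 2-character tag, then fields that are
    9, 3, 6, 9, 9 and 9 characters wide, separated by single spaces.
    """
    return (
        int(line[2:11]),      # coordinate number
        int(line[12:15]),     # atom number
        int(line[16:22]),     # conformation number
        float(line[23:32]),   # x
        float(line[33:42]),   # y
        float(line[43:52]),   # z
    )


if __name__ == "__main__":
    x_line = format_x_line(1, 1, 1, -1.204, 1.354, 0.0)
    print(x_line, end="")
    print(parse_x_line(x_line))
    print(format_c_line(1, 1, 12), end="")
```

Reading by column position rather than splitting on whitespace is meant to keep the parser aligned with the fixed-width Fortran reads, where adjacent fields may not be separated by a space once values fill their columns.<br />
<br />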
The following are the fortran77 format statements<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
1000 format(2x,i2,1x,a8)<br />
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters<br />
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)<br />
!M smiles or longname<br />
2200 format(2x,a77)<br />
!A stuff about each atom, 1 per line<br />
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,<br />
& f10.3,1x,f10.3,1x,f9.3)<br />
!B stuff about each bond, 1 per line<br />
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)<br />
!X coordnum atomnum confnum x y z<br />
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)<br />
!R rigidnum color x y z<br />
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)<br />
!C confnum #startcoord #endcoord<br />
7000 format(2x,i6,1x,i9,1x,i9)<br />
!S setnum #lines #confs_total broken hydrogens omega_energy<br />
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)<br />
!S setnum linenum #confs confs [until full column]<br />
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,<br />
& 1x,i6,1x,i6,1x,i6,1x,i6)<br />
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN<br />
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)<br />
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
!re-use 6000<br />
!E<br />
!E does not get a format line<br />
<br />
The following are Fortran95 format statements:<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000<br />
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters<br />
character (len=*), parameter :: DB2M1 =<br />
& '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
character (len=*), parameter :: DB2M2 =<br />
& '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100<br />
!M smiles/longname/arbitrary<br />
character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200<br />
!A stuff about each atom, 1 per line<br />
character (len=*), parameter :: DB2ATOM =<br />
& '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x,<br />
& f10.3,x,f10.3,x,f9.3)' !3000<br />
!B stuff about each bond, 1 per line<br />
character (len=*), parameter :: DB2BOND =<br />
& '(2x,i3,x,i3,x,i3,x,a2)' !4000<br />
!X coordnumx atomnum confnum x y z<br />
character (len=*), parameter :: DB2COORD =<br />
& '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000<br />
!R rigidnum color x y z<br />
character (len=*), parameter :: DB2RIGID =<br />
& '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000<br />
!C confnum coordstart coordend<br />
character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000<br />
!S setnum #lines #confs_total broken hydrogens omega_energy <br />
character (len=*), parameter :: DB2SET1 =<br />
& '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000<br />
!S setnum linenum #confs confs [until full column]<br />
character (len=*), parameter :: DB2SET2 =<br />
& '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6,<br />
& 1x,i6,x,i6,x,i6,x,i6)' !8100<br />
!D CLUSID STASET ENDSET ADD(itional matching spheres count) MST(art) MEN(d)<br />
character (len=*), parameter :: DB2CLUSTER =<br />
& '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000<br />
!D NUM CO x y z<br />
!reuse DB2RIGID<br />
!E<br />
!E does not get a format line <br />
<br />
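A quick way to sanity-check the format statements is to add up their field widths. The sketch below is a hypothetical helper, not part of DOCK, that handles only the a, i, f and x edit descriptors used above and assumes a bare x means one column.<br />

```python
import re


def format_width(fmt):
    """Return the total column count of a simple Fortran format string.

    Only the descriptors used on this page are understood: aN, iN, fN.M
    and nx (a bare x is counted as one column, which is an assumption).
    """
    total = 0
    body = fmt.strip().strip("()")
    for item in body.split(","):
        item = item.strip().lower()
        skip = re.fullmatch(r"(\d*)x", item)
        if skip:
            total += int(skip.group(1) or 1)
            continue
        field = re.fullmatch(r"[aif](\d+)(?:\.\d+)?", item)
        if field:
            total += int(field.group(1))
            continue
        raise ValueError("unsupported edit descriptor: %r" % item)
    return total


if __name__ == "__main__":
    formats = {
        "DB2COORD": "(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)",
        "DB2CONF": "(2x,i6,x,i9,x,i9)",
        "DB2M3": "(2x,a78)",
    }
    for name, fmt in formats.items():
        print(name, format_width(fmt))
```

Under these assumptions DB2COORD adds up to 52 columns, DB2CONF to 28 and DB2M3 to 80.<br />
<br />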
[[Category:Wishlists]]</div>Rgchttp://wiki.docking.org/index.php?title=Mol2db2_Format_2&diff=3776Mol2db2 Format 22012-01-09T20:17:23Z<p>Rgc: </p>
<hr />
<div>This page is a wishlist for features that would be nice for a new version of the flexibase file format to support. mol2db2 format features that are actually implemented so far are marked [x]<br />
<br />
*Real Atom Types and Bond Information [x]<br />
*Way to determine which mix-and-match conformations have clashes (and avoid trying them) [x]<br />
*A place to store an internal energy for each possible conformation [x]<br />
*Terminal hydrogen rotations?? [x]<br />
*Per-conformation per-atom partial charge & solvation information to support internal energies<br />
*Aliphatic ring movements?<br />
*support for clusters of conformations [x]<br />
*group tagging (needed for covalent docking) and basic set of covalent groups<br />
*specified rigid component override (and better rules for finding non-ring rigid components)<br />
*per molecule pKa<br />
*arbitrary information to be written into output mol2 file (5th and above M lines) [x]<br />
<br />
the following represents the current plan for the file format<br />
*T type information (implicitly assumed)<br />
*M molecule (4 lines req'd, after that they are optional, 24 lines max)<br />
*A atoms<br />
*B bond<br />
*X xyz <br />
*R rigid xyz for matching (can actually be any xyzs) <br />
*C conformation<br />
*S sets<br />
*D clusters<br />
*E end of molecule<br />
<br />
T ## namexxxx (implicitly assumed to be the standard 7)<br />
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters<br />
M charge polar_solv apolar_solv total_solv surface_area<br />
M smiles<br />
M longname<br />
[M arbitrary information preserved for writing out]<br />
A stuff about each atom, 1 per line <br />
B stuff about each bond, 1 per line<br />
X coordnum atomnum confnum x y z <br />
R rigidnum color x y z<br />
C confnum coordstart coordend<br />
S setnum #lines #confs_total broken hydrogens omega_energy<br />
S setnum linenum #confs confs [until full column]<br />
D clusternum setstart setend matchstart matchend #additionalmatching<br />
D matchnum color x y z<br />
E <br />
<br />
The column layout for each line type is shown below, followed by format statements for Python and Fortran. If speed or size becomes an issue, this might be replaced with a binary file format.<br />
<br />
notes: 17 children groups/group per line in current scheme.<br />
9 children confs/group per line.<br />
9 children confs/conf per line.<br />
8 confs/set per line.<br />
groups/confs with no children are written out.<br />
<br />
on the atom line, dt is dock type and co is color.<br />
<br />
1 2 3 4 5 6 7<br />
01234567890123456789012345678901234567890123456789012345678901234567890123456789<br />
T ## typename<br />
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU<br />
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]<br />
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
B NUM ATO ATO TY<br />
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
C CONFNO COORDSTAR COORDENDX<br />
S SETIDX #LINES #CO C H +ENERGY.XXX<br />
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS<br />
D CLUSID STASET ENDSET MST MEN ADD<br />
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
E<br />
<br />
the following type lines are assumed by dock unless overridden:<br />
T 1 positive<br />
T 2 negative<br />
T 3 acceptor<br />
T 4 donor<br />
T 5 ester_o<br />
T 6 amide_o<br />
T 7 neutral<br />
<br />
the following are the format statements for python for each line<br />
T %2d %8s\n<br />
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n<br />
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
M %77s\n<br />
M %77s\n<br />
M %77s\n<br />
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
B %3d %3d %3d %-2s\n<br />
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n<br />
R %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
C %6d %9d %9d\n<br />
S %6d %6d %3d %1d %1d %+11.3f\n<br />
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n <br />
D %6d %6d %6d %3d %3d %3d\n<br />
D %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
E\n<br />
<br />
The following are the fortran format statements<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
1000 format(2x,i2,1x,a8)<br />
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters<br />
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)<br />
!M smiles or longname<br />
2200 format(2x,a77)<br />
!A stuff about each atom, 1 per line<br />
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,<br />
& f10.3,1x,f10.3,1x,f9.3)<br />
!B stuff about each bond, 1 per line<br />
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)<br />
!X coordnum atomnum confnum x y z<br />
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)<br />
!R rigidnum color x y z<br />
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)<br />
!C confnum #startcoord #endcoord<br />
7000 format(2x,i6,1x,i9,1x,i9)<br />
!S setnum #lines #confs_total broken hydrogens omega_energy<br />
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)<br />
!S setnum linenum #confs confs [until full column]<br />
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,<br />
& 1x,i6,1x,i6,1x,i6,1x,i6)<br />
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN<br />
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)<br />
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
!re-use 6000<br />
!E<br />
!E does not get a format line<br />
<br />
[[Category:Wishlists]]</div>Rgchttp://wiki.docking.org/index.php?title=Chemoinformatics_Waiver_Wire&diff=383Chemoinformatics Waiver Wire2012-01-06T18:38:22Z<p>Rgc: </p>
<hr />
<div>Henry still needs to make this page as well</div>Rgchttp://wiki.docking.org/index.php?title=How_to_compile_DOCK&diff=3208How to compile DOCK2012-01-04T20:34:30Z<p>Rgc: other problems found</p>
<hr />
<div>This is for the Shoichet Lab local version of DOCK 3.5.54 trunk. <br />
<br />
'''Checking out the source files'''<br />
<br />
Commands:<br />
csh<br />
mkdir /where/to/put<br />
cd /where/to/put<br />
svn checkout file:///raid4/svn/dock<br />
svn checkout file:///raid4/svn/libfgz<br />
<br />
'''Compiling the program on our cluster'''<br />
<br />
First, set the path to the PGF compiler by adding these lines to the end of your .login file:<br />
<br />
setenv DOCK_BASE ~xyz/dockenv<br />
echo DOCK_BASE set to $DOCK_BASE.<br />
source $DOCK_BASE/etc/login<br />
<br />
When you log in to sgehead now, you should see the "Enabling pgf compiler" message.<br />
<br />
Commands:<br />
ssh sgehead<br />
cd /where/to/put/libfgz/trunk<br />
make<br />
<br />
Since we still have some 32bit computers, you'll also want to do<br />
make SIZE=32<br />
before leaving the libfgz branch and going to DOCK:<br />
<br />
cd ../../dock/trunk/i386<br />
make<br />
<br />
This makes the 64 bit version. Some options:<br />
<br />
make SIZE=32<br />
<br />
Makes the 32bit version, useful for running on the cluster since some machines are older.<br />
<br />
make DEBUG=1 <br />
<br />
Makes a debug version that will report line numbers of errors and is usable with pgdbg (the Portland Group Debugger), which is useful when writing code but is 10x (or more) slower.<br />
<br />
'''Compiling the program on the shared QB3 cluster'''<br />
<br />
On one of the compilation nodes on the shared QB3 cluster (optint1 or optint2):<br />
<br />
ssh optint2<br />
cd /where/to/put/libfgz/trunk<br />
cp Makefile Makefile.old<br />
modify Makefile:<br />
uncomment the following:<br />
FC = ifort -O3<br />
CC = icc -O3<br />
make<br />
cd ../../dock/trunk/i386<br />
cp Makefile Makefile.old<br />
modify Makefile<br />
uncomment the following:<br />
F77 = ifort<br />
FFLAGS = -O3 -convert big_endian<br />
make dock<br />
<br />
[[Category:Tutorials]]</div>Rgchttp://wiki.docking.org/index.php?title=How_to_compile_DOCK&diff=3207How to compile DOCK2012-01-04T17:17:57Z<p>Rgc: dock_base setup to compile</p>
<hr />
<div>This is for the Shoichet Lab local version of DOCK 3.5.54 trunk. <br />
<br />
'''Checking out the source files'''<br />
<br />
Commands:<br />
csh<br />
mkdir /where/to/put<br />
cd /where/to/put<br />
svn checkout file:///raid4/svn/dock<br />
svn checkout file:///raid4/svn/libfgz<br />
<br />
'''Compiling the program on our cluster'''<br />
<br />
First, set the path to the PGF compiler by adding these lines to the end of your .login file:<br />
<br />
setenv DOCK_BASE ~xyz/dockenv<br />
echo DOCK_BASE set to $DOCK_BASE.<br />
source $DOCK_BASE/etc/login<br />
<br />
When you login to sgehead now, you should see the "Enabling pgf compiler" message<br />
<br />
Commands:<br />
ssh sgehead<br />
cd /where/to/put/libfgz/trunk<br />
make<br />
cd ../../dock/trunk/i386<br />
make<br />
<br />
'''Compiling the program on the shared QB3 cluster'''<br />
<br />
On one of the compilation nodes on the shared QB3 cluster (optint1 or optint2):<br />
<br />
ssh optint2<br />
cd /where/to/put/libfgz/trunk<br />
cp Makefile Makefile.old<br />
modify Makefile:<br />
uncomment the following:<br />
FC = ifort -O3<br />
CC = icc -O3<br />
make<br />
cd ../../dock/trunk/i386<br />
cp Makefile Makefile.old<br />
modify Makefile<br />
uncomment the following:<br />
F77 = ifort<br />
FFLAGS = -O3 -convert big_endian<br />
make dock<br />
<br />
[[Category:Tutorials]]</div>Rgchttp://wiki.docking.org/index.php?title=How_to_compile_DOCK&diff=3206How to compile DOCK2012-01-04T17:15:46Z<p>Rgc: </p>
<hr />
<div>This is for the Shoichet Lab local version of DOCK 3.5.54 trunk. <br />
<br />
'''Checking out the source files'''<br />
<br />
Commands:<br />
csh<br />
mkdir /where/to/put<br />
cd /where/to/put<br />
svn checkout file:///raid4/svn/dock<br />
svn checkout file:///raid4/svn/libfgz<br />
<br />
'''Compiling the program on our cluster'''<br />
<br />
First, you need to set the path to the PGF compiler by adding this line to your .login file (at the end):<br />
<br />
set path = ( /raid3/software/pgi/9.0.4/linux86-64/9.0-4/bin $path )<br />
<br />
When you login to sgehead now, you should see the "Enabling pgf compiler" message<br />
<br />
Commands:<br />
ssh sgehead<br />
cd /where/to/put/libfgz/trunk<br />
make<br />
cd ../../dock/trunk/i386<br />
make<br />
<br />
'''Compiling the program on the shared QB3 cluster'''<br />
<br />
On one of the compilation nodes on the shared QB3 cluster (optint1 or optint2):<br />
<br />
ssh optint2<br />
cd /where/to/put/libfgz/trunk<br />
cp Makefile Makefile.old<br />
modify Makefile:<br />
uncomment the following:<br />
FC = ifort -O3<br />
CC = icc -O3<br />
make<br />
cd ../../dock/trunk/i386<br />
cp Makefile Makefile.old<br />
modify Makefile<br />
uncomment the following:<br />
F77 = ifort<br />
FFLAGS = -O3 -convert big_endian<br />
make dock<br />
<br />
[[Category:Tutorials]]</div>Rgchttp://wiki.docking.org/index.php?title=How_to_compile_DOCK&diff=3205How to compile DOCK2012-01-04T17:15:38Z<p>Rgc: </p>
<hr />
<div>This is for the Shoichet Lab local version of DOCK 3.5.54 trunk. <br />
<br />
'''Checking out the source files'''<br />
<br />
Commands:<br />
csh<br />
mkdir /where/to/put<br />
cd /where/to/put<br />
svn checkout file:///raid4/svn/dock<br />
svn checkout file:///raid4/svn/libfgz<br />
<br />
'''Compiling the program on our cluster'''<br />
<br />
First, you need to set the path to the PGF compiler by adding this line to your .login file (at the end):<br />
<br />
set path = ( /raid3/software/pgi/9.0.4/linux86-64/9.0-4/bin $path )<br />
<br />
When you login to sgehead now, you should see the "Enabling pgf compiler" message<br />
<br />
Commands:<br />
ssh sgehead<br />
cd /where/to/put/libfgz/trunk<br />
make<br />
cd ../../dock/trunk/i386<br />
make<br />
<br />
'''Compiling the program on the shared QB3 cluster'''<br />
<br />
On one of the compilation nodes on the shared QB3 cluster (optint1 or optint2):<br />
<br />
ssh optint2<br />
cd /where/to/put/libfgz/trunk<br />
cp Makefile Makefile.old<br />
modify Makefile:<br />
uncomment the following:<br />
FC = ifort -O3<br />
CC = icc -O3<br />
make<br />
cd ../../dock/trunk/i386<br />
cp Makefile Makefile.old<br />
modify Makefile<br />
uncomment the following:<br />
F77 = ifort<br />
FFLAGS = -O3 -convert big_endian<br />
make dock<br />
<br />
[[Category:Tutorials]]</div>Rgchttp://wiki.docking.org/index.php?title=Qnifft_DOCK_3.6_conversion&diff=4210Qnifft DOCK 3.6 conversion2011-12-22T18:07:12Z<p>Rgc: </p>
<hr />
<div>Qnifft is a new option for use instead of DelPhi with DOCK 3.6. It is a Poisson-Boltzmann solver program from Kim Sharp [http://crystal.med.upenn.edu/software.html]. It has been integrated into the [[DOCK Blaster]] and [[DOCK 3.6]] toolchain. For now, if you make use of it, please cite:<br />
<br />
Sharp, K. A. 1995. Polyelectrolyte electrostatics: Salt dependence, entropic and enthalpic contributions to free energy in the nonlinear Poisson-Boltzmann model. Biopolymers 36:227-243. [http://dx.doi.org/10.1002/bip.360360210 10.1002/bip.360360210]<br />
<br />
and <br />
<br />
Gallagher, K., and K. A. Sharp. 1998. Electrostatic Contributions to Heat Capacity Changes of DNA-Ligand Binding. Biophys. J. 75:769-776. [http://dx.doi.org/10.1016/S0006-3495(98)77566-6 http://dx.doi.org/10.1016/S0006-3495(98)77566-6]<br />
<br />
<br />
== Using the new code ==<br />
<br />
A compiled qnifft binary is in $DOCK_BASE/bin/Linux/qnifft22_193_pgf_32<br />
<br />
Running qnifft requires setting your $DELDIR environment variable to $DOCK_BASE/src/qnifft<br />
<br />
The default way to run qnifft is to copy the qnifft.parm file from $DELDIR and run it by calling<br />
qnifft qnifft.parm<br />
<br />
If you're using DOCK Blaster, you can make the new electrostatic grids by typing:<br />
<br />
make grids/rec+sph.qnifft.phi<br />
<br />
or if you want to use the full DOCK Blaster toolchain you can type<br />
<br />
make autonew<br />
<br />
Once you have the new phimap, edit your INDOCK to point to it instead of the old phimap (usually rec+sph.phi). You also have to use the new DOCK executable located in $DOCK_BASE/bin/Linux/dock.csh. The best way to do this is to use the following command instead of $mud/submit.csh:<br />
<br />
$mud/subdock.csh $DOCK_BASE/bin/Linux/dock.csh<br />
<br />
This should produce compatible OUTDOCK & test.eel1.gz files.<br />
<br />
== Recompiling DOCK 3.6 to use the new Qnifft grids ==<br />
<br />
If you're using a different version of DOCK 3.6 and want to change it to be compatible with Qnifft-produced grids, you only have to change one line. In max.h change <br />
<br />
parameter (nsize=179)<br />
<br />
to <br />
<br />
parameter (nsize=193)<br />
<br />
Then you have to run<br />
<br />
cd i386 ; make clean ; make ; make SIZE=32<br />
<br />
This produces new binaries for use with these grids.<br />
<br />
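The one-line change could also be scripted. The following is a minimal sketch, assuming the line in max.h looks exactly like the one shown above; set_nsize is a hypothetical helper that works on text in memory and does not touch any file.<br />

```python
import re

# Matches a Fortran-style "parameter (nsize=NNN)" statement, ignoring case
# and allowing extra spaces. This pattern is an assumption about max.h.
NSIZE_PATTERN = re.compile(
    r"(parameter\s*\(\s*nsize\s*=\s*)\d+(\s*\))", re.IGNORECASE
)


def set_nsize(source_text, new_size):
    """Return source_text with the nsize parameter value replaced.

    Raises ValueError unless exactly one nsize parameter is found, so an
    unexpected file layout is reported instead of silently edited.
    """
    def replace(match):
        return "%s%d%s" % (match.group(1), new_size, match.group(2))

    patched, count = NSIZE_PATTERN.subn(replace, source_text)
    if count != 1:
        raise ValueError("expected exactly one nsize parameter, found %d" % count)
    return patched


if __name__ == "__main__":
    before = "      parameter (nsize=179)\n"
    print(set_nsize(before, 193), end="")
```

After changing the value, the binaries still have to be rebuilt with the make commands given above.<br />
<br />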
== INDOCK file ==<br />
<br />
delphi_file ../../grids/rec+sph.qnifft.phi<br />
delphi_nsize 193<br />
<br />
== Backwards compatibility with the old grids ==<br />
<br />
In your INDOCK file, add a delphi_nsize parameter and set it to 179. This allows use of the old DelPhi grids instead of the 193-sized Qnifft grids.<br />
<br />
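To check which grid settings an INDOCK file carries, something like the following could be used. It is only a sketch: read_indock_keys is a hypothetical helper, and it assumes INDOCK lines are a keyword followed by whitespace and a value, with # starting a comment line.<br />

```python
def read_indock_keys(text, wanted=("delphi_file", "delphi_nsize")):
    """Return a dict of the wanted INDOCK keywords found in text.

    Values are returned as strings. Keywords that do not appear are
    simply absent from the result; no default is guessed for them.
    """
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in wanted:
            found[parts[0]] = parts[1].strip()
    return found


if __name__ == "__main__":
    sample = (
        "# electrostatics\n"
        "delphi_file ../../grids/rec+sph.qnifft.phi\n"
        "delphi_nsize 193\n"
    )
    print(read_indock_keys(sample))
```

A missing delphi_nsize entry is reported as absent rather than filled in, since the value DOCK assumes in that case depends on how the binary was compiled.<br />
<br />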
[[Category:DOCK]]</div>Rgchttp://wiki.docking.org/index.php?title=Qnifft_DOCK_3.6_conversion&diff=4209Qnifft DOCK 3.6 conversion2011-12-22T18:06:56Z<p>Rgc: </p>
<hr />
<div>Qnifft is a new option for use instead of DelPhi with DOCK 3.6. It is a Poisson-Boltzmann solver program from Kim Sharp [http://crystal.med.upenn.edu/software.html]. It has been integrated into the [[DOCK Blaster]] and [[DOCK 3.6]] toolchain. For now, if you make use of it, please cite:<br />
<br />
Sharp, K. A. 1995. Polyelectrolyte electrostatics: Salt dependence, entropic and enthalpic contributions to free energy in the nonlinear Poisson-Boltzmann model. Biopolymers 36:227-243. [http://dx.doi.org/10.1002/bip.360360210 10.1002/bip.360360210]<br />
<br />
and <br />
<br />
Gallagher, K., and K. A. Sharp. 1998. Electrostatic Contributions to Heat Capacity Changes of DNA-Ligand Binding. Biophys. J. 75:769-776.[http://dx.doi.org/10.1016/S0006-3495(98)77566-6 http://dx.doi.org/10.1016/S0006-3495(98)77566-6]<br />
<br />
<br />
== Using the new code ==<br />
<br />
A compiled qnifft binary is in $DOCK_BASE/bin/Linux/qnifft22_193_pgf_32<br />
<br />
Running qnifft requires setting your $DELDIR environment variable to $DOCK_BASE/src/qnifft<br />
<br />
The default way to run qnifft is to copy the qnifft.parm file from $DELDIR and run it by calling<br />
qnifft qnifft.parm<br />
<br />
If you're using DOCK Blaster, you can make the new electrostatic grids by typing:<br />
<br />
make grids/rec+sph.qnifft.phi<br />
<br />
or if you want to use the full DOCK Blaster toolchain you can type<br />
<br />
make autonew<br />
<br />
Once you have the new phimap, edit your INDOCK to point to it instead of the old phimap (usually rec+sph.phi). You also have to use the new DOCK executable located in $DOCK_BASE/bin/Linux/dock.csh. The best way to do this is to use the following command instead of $mud/submit.csh:<br />
<br />
$mud/subdock.csh $DOCK_BASE/bin/Linux/dock3.6_qnifft/dock.csh<br />
<br />
This should produce compatible OUTDOCK & test.eel1.gz files.<br />
<br />
== Recompiling DOCK 3.6 to use the new Qnifft grids ==<br />
<br />
If you're using a different version of DOCK 3.6 and want to change it to be compatible with Qnifft-produced grids, you only have to change one line. In max.h change <br />
<br />
parameter (nsize=179)<br />
<br />
to <br />
<br />
parameter (nsize=193)<br />
<br />
Then you have to run<br />
<br />
cd i386 ; make clean ; make ; make SIZE=32<br />
<br />
To produce new binaries for use with these grids.<br />
<br />
== INDOCK file ==<br />
<br />
delphi_file ../../grids/rec+sph.qnifft.phi<br />
delphi_nsize 193<br />
<br />
== Backwards compatibility with the old grids ==<br />
<br />
Edit your INDOCK file, add a delphi_nsize parameter and set it to 179. This allows use of old delphi grids instead of the 193 sized Qnifft grids.<br />
<br />
[[Category:DOCK]]</div>Rgchttp://wiki.docking.org/index.php?title=DOCK_3.6&diff=596DOCK 3.62011-12-22T18:06:10Z<p>Rgc: </p>
<hr />
<div>DOCK 3.5.54 is a version of [[UCSF]] [[DOCK 3]] that was used, developed and maintained by the [[Shoichet Lab]] from 1998 to May 2010. All docking papers from the Shoichet Lab during that time used this program. It was also used by [[DOCK Blaster]] until May 2010. <br />
<br />
DOCK 3.5.54 was superseded by [[DOCK 3.6]] in May 2010. DOCK 3.5.54 is no longer available.<br />
<br />
* Original documentation as [[Image:Dock3 5refman.pdf | PDF ]]<br />
<br />
= List of publications using DOCK 3.5.54 =<br />
* [http://shoichetlab.compbio.ucsf.edu/publications.php Shoichet Lab Publications Page]<br />
<br />
* [[Install DOCK 3.5.54]]<br />
<br />
* [[INDOCK_for_DOCK_3.6]]<br />
<br />
DOCK 3.6 is the current version of the [[DOCK 3]] series of docking programs used and developed in the [[Shoichet Lab]]. DOCK 3.6 is the engine used by [[DOCK Blaster]]. A sample [[INDOCK for DOCK 3.6]] is available.<br />
<br />
= Release Date = <br />
DOCK 3.6 was released on April 31, 2010. <br />
<br />
{{TOCright}}<br />
<br />
= Contributors = <br />
* Michael Mysinger - innovations in the calculation of ligand desolvation, as described in Mysinger MM, Shoichet BK, Rapid context-dependent ligand desolvation in molecular docking, J Chem Inf Model. 2010 Sep 27;50(9):1561-73<br />
* Michael Carchia - code optimizations<br />
* Ryan Coleman - internal code cleanup (see below) <br />
* Niu Huang & lab - ligand clustering code (see below)<br />
* Ryan Coleman & Kim Sharp - [[Qnifft DOCK 3.6 conversion]] replacement for delphi<br />
<br />
= Ligand Desolvation = <br />
* Improved calculation of ligand desolvation, as described in Mysinger MM, Shoichet BK, Rapid context-dependent ligand desolvation in molecular docking, J Chem Inf Model. 2010 Sep 27;50(9):1561-73<br />
<br />
= Optimizations = <br />
* Speed up of 300 to 500% depending on platform and case by Michael Carchia. <br />
* First version of DOCK 3 series to use 64-bit and 32-bit versions<br />
<br />
= Clean up = <br />
* Fast out-of-bound checking, by Ryan Coleman<br />
* Internal clash checking inside DOCK, by Ryan Coleman<br />
<br />
= Ligand Clustering =<br />
* Replacement for Single Mode<br />
* Contributed by Niu Huang's lab. <br />
* See [[Dock Ligand Clustering]] page for more information<br />
<br />
= Qnifft =<br />
* Replacement for Delphi, still Poisson Boltzmann electrostatics<br />
* Contributed by Kim Sharp & Ryan Coleman<br />
* See [[Qnifft DOCK 3.6 conversion]] page for more information.<br />
* Please cite http://dx.doi.org/10.1002/bip.360360210 and http://dx.doi.org/10.1016/S0006-3495(98)77566-6<br />
<br />
[[Category:Software]]<br />
[[Category:DOCK]]</div>Rgchttp://wiki.docking.org/index.php?title=CSD&diff=267CSD2011-12-20T22:45:37Z<p>Rgc: </p>
<hr />
<div>The Cambridge Structural Database contains crystallographic data on small organic molecules (much like the PDB). <br />
<br />
Log in to sgehead1 and run the following command to initialize your environment:<br />
<br />
$ source /raid3/software/csd/current.csh<br />
<br />
You can then run Conquest by executing the command:<br />
<br />
$ cq<br />
<br />
This runs the file /raid3/software/csd/current/bin/cq<br />
<br />
If you are prompted for a licence file, it is located at:<br />
<br />
/raid3/software/csd/current/csd/csd_licence.dat<br />
<br />
If ConQuest fails with a Tcl error about no display, you may have to log in with X forwarding enabled:<br />
<br />
ssh -Y sgehead<br />
<br />
The -Y option forwards your X session.<br />
<br />
If you want help using ConQuest, try the documentation:<br />
<br />
http://www.ccdc.cam.ac.uk/support/documentation/conquest/ConQuest/toc.html</div>Rgchttp://wiki.docking.org/index.php?title=INDOCK_for_DOCK_3.6&diff=3359INDOCK for DOCK 3.62011-12-15T23:13:42Z<p>Rgc: </p>
<hr />
<div>What follows is a documented sample INDOCK file for [[DOCK 3.6]]. Many lines are required; lines starting with # are comments.<br />
<br />
'''NOTE: do not under any circumstances use tab characters in this file.'''<br />
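<br />
Because a stray tab is easy to miss by eye, a quick check before a run can help. The following is a minimal Python sketch, not part of DOCK; the function name and the two-line sample text are illustrative assumptions.<br />
<br />
```python
def find_tab_lines(text):
    """Return the 1-based numbers of lines that contain a tab character."""
    return [number for number, line in enumerate(text.splitlines(), start=1)
            if "\t" in line]


# Hypothetical two-line excerpt; the second line contains a tab on purpose.
sample = "receptor_sphere_file ../../sph/match2.sph\ncluster_numbers\t1\n"
for number in find_tab_lines(sample):
    print(f"tab character found on line {number}")
```
<br />
To check a real file, read its contents into a string and pass that string to the function.<br />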
<br />
Required first line:<br />
<br />
DOCK 3.5 parameter<br />
###############################################################################<br />
################## DOCK 3.5 INPUT PARAMETERS 2011/10/26 #######################<br />
###############################################################################<br />
###############################################################################<br />
# INPUT/OUTPUT<br />
#<br />
<br />
This is the path to the receptor matching spheres file. Most scripts make a set of directories and copy the INDOCK file into them, so this path sometimes has an extra set of "../" in it compared to what you might think. If you use [[DOCK Blaster]]. Generally, match3 has more spheres than match2, so produces more possible orientations. These spheres are matched to ligand spheres, generated from heavy atoms in the "rigid component" of each ligand. For more about the rigid component, see [[Flexibase Format]].<br />
<br />
receptor_sphere_file ../../sph/match2.sph<br />
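<br />
To see why the extra "../" appears, it can help to resolve the relative path from the directory where DOCK actually runs. This is a small Python sketch; the directory names /work/project and runs/chunk01 are hypothetical and only illustrate a layout in which the INDOCK file has been copied two levels below the project directory.<br />
<br />
```python
import posixpath


def resolve_from_run_dir(run_dir, relative_path):
    """Resolve a path from INDOCK relative to the directory DOCK runs in."""
    return posixpath.normpath(posixpath.join(run_dir, relative_path))


# Hypothetical layout: the run directory sits two levels below the project.
resolved = resolve_from_run_dir("/work/project/runs/chunk01",
                                "../../sph/match2.sph")
print(resolved)
```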
<br />
The next line is always 1, and is marked for deprecation.<br />
<br />
cluster_numbers 1<br />
<br />
The next line specifies which ligand file to use. Many of the automated scripts use split_database_index, which allows one or more ligand files to be listed in the split_database_index file and read one after another during a DOCK run. If you are docking a few ligands on your own, you can change this to any ligand file.<br />
<br />
# NOTE: split_database_index is reserved to specify a list of files<br />
ligand_atom_file split_database_index<br />
<br />
This controls the prefix of the output files; again, many of the automated scripts expect it to be "test." (including the trailing period). OUTDOCK files are always named OUTDOCK.<br />
<br />
output_file_prefix test.<br />
<br />
This controls the random seed used in the minimization procedure. Changing this will produce slightly different results.<br />
<br />
random_seed 777<br />
#<br />
###############################################################################<br />
# MATCHING<br />
#<br />
<br />
distance_tolerance is the largest difference allowed between the distance separating a pair of receptor matching spheres and the distance separating a pair of ligand matching spheres for the two pairs to still be considered matched.<br />
<br />
distance_tolerance 1.5<br />
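<br />
The comparison described above can be sketched in a few lines of Python. This is an illustration only, not the DOCK source: the function name and coordinates are made up, and whether DOCK treats a difference exactly equal to the tolerance as a match is an assumption here.<br />
<br />
```python
import math


def pair_is_compatible(rec_a, rec_b, lig_a, lig_b, tolerance=1.5):
    """Compare one receptor sphere pair with one ligand sphere pair.

    The pairs are compatible when their internal distances differ by no
    more than the tolerance (inclusive comparison is an assumption).
    """
    receptor_distance = math.dist(rec_a, rec_b)
    ligand_distance = math.dist(lig_a, lig_b)
    return abs(receptor_distance - ligand_distance) <= tolerance


# Receptor spheres 4 A apart, ligand spheres 5 A apart: difference of 1 A.
print(pair_is_compatible((0, 0, 0), (4, 0, 0), (0, 0, 0), (5, 0, 0)))
```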
<br />
This sets how many spheres must be matched to generate an orientation. A minimum of 3 and a maximum of 4 is generally accepted as the right setting. Fewer than 3 matches are too degenerate to generate an actual orientation, and requiring more than 4 matched spheres does not work well, since we only use heavy atoms in ring systems to generate ligand matching spheres.<br />
<br />
nodes_maximum 4<br />
nodes_minimum 3<br />
<br />
The next 4 parameters control how the histograms of distance differences are generated. The binsize is how big the bins are; the overlap controls whether a sphere can be put into multiple bins. The ligand and receptor parameters are not required to be the same.<br />
<br />
ligand_binsize 0.4<br />
ligand_overlap 0.2<br />
receptor_binsize 0.4<br />
receptor_overlap 0.2<br />
<br />
Bumping is a quick distance check, performed when ligand atoms are placed in the binding site, to determine whether they have a steric clash. The maximum is how many atoms can be 'bumped' (in close steric contact) per rigid or flexible component of the ligand, as per the [[Flexibase Format]]. Even ligands with some steric clashes can sometimes be rescued by minimization. Setting this number very high will cause many clashed orientations to be scored, which can be prohibitively slow.<br />
<br />
bump_maximum 1<br />
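<br />
As a rough illustration of how the limit is applied, the Python sketch below counts bumped atoms in one ligand component and compares the count with bump_maximum. It is not the DOCK implementation: the per-atom flags are assumed to come from some earlier grid lookup that is not shown, and the function name is hypothetical.<br />
<br />
```python
def passes_bump_check(bumped_flags, bump_maximum=1):
    """Accept a ligand component when at most bump_maximum atoms are bumped.

    bumped_flags holds one boolean per atom; how each flag is derived
    (for example from a distance grid) is outside this sketch.
    """
    bumped_count = sum(1 for flag in bumped_flags if flag)
    return bumped_count <= bump_maximum


# One bumped atom out of four is tolerated with bump_maximum 1.
print(passes_bump_check([False, True, False, False]))
```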
<br />
The next four parameters are unused and unsupported.<br />
<br />
focus_cycles 0<br />
focus_bump 0 <br />
focus_type energy<br />
critical_clusters no<br />
#<br />
###############################################################################<br />
# COLORING<br />
#<br />
<br />
This controls whether chemical matching or coloring is used at all. If yes, many match lines are necessary. These may not be perfect, but [[DOCK Blaster]] has been using these for a long time. Setting this to no produces many more matched orientations, which can be slow, but can help you understand exactly what the energy function is doing.<br />
<br />
chemical_matching yes<br />
case_sensitive no<br />
# ligand color, receptor color<br />
match positive negative<br />
match positive negative_or_acceptor<br />
match positive not_neutral<br />
match negative positive<br />
match negative positive_or_donor<br />
match negative not_neutral<br />
match donor acceptor<br />
match donor donacc<br />
match donor negative_or_acceptor<br />
match donor neutral_or_acceptor_or_donor<br />
match donor not_neutral<br />
match acceptor donor<br />
match acceptor donacc<br />
match acceptor positive_or_donor<br />
match acceptor neutral_or_acceptor_or_donor<br />
match acceptor not_neutral<br />
match neutral neutral<br />
match neutral neutral_or_acceptor_or_donor<br />
match ester_o donor<br />
match ester_o donacc<br />
match ester_o positive_or_donor<br />
match ester_o not_neutral<br />
match amide_o donor<br />
match amide_o donacc<br />
match amide_o positive_or_donor<br />
match amide_o not_neutral<br />
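<br />
One way to read the match lines is as a lookup table from a ligand color to the set of receptor colors it may pair with. The Python sketch below builds such a table from a short excerpt; it is an illustration with hypothetical function names, not the parser DOCK uses. Colors are lowercased to mirror "case_sensitive no".<br />
<br />
```python
def parse_match_lines(text):
    """Map each ligand color to the set of receptor colors it may match."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # A match line has the form: match <ligand color> <receptor color>
        if len(fields) == 3 and fields[0] == "match":
            table.setdefault(fields[1].lower(), set()).add(fields[2].lower())
    return table


def colors_match(table, ligand_color, receptor_color):
    """Return True when the table lists this ligand/receptor color pair."""
    return receptor_color.lower() in table.get(ligand_color.lower(), set())


excerpt = """
match positive negative
match positive negative_or_acceptor
match donor acceptor
"""
match_table = parse_match_lines(excerpt)
print(colors_match(match_table, "positive", "negative"))
print(colors_match(match_table, "donor", "negative"))
```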
<br />
Single mode is deprecated and these parameters no longer work. See [[Dock Ligand Clustering]].<br />
#<br />
###############################################################################<br />
# SINGLE MODE<br />
#<br />
#rmsd_override 0.0<br />
#contact_minimum 0<br />
#energy_maximum 1.0e+6<br />
##truncate_output 1000.0<br />
#<br />
<br />
Search mode is now the default/only mode of docking. Each parameter is described below.<br />
<br />
###############################################################################<br />
# SEARCH MODE<br />
#<br />
<br />
The ratio_minimum parameter has been slated for deprecation.<br />
<br />
ratio_minimum 0.0<br />
<br />
These parameters control how many atoms are necessary in the ligand for it to be docked.<br />
<br />
atom_minimum 5 <br />
atom_maximum 100<br />
<br />
How many of the top molecules will be saved in the output test.* file. <br />
<br />
number_save 50000<br />
<br />
The maximum number of molecules that will be scored in any given run.<br />
<br />
molecules_maximum 300000 <br />
<br />
How many molecules will be skipped. This feature currently does not work.<br />
<br />
initial_skip 0<br />
<br />
How long a molecule is processed before quitting. This feature currently may not work as expected.<br />
<br />
timeout 180<br />
<br />
There are many scoring options:<br />
<br />
# <br />
###############################################################################<br />
# SCORING<br />
#<br />
<br />
Valid options for ligand_desolvation are 'volume' (partial desolvation a la Mysinger & Shoichet 2010), 'full' meaning that the entire ligand is assumed to be desolvated in the binding site and 'none', where no desolvation penalties are applied.<br />
<br />
ligand_desolvation volume<br />
<br />
See the note about relative paths for the matching spheres above; the same comments apply here. There are 2 ways to run 'volume' or partial desolvation. One is to use one grid for every ligand atom like this:<br />
<br />
solvmap_file ../../grids/solvmap_sev<br />
<br />
The other option is to use one grid for ligand heavy atoms and one for ligand hydrogen atoms. To use it, uncomment these lines and comment out the other solvmap_file line.<br />
<br />
#solvmap_file ../../grids/solvmap_sev.heavy<br />
#hydrogen_solvmap_file ../../grids/solvmap.sev.hydrogen<br />
<br />
This is the phimap file used for electrostatic scoring. For a better understanding of this grid, see [[Visualizing delphi]]. Sometimes this will change if you are using the new Qnifft Delphi maps, see [[Qnifft DOCK 3.6 conversion]].<br />
<br />
delphi_file ../../grids/rec+sph.phi<br />
delphi_nsize 179<br />
<br />
This sets the prefix of the chemgrid files, which contain the van der Waals scoring for every coordinate (the file chem.vdw is read) as well as the distance map grids used to detect bumps (the file chem.bmp is read).<br />
<br />
chemgrid_file_prefix ../../grids/chem<br />
<br />
This is the parameter file that contains the atom type definitions:<br />
<br />
vdw_parameter_file ../../grids/vdw.parms.amb.mindock<br />
<br />
The following options allow the electrostatic and van der Waals terms to be scaled relative to each other and to the solvation score.<br />
<br />
electrostatic_scale 1.0<br />
vdw_scale 1.0<br />
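<br />
The effect of the two scale factors can be sketched as a weighted sum. The Python below assumes the total score is simply the scaled electrostatic term plus the scaled van der Waals term plus an unscaled desolvation term; that additive form and the function name are assumptions made for illustration, not taken from the DOCK source.<br />
<br />
```python
def combined_score(electrostatic, vdw, desolvation,
                   electrostatic_scale=1.0, vdw_scale=1.0):
    """Weighted sum of score terms (assumed additive form).

    Only the electrostatic and van der Waals terms are scaled, so each
    scale factor sets that term's weight relative to desolvation.
    """
    return (electrostatic_scale * electrostatic
            + vdw_scale * vdw
            + desolvation)


# Halving the electrostatic weight changes the total from -25.0 to -20.0.
print(combined_score(-10.0, -20.0, 5.0))
print(combined_score(-10.0, -20.0, 5.0, electrostatic_scale=0.5))
```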
<br />
The following parameter lets ligands with internal steric clashes attempt to find a ligand conformation that scores well but does not have any internal clashes. Sometimes this procedure will fail in circumstances where there are many flexible branches, or where a ligand that is too large for the binding site is being docked.<br />
<br />
check_clashes yes<br />
<br />
If set to yes, this removes the positive solvation from each ligand atom and spreads it evenly over the molecule. This is deprecated because it does unexpected things to solvation, and will be removed entirely soon.<br />
<br />
remove_positive_solvation no<br />
<br />
After each orientation of the rigid component is processed and the many ligand conformations have been examined, the best ligand conformation for that orientation can be minimized using the following parameters.<br />
<br />
#<br />
###############################################################################<br />
# MINIMIZATION<br />
#<br />
<br />
Setting this to no turns off minimization completely.<br />
<br />
minimize yes<br />
<br />
Don't minimize molecules that score above the minimization_max.<br />
<br />
minimization_max 1.0e15<br />
<br />
If set to yes, this checks to see if the orientation has already been scored and quits. This has not been tested recently.<br />
<br />
check_degeneracy no<br />
<br />
How many iterations of minimization to do. More means longer run times, but potentially better poses.<br />
<br />
simplex_iterations 250<br />
<br />
How much the total energy can change and still be considered converged. Setting this higher makes the minimizer stop sooner; setting it lower causes more iterations before convergence (or until the iteration maximum above is reached).<br />
<br />
simplex_convergence 0.1<br />
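<br />
The interplay between simplex_iterations and simplex_convergence can be illustrated with a simplified stopping rule. The Python sketch below assumes convergence is judged by the energy change between successive iterations; the real simplex minimizer may use a different criterion, and the energy values are invented for the example.<br />
<br />
```python
def iterations_until_stop(energies, convergence=0.1, max_iterations=250):
    """Return how many iterations run before the minimizer would stop.

    energies[0] is the starting energy and energies[i] is the energy
    after iteration i. Stopping happens when the change between two
    successive energies is at most `convergence` (assumed criterion) or
    when max_iterations is reached.
    """
    last = min(len(energies) - 1, max_iterations)
    for i in range(1, last + 1):
        if abs(energies[i - 1] - energies[i]) <= convergence:
            return i
    return max(0, last)


# Hypothetical energy trace: the change first drops below 0.1 at iteration 3.
trace = [10.0, 5.0, 4.0, 3.95, 3.94]
print(iterations_until_stop(trace))
```
<br />
With a larger convergence value the same trace stops earlier, and with a small iteration maximum it stops before converging.<br />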
<br />
If the energy changes by this much, restart the minimizer from this newest position.<br />
<br />
simplex_restart 1.0<br />
<br />
This is the initial distance in angstroms the molecule is translated (note that translation and rotation used to be swapped for many releases of DOCK).<br />
<br />
simplex_initial_translation 0.2<br />
<br />
How many degrees of initial rotation are done.<br />
<br />
simplex_initial_rotation 5.0<br />
#<br />
###############################################################################<br />
###############################################################################</div>Rgc