Automated Database Preparation: Difference between revisions

Revision as of 21:17, 29 October 2008

Automated Docking Database Tools

#Automatic Database Generation: generate your own hierarchy databases as ligand inputs to DOCK 3.5.
#Automatic Decoy Generation: generate DUD-style decoys from your set of input ligands.

Automatic Database Generation

Most scripts in this section are put in your path automatically by the DOCK login scripts. If they are not, run the following inside csh first:

setenv DOCK_BASE /raid1/soft/dockenv
source $DOCK_BASE/etc/login
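
To confirm that the tools really are on your path, a quick check along the following lines may help. This is only a sketch, written in POSIX sh rather than csh, and have_cmd is a hypothetical helper name that is not part of the DOCK scripts.

```shell
# Hypothetical helper (not part of DOCK): succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd dbgen.csh; then
    echo "dbgen.csh found on PATH"
else
    echo "dbgen.csh not found; source the DOCK login script first"
fi
```

Inside csh itself, the builtin `which dbgen.csh` gives the same information.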

Simple Database Generation

Use this procedure for automated database generation on small input files (fewer than about 5000 molecules).

mkdir new_dir_name
cd new_dir_name
dbgen.csh INPUT
or
dbgen.csh INPUT [PROTONATION]

Options:

Making the new directory is a good idea because these scripts generate a LOT of output files.

INPUT is a file containing the ligand molecules: either a .smi file containing lines of SMILES strings and ids, or some other file type easily converted to SMILES (e.g. a multi-molecule .mol2 or .sdf file).

The optional PROTONATION argument can be used to generate databases containing extended protonation states. The available protonation types are as follows:

ref - only the reference protonation
mid - reference plus middle protonation [default]
lo - reference, middle, and lo protonation
hi - reference, middle, and hi protonation
all - all protonation ranges
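
If you wrap dbgen.csh in your own script, it may be worth rejecting a mistyped protonation type before starting a long run. The sketch below is written in POSIX sh and makes some assumptions: valid_protonation is a hypothetical helper, the file name example.smi and the ids in the comment are invented for illustration, and the SMILES-then-id column order is assumed. The sketch only prints the command it would run.

```shell
# Assumed .smi layout: one molecule per line, SMILES string then an id, e.g.
#   CCO mol0001
#   c1ccccc1 mol0002

# Hypothetical helper: accept only the protonation types listed above.
valid_protonation() {
    case "$1" in
        ref|mid|lo|hi|all) return 0 ;;
        *) return 1 ;;
    esac
}

protonation="mid"   # mid is the documented default
if valid_protonation "$protonation"; then
    echo "would run: dbgen.csh example.smi $protonation"
else
    echo "unknown protonation type: $protonation" >&2
fi
```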

Caveats:

dbgen.csh is most useful when you want to test a dockable database without worrying about ZINC. If you like the molecules and later decide to add them to ZINC, this should be straightforward using the output of dbgen.csh, but it is untested at the moment, so please contact me (Michael Mysinger) before you try it. If you want to add the molecules to ZINC from the beginning, you can use the XML-RPC interface of DOCKBlaster like so:

xmlclient.py upload my.smi   # uploads ligands to server
xmlclient.py qup ID          # later on, get docking database back

where ID is the job id returned by the upload command.

Complex Database Generation

If you want to do automated database generation on a large scale (more than about 5000 molecules), read this section. First, note that this process is demanding: it has been known to fill all space on the file servers, slam them into submission, or overload the entire SGE cluster. For estimation purposes, assume the process takes ~40 GB of disk per 100k molecules.
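
The disk estimate can be worked out before starting a run. The following is a rough sketch in POSIX sh: count_molecules and estimate_disk_gb are hypothetical helper names, the count assumes one molecule per non-blank line of a .smi file, and the figure is only the ~40 GB per 100k molecules rule of thumb, rounded up to a whole GB.

```shell
# Hypothetical helper: count molecules in a .smi file (one per non-blank line).
count_molecules() {
    grep -c '[^[:space:]]' "$1"
}

# Hypothetical helper: rough disk need in GB, rounded up,
# using the ~40 GB per 100k molecules rule of thumb.
estimate_disk_gb() {
    echo $(( ($1 * 40 + 99999) / 100000 ))
}

estimate_disk_gb 250000   # prints 100

# If an input file is present, report its size and the estimate.
if [ -f my.smi ]; then
    n=$(count_molecules my.smi)
    echo "my.smi: $n molecules, roughly $(estimate_disk_gb "$n") GB of disk"
fi
```

Comparing the result with the free space reported by `df` on the target file system gives a quick sanity check before anything is submitted to the cluster.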

Automatic Decoy Generation

