Ucsfdock

From DISI
''ucsfdock'' is a Python package wrapping the [[DOCK|DOCK program]] that provides tools to help standardize and automate the computational methods employed in molecular docking.
See [[pydock3]].


[[Category:Deprecated]]

Programs:
* blastermaster: generate a specific docking configuration for a given receptor and ligand
* dockmaster: evaluate many different docking configurations in parallel using a specified job scheduler (e.g. Slurm)
 
A '''docking configuration''' is a unique set of DOCK parameter files and INDOCK parameter values.
 
= Installation =
 
TODO
 
= Instructions =
 
== Note for UCSF Shoichet Lab members ==
 
''ucsfdock'' is already installed on the following clusters. You can source the provided Python environment scripts to expose the relevant executables:
 
=== ''Wynton'' ===
 
source /wynton/home/irwin/isknight/envs/python3.8.5.sh
 
=== Gimel ===
 
Only nodes other than 'gimel' itself are supported, e.g., 'gimel5'.
 
source /nfs/soft/ian/python3.8.5.sh
 
== blastermaster ==
 
''blastermaster'' allows the generation of a specific docking configuration for a given receptor and ligand.
 
'''Note:''' Invoking ''blastermaster'' commands below will produce a log file called ''blastermaster.log'' in your current working directory.
 
=== blastermaster configure ===
 
First you need to create the directory for your blastermaster job. To do so, simply type
 
blastermaster configure
 
By default, the job directory is named ''blastermaster_job''. To specify a different name, type
 
blastermaster configure <JOB_DIR_NAME>
 
The job directory contains two sub-directories:
# ''working'': input files, intermediate blaster files, sub-directories for individual blastermaster subroutines
# ''dockfiles'': output files (DOCK parameter files & INDOCK)
 
If your current working directory contains any of the following files, then they will be automatically copied into the working directory within the created job directory.
 
* ''rec.pdb''
* ''xtal-lig.pdb''
* ''rec.crg.pdb''
* ''reduce_wwPDB_het_dict.txt''
* ''filt.params''
* ''radii''
* ''amb.crg.oxt''
* ''vdw.siz''
* ''delphi.def''
* ''vdw.parms.amb.mindock''
* ''prot.table.ambcrg.ambH''
 
This feature is intended to simplify the process of configuring the blastermaster job. If you would like to use files not present in your current working directory, copy them into your job's working directory, e.g.:
cp <FILE_PATH> <JOB_DIR_NAME>/working/
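With many inputs, a small shell loop can save repeated ''cp'' commands. The sketch below is a convenience, not part of ''ucsfdock''; ''SRC_DIR'' and ''JOB_DIR'' are hypothetical placeholders for your own paths (the ''mkdir -p'' is only defensive, since ''blastermaster configure'' normally creates the job directory):

```shell
# Convenience sketch (not part of ucsfdock): copy whichever of the standard
# input files exist in a source directory into the job's working directory.
SRC_DIR="${SRC_DIR:-./inputs}"            # placeholder: where your inputs live
JOB_DIR="${JOB_DIR:-blastermaster_job}"   # placeholder: the configured job dir
mkdir -p "$SRC_DIR" "$JOB_DIR/working"    # defensive; configure normally makes these
for f in rec.pdb xtal-lig.pdb rec.crg.pdb reduce_wwPDB_het_dict.txt \
         filt.params radii amb.crg.oxt vdw.siz delphi.def \
         vdw.parms.amb.mindock prot.table.ambcrg.ambH; do
    if [ -f "$SRC_DIR/$f" ]; then
        cp "$SRC_DIR/$f" "$JOB_DIR/working/"
    fi
done
```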
 
Finally, configure the ''blastermaster_config.yaml'' file in the job directory to your specifications. The parameters in this file govern the behavior of blastermaster.
 
=== blastermaster run ===
 
Once your job has been configured to your liking, navigate to the job directory and run blastermaster:
cd <JOB_DIR_NAME>
blastermaster run
 
This will execute the many blastermaster subroutines in sequence. The state of the program will be printed to standard output as it runs.
 
== dockmaster ==
 
''dockmaster'' allows the evaluation of many different docking configurations in parallel using a specified job scheduler (e.g. Slurm).
 
The name "dockmaster", aside from being an uncreative rehash of the name "blastermaster", derives from the notion of a literal dockmaster, i.e., the person in charge of a dock who manages freight logistics and bosses around numerous dockworkers. In this analogy, a single dockworker corresponds to the processing of a single docking configuration.
 
'''Note:''' Invoking ''dockmaster'' commands will produce a log file called ''dockmaster.log'' in your current working directory.
 
=== dockmaster configure ===

First you need to create the directory for your dockmaster job. To do so, simply type
 
dockmaster configure
 
By default, the job directory is named ''dockmaster_job''. To specify a different name, type
 
dockmaster configure <JOB_DIR_NAME>
 
The job directory contains two sub-directories:
# ''working'': input files, intermediate blaster files, sub-directories for individual blastermaster subroutines
# ''retro_docking'': individual retro docking jobs for each docking configuration
 
The key difference between the working directories of ''blastermaster'' and ''dockmaster'' is that the working directory of ''dockmaster'' may contain multiple variants of the blaster files (prefixed by a number, e.g. "1_box"). These variant files are used to create the different docking configurations specified by the multi-valued entries of ''dockmaster_config.yaml''. They are created efficiently, such that the same variant used in multiple docking configurations is not created more than once.
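The prefix scheme can be illustrated with a toy directory (all names below are fabricated for demonstration; numbered files are configuration-specific variants, unprefixed files are shared):

```shell
# Illustration only: build a mock working directory with two numbered
# variants of "box" plus one shared, unprefixed input file.
mkdir -p demo_working
touch demo_working/1_box demo_working/2_box demo_working/rec.pdb
# Select only the numbered, configuration-specific variants:
ls demo_working | grep -E '^[0-9]+_'
```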
 
If your current working directory contains any of the following files, then they will be automatically copied into the working directory within the created job directory.
 
* ''rec.pdb''
* ''xtal-lig.pdb''
* ''rec.crg.pdb''
* ''reduce_wwPDB_het_dict.txt''
* ''filt.params''
* ''radii''
* ''amb.crg.oxt''
* ''vdw.siz''
* ''delphi.def''
* ''vdw.parms.amb.mindock''
* ''prot.table.ambcrg.ambH''
 
This feature is intended to simplify the process of configuring the dockmaster job. If you would like to use files not present in your current working directory, copy them into your job's working directory, e.g.:
cp <FILE_PATH> <JOB_DIR_NAME>/working/
 
Finally, configure the ''dockmaster_config.yaml'' file in the job directory to your specifications. The parameters in this file govern the behavior of dockmaster.
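Multi-valued entries in this config are what generate multiple docking configurations. The fragment below is purely illustrative; the parameter names are hypothetical placeholders, not the real schema, so consult the keys in your generated ''dockmaster_config.yaml'':

```yaml
# Hypothetical sketch only -- these keys are placeholders, not real parameters.
# A list-valued entry asks dockmaster to build one docking configuration per
# value; a single-valued entry is shared by every configuration.
some_box_parameter: [10.0, 12.0]   # two values -> two docking configurations
some_charge_parameter: 0.4         # shared by all configurations
```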
 
=== environment variables ===
 
Designate where the short cache and long cache should be located. E.g.:
 
export SHRTCACHE=/dev/shm  # temporary storage for job files
export LONGCACHE=/dev/shm  # long-term storage for files shared between jobs
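Before submitting jobs, a quick check that both cache locations exist and are writable can catch path typos early. This check is a convenience sketch, not part of ''ucsfdock'':

```shell
# Optional sanity check: verify both cache locations exist and are writable.
SHRTCACHE="${SHRTCACHE:-/dev/shm}"   # default matches the example above
LONGCACHE="${LONGCACHE:-/dev/shm}"
for d in "$SHRTCACHE" "$LONGCACHE"; do
    if [ -d "$d" ] && [ -w "$d" ]; then
        echo "$d: ok"
    else
        echo "$d: missing or not writable"
    fi
done
```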
 
For ''dockmaster'' to know which scheduler it should use, configure the following environment variables according to which job scheduler your cluster has.
 
==== Slurm ====
 
E.g., on the UCSF Shoichet Lab Gimel cluster (on any node other than 'gimel' itself, such as 'gimel5'):
 
export SBATCH_EXEC=/usr/bin/sbatch
export SQUEUE_EXEC=/usr/bin/squeue
 
==== SGE ====
 
E.g., on the UCSF Wynton cluster:
 
export QSTAT_EXEC=/opt/sge/bin/lx-amd64/qstat
export QSUB_EXEC=/opt/sge/bin/lx-amd64/qsub
 
The following is necessary on the UCSF Wynton cluster:
 
export SGE_SETTINGS=/opt/sge/wynton/common/settings.sh
 
On most clusters, this will probably be:
export SGE_SETTINGS=/opt/sge/default/common/settings.sh
 
=== dockmaster run ===
 
Once your job has been configured to your liking, navigate to the job directory and run dockmaster:
cd <JOB_DIR_NAME>
dockmaster run <JOB_SCHEDULER_NAME>
 
where <JOB_SCHEDULER_NAME> is one of:
* ''sge''
* ''slurm''
 
This will execute the many dockmaster subroutines in sequence, including the submission of retro docking jobs for all docking configurations. The state of the program will be printed to standard output as it runs.
 
Once the dockmaster job is complete, the following files will be generated in the job directory:
* ''dockmaster_job_report.pdf'': contains (1) roc.png of best retro docking job, (2) box plots for every multi-valued config parameter, and (3) heatmaps for every pair of multi-valued config parameters
* ''dockmaster_job_results.csv'': enrichment metrics for each docking configuration
 
In addition, the best retro docking job will be copied to its own sub-directory ''best_retro_docking_job/''.
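The results CSV can be inspected with ordinary shell tools. The example below uses fabricated data and a hypothetical ''enrichment'' column name; check the header of your own ''dockmaster_job_results.csv'' for the real columns:

```shell
# Illustration with made-up data: rank docking configurations by a metric
# column, best first. Column names here are hypothetical stand-ins.
printf 'configuration,enrichment\n3,0.81\n1,0.92\n2,0.75\n' > demo_results.csv
head -n 1 demo_results.csv                        # show the header
tail -n +2 demo_results.csv | sort -t, -k2 -rn    # best configuration first
```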
 
Within each retro docking job directory, there are the following files and sub-directories:
* ''working/'': intermediate files
* ''dockfiles/'': parameter files and INDOCK for the given docking configuration
* ''output/'': contains sub-directories ''1/'' for actives and ''2/'' for decoys, each containing OUTDOCK and test.mol2 files
* ''retro_docking_job_results.csv'': data loaded from OUTDOCK files for both actives and decoys
* ''roc.png'': the ROC enrichment curve (log-scaled x-axis) for the given docking configuration
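As a quick sanity check that both the actives and decoys runs produced output, you can count OUTDOCK files under ''output/''. The demo below builds a mock layout with fabricated paths:

```shell
# Demo with a mock retro docking job layout: output/1 holds actives results,
# output/2 holds decoys results; all paths are fabricated for illustration.
mkdir -p demo_job/output/1 demo_job/output/2
touch demo_job/output/1/OUTDOCK demo_job/output/2/OUTDOCK
# Expect one OUTDOCK per output sub-directory:
find demo_job/output -name OUTDOCK | wc -l
```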

Latest revision as of 22:56, 24 May 2024