Juggler

Goal

To optimize your matching sphere (MS) setup so that docking runs faster and yields more high-scoring ligands with fewer spheres.

Description

The program optimizes matching spheres by pruning followed by stochastic optimization. It selects spheres from two sets:

heavy atoms of xtal-lig
spheres prepared by SPHGEN program

Juggler generates an initial MS set of 100 spheres (the maximum in DOCK 3.8). This set is used for retrospective docking, and a KD-tree algorithm then prunes it to the required number of spheres by discarding all spheres that were not used to generate the poses of the known binders (“actives”). The procedure is repeated to account for any changes in matching caused by reducing the MS set.
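
The pruning step can be illustrated with a minimal sketch. It is not Juggler's actual code: it assumes that a sphere counts as "used" when it lies within a cutoff distance of a heavy atom in a docked pose of an active, and it replaces the KD-tree lookup with a brute-force distance check for brevity. The function name, the cutoff value, and the coordinates are hypothetical.

```python
import math

def prune_unused_spheres(spheres, active_pose_atoms, cutoff=1.0):
    """Keep spheres lying within `cutoff` angstroms of any pose atom.

    Brute-force stand-in for a KD-tree query; the proximity criterion
    and the cutoff value are assumptions made for illustration.
    """
    kept = []
    for sphere in spheres:
        if any(math.dist(sphere, atom) <= cutoff for atom in active_pose_atoms):
            kept.append(sphere)
    return kept

# Hypothetical sphere centers and heavy atoms of docked active poses (x, y, z)
spheres = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 8.0, 0.0)]
pose_atoms = [(0.2, 0.1, 0.0), (5.0, 0.5, 0.0)]

kept = prune_unused_spheres(spheres, pose_atoms)
print(kept)  # the third sphere is far from every pose atom and is dropped
```

For large sphere and atom sets a KD-tree makes the same neighbor query much cheaper than this double loop.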

After this, the resulting set is passed to the stepwise optimization procedure, which applies random perturbations to the sphere sets. Retrospective docking is run for each set, and the sets are ranked by

enrichment (normalized logAUC, see Ian's paper),
the average score of the top 1% of ligands.
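
The two ranking criteria can be sketched as follows. This is an illustration under stated assumptions, not the ranking code from juggler.py: it assumes that lower (more negative) docking scores are better, that the nlogAUC values are already computed elsewhere, and that sets are compared by nlogAUC first with the average top-1% score as a tie-breaker. The function names and the numbers are hypothetical.

```python
def mean_top_fraction(scores, fraction=0.01):
    """Average of the best `fraction` of docking scores (lower is better)."""
    ranked = sorted(scores)
    n_top = max(1, round(len(ranked) * fraction))  # always keep at least one ligand
    top = ranked[:n_top]
    return sum(top) / len(top)

def rank_sets(metrics):
    """Order set IDs by nlogAUC (descending), then mean top score (ascending).

    `metrics` maps a set ID to a (nlogAUC, mean_top_score) tuple.
    The lexicographic ordering is an assumption made for this sketch.
    """
    return sorted(metrics, key=lambda sid: (-metrics[sid][0], metrics[sid][1]))

# Hypothetical docking scores for 200 ligands: the top 1% is the best 2 ligands
scores = [-50.0, -45.0] + [-20.0] * 198
top_mean = mean_top_fraction(scores)

# Hypothetical per-set metrics: (nlogAUC, mean score of the top 1%)
metrics = {"set_a": (25.0, -40.0), "set_b": (30.0, -38.0), "set_c": (30.0, -42.0)}
order = rank_sets(metrics)
print(top_mean, order)
```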

The program consists of two main modules:

a Python script (juggler.py) that performs MS generation, optimization, and ranking.
a Bash script (rundockd.sh) that watches the created directory structure, runs docking, and processes the docking results.

Setup & Running

Setup

Dependencies: rdkit, pydock, subdock.

Preparation

What you need to prepare:

dockfiles directory, prepared with any tool of your liking (blastermaster, dockopt, etc.).
rec.crg.pdb
xtal-lig.pdb: to compute the RMSD of docked xtal-lig poses to the experimental pose, your xtal-lig.pdb must have correct bond orders and atom valences. You can edit it in Schrodinger and save it as xtal-lig.pdb.
ligands.names
decoys.names
sdi file with the paths to the ligand .tgz files.

Prepare a juggler_config.yml file and put it into an empty directory.

################################################
# Paths for your target
rec_crg_file_path: "/test/rec.crg.pdb"
xtal_lig_file_path: "/test/xtal-lig.pdb"
dock_files_dir_path: "/test/dockfiles"
lig_names_file_path: "/test/ligands.names"
dec_names_file_path: "/test/decoys.names"
sdi_file_path: "/test/ligands_sdi"

################################################
# Executables and running
dockbase: "/path/to/DOCK"
dock64_bin: "/path/to/dock64"
subdock_bash_file_path: "/path/to/subdock.bash"
queue_type: "sge" # "slurm" or "sge"

###############################################
# Max and min number of spheres
min_sph: 4 # min is 4
max_sph: 10 # max is 100

The dock64_bin parameter is optional; if it is absent, {dockbase}/docking/DOCK/bin/dock64 is used.
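
The fallback rule and the sphere limits from the config comments can be sketched as below. This is only an illustration of the documented behavior, not the code in juggler.py. The helper names are hypothetical, and the config is given as a plain dictionary (in practice it would be read from juggler_config.yml, for example with a YAML parser).

```python
import posixpath

def resolve_dock64(config):
    """Return dock64_bin if set, else the default path under dockbase."""
    explicit = config.get("dock64_bin")
    if explicit:
        return explicit
    return posixpath.join(config["dockbase"], "docking", "DOCK", "bin", "dock64")

def check_sphere_limits(config):
    """Validate the limits stated in the config comments: 4 <= min <= max <= 100."""
    lo, hi = config["min_sph"], config["max_sph"]
    if not (4 <= lo <= hi <= 100):
        raise ValueError(f"invalid sphere limits: min_sph={lo}, max_sph={hi}")
    return lo, hi

# Config values copied from the example above, without dock64_bin
config = {"dockbase": "/path/to/DOCK", "min_sph": 4, "max_sph": 10}

dock64 = resolve_dock64(config)
limits = check_sphere_limits(config)
print(dock64, limits)
```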

Running

You can either

enter a screen session so that your run is not interrupted if your SSH session disconnects, or
run Juggler through a queuing system. See the example files for SGE and SLURM below.

In both cases you need to launch Juggler and the docking daemon at the same time.

In a screen

source /path/to/python/env
# rundockd should run in the background to manage docking jobs
sh rundockd.sh > rundockd.log 2>&1 &
python juggler.py > juggler.log 2>&1
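
The same launch sequence can also be written in Python, which avoids shell redirection syntax altogether. This is a minimal sketch, assuming rundockd.sh and juggler.py are in the current directory; the launch and main helpers are hypothetical and are not part of Juggler.

```python
import subprocess
import sys

def launch(cmd, log_path):
    """Start `cmd` with stdout and stderr both written to one log file."""
    with open(log_path, "w") as log:
        # The child process keeps its own copy of the log file descriptor
        return subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)

def main():
    # The docking daemon runs in the background for the whole run
    launch(["sh", "rundockd.sh"], "rundockd.log")
    # Juggler runs in the foreground; wait until it finishes
    juggler = launch([sys.executable, "juggler.py"], "juggler.log")
    return juggler.wait()
```

Calling main() starts both processes. The daemon is left running afterwards, as it is with the shell commands above.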

Via a queue

SGE (Wynton)

#! /bin/bash
#$ -cwd
#$ -q long.q
#$ -o stdout_juggler
#$ -e stdout_juggler
#$ -l s_rt=72:58:00
#$ -l h_rt=73:00:00
#$ -l mem_free=10G
#$ -pe smp 2
source /wynton/group/bks/soft/python_envs/env.sh
sh /wynton/group/bks/work/ak87/UCSF/JUGGLER/SCRIPTS/RELEASE/rundockd.sh > rundockd.log 2>&1 &
python /wynton/group/bks/work/ak87/UCSF/JUGGLER/SCRIPTS/RELEASE/juggler.py > juggler.log 2>&1

SLURM (Gimel)

#! /bin/bash
#$ -cwd
#$ -q long.q
#$ -o stdout_juggler
#$ -e stdout_juggler
#$ -l s_rt=23:58:00
#$ -l h_rt=24:00:00
#$ -l mem_free=10G
source /nfs/soft/ian/python3.8.5.sh
sh /mnt/nfs/exa/work/ak87/UCSF/JUGGLER/SCRIPTS/RELEASE/rundockd.sh > rundockd.log 2>&1 &
python /mnt/nfs/exa/work/ak87/UCSF/JUGGLER/SCRIPTS/RELEASE/orbebb.py > juggler.log 2>&1

UCSF clusters

The scripts and example config file are in

Wynton

/wynton/group/bks/work/ak87/UCSF/JUGGLER/SCRIPTS/RELEASE

Gimel

/mnt/nfs/exa/work/ak87/UCSF/JUGGLER/SCRIPTS/RELEASE

Set queue_type to sge on Wynton and to slurm on Gimel (newer machines, like gimel5/gimel2/n-1-XXX...).

Processing results

At the end of a run you will get a message that convergence was reached. The directory best_set contains the dockfiles and docking results for the best matching sphere set found. This directory is updated at each step, so if the run fails or convergence is not reached, you can still access the best set found so far.

Other files are

stepwise_opt_best_sets.dat lists the IDs and the nlogAUC values for the best set in each stepwise optimization round
stepwise_opt_metrics.dat lists the IDs, nlogAUC, RMSD, and average score of the top 1% of ligands for all sets tested during the stepwise optimization
juggler.log contains all the data for the run.
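
A short script can pick the best round out of stepwise_opt_best_sets.dat. The sketch below assumes a layout that this page does not specify: one round per line, whitespace-separated, with the set ID in the first field and the nlogAUC in the last. Check the actual file before relying on it. The function name and the sample lines are hypothetical.

```python
def best_round(lines):
    """Return (set_id, nlogAUC) for the line with the highest nlogAUC.

    Assumes the set ID is the first field and nlogAUC the last one;
    lines that do not fit (headers, comments, blanks) are skipped.
    """
    best = None
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        try:
            value = float(fields[-1])
        except ValueError:
            continue  # probably a header line
        if best is None or value > best[1]:
            best = (fields[0], value)
    return best

# Hypothetical file content; with a real file use: open("stepwise_opt_best_sets.dat")
sample = ["# id nlogAUC", "set_1 20.5", "set_7 31.2", "set_9 28.0"]
best = best_round(sample)
print(best)
```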
