Zack useful snippets

From DISI

These are small tidbits of info collected over time that are too unpolished for their own page, but useful enough for someone at some point, maybe. They may not be fully reliable.

DB2 to MOL2

Use the following Python 2 script:

/nfs/home/zack/software/zack_bks_scripts/db2_to_mol2/db2_to_mol2.py

MOL2 to DB2

The Docker image lives on epyc:

(pydock_env) [zack@epyc building_single_mol2]$ docker images | grep building
building_single_mol2          latest            cdadaf7a2ffc   6 minutes ago   3.94GB

The script to use is at:

~/software/zack_bks_scripts/build_single_pose.sh
#!/bin/bash
# Usage: build_single_pose.sh <input.mol2> <output_dir> [pose_name]
#
# Builds a single-conformer DB2 from a mol2 pose.
# Output: <output_dir>/bundle.db2.tgz  (contains <pose_name>.<charge>.db2)

set -euo pipefail

if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
    echo "Usage: $0 <input.mol2> <output_dir> [pose_name]" >&2
    exit 1
fi

INPUT_MOL2=$(realpath "$1")
OUTPUT_DIR=$(realpath -m "$2")   # -m: the path need not exist yet; it is created below
POSE_NAME="${3:-}"   # optional; derived from mol2 name if blank

if [ ! -f "$INPUT_MOL2" ]; then
    echo "Error: input mol2 not found: $INPUT_MOL2" >&2
    exit 1
fi

mkdir -p "$OUTPUT_DIR"

# Stage input into a work dir the container will see as /data
WORKDIR=$(mktemp -d /tmp/pose_work_XXXXXX)
chmod 777 "$WORKDIR"
trap 'rm -rf "$WORKDIR"' EXIT   # single quotes: expand (and quote) the path at exit time

cp "$INPUT_MOL2" "$WORKDIR/input.mol2"

# Scratch dir for the container's /tmp
SCRATCH=$(mktemp -d /tmp/pose_scratch_XXXXXX)
chmod 777 "$SCRATCH"
trap 'rm -rf "$WORKDIR" "$SCRATCH"' EXIT   # replaces the earlier trap; cleans up both dirs

docker run --rm \
    -u "$(id -u):$(id -g)" \
    -v "$WORKDIR:/data" \
    -v "$SCRATCH:/tmp" \
    -e POSE_NAME="$POSE_NAME" \
    building_single_mol2 \
    bash -c "source /pyenv/bin/activate && cd /tmp && python /build_single_pose.py"

# Move the result to the requested output dir
if [ ! -f "$WORKDIR/bundle.db2.tgz" ]; then
    echo "Error: build failed — no bundle.db2.tgz produced" >&2
    exit 1
fi

mv "$WORKDIR/bundle.db2.tgz" "$OUTPUT_DIR/bundle.db2.tgz"
echo "Done: $OUTPUT_DIR/bundle.db2.tgz"
tar -tzf "$OUTPUT_DIR/bundle.db2.tgz"

Example usage:

# auto-derive pose name from the input mol2
./build_single_pose.sh xtal-lig_4mer_round2.mol2 /path/to/output

# or override the pose name explicitly
./build_single_pose.sh xtal-lig_4mer_round2.mol2 /path/to/output 4mer_WT
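
To build several poses in one go, the script can be wrapped in a loop. This is a minimal sketch, not part of the original scripts: build_all_poses and the BUILD_SCRIPT variable are illustrative names, and it assumes one pose per mol2 file and that using the file name (without .mol2) as the pose name is acceptable. Each pose gets its own output directory because every run writes a file called bundle.db2.tgz.

```bash
#!/bin/bash
# Hypothetical helper: run build_single_pose.sh on every *.mol2 in a directory.
# BUILD_SCRIPT defaults to the path given above; override it if yours differs.
BUILD_SCRIPT="${BUILD_SCRIPT:-$HOME/software/zack_bks_scripts/build_single_pose.sh}"

build_all_poses() {
    local in_dir="$1" out_root="$2" mol2 name failed=0
    for mol2 in "$in_dir"/*.mol2; do
        [ -e "$mol2" ] || continue        # no matches: the glob stays literal
        name=$(basename "$mol2" .mol2)
        # One output dir per pose, since each run produces bundle.db2.tgz
        if ! bash "$BUILD_SCRIPT" "$mol2" "$out_root/$name" "$name"; then
            echo "FAILED: $mol2" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]                   # non-zero status if any build failed
}
```

Usage would be along the lines of: build_all_poses ./poses /path/to/output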

Make sure the final bundle is in the format:

bundle.db2.tgz/{LIG_NAME}.dock.db2
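
The layout can be checked without unpacking the bundle. The following is a rough sketch under assumptions: check_bundle is an illustrative name, and the suffix to enforce is passed as an argument (it defaults to .db2; pass .dock.db2 to enforce the naming shown above).

```bash
#!/bin/bash
# Hypothetical check: every file in the bundle must end with the given suffix,
# and there must be at least one such file.
check_bundle() {
    local bundle="$1" suffix="${2:-.db2}" members member n=0 bad=0
    members=$(tar -tzf "$bundle") || return 1
    while IFS= read -r member; do
        case "$member" in
            ''|*/) ;;                          # skip blank lines and directories
            *"$suffix") n=$((n + 1)) ;;
            *) echo "unexpected member: $member" >&2; bad=1 ;;
        esac
    done <<EOF
$members
EOF
    [ "$n" -gt 0 ] && [ "$bad" -eq 0 ]
}
```

For example: check_bundle /path/to/output/bundle.db2.tgz .dock.db2 && echo "bundle looks OK"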

Alternative method

This approach runs on n-1-21 but has not been fully verified:

# on n-1-21
conda deactivate
source /nfs/soft/dock/versions/dock37/DOCK-3.7.5.0/env.sh
export DOCKBASE="/nfs/soft/dock/versions/dock38/DOCK/ucsfdock"
$DOCKBASE/ligand/generate/build_ligand.sh xtal-lig_4mer_round2.mol2 \
  --name="4MER" \
  --smiles="NC(=[NH2+])NCCC[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)CO)C(=O)NCCC(N)=O"

# Last step will break. Exit out of n-1-21.
conda deactivate
export DOCKBASE="/nfs/soft/dock/versions/dock38/DOCK/ucsfdock"
source /nfs/soft/dock/versions/dock385/env.sh

# Go back to the dir and re-run:
$DOCKBASE/ligand/generate/build_ligand.sh xtal-lig_4mer_round2.mol2 \
  --name="4MER" \
  --smiles="NC(=[NH2+])NCCC[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)CO)C(=O)NCCC(N)=O"

# This run should error out earlier, but it still packages the results into db2

Again, make sure the final bundle is in the format:

bundle.db2.tgz/{LIG_NAME}.dock.db2

Scoring a Single MOL2

This procedure is based on the "Rescoring with DOCK 3.7" wiki page, specifically the rescoring.tar.gz archive it provides.

Run the following steps in order on n-1-21:

# Set up environment
conda deactivate
source /nfs/soft/dock/versions/dock385/env.sh
export DOCKBASE="/nfs/soft/dock/versions/dock38/DOCK/ucsfdock"
source /nfs/soft/dock/versions/dock37/DOCK-3.7.5.0/env.sh

cp -r /nfs/home/zack/software/zack_bks_scripts/pose_build_and_rescore_dock3/ ./
python convert_anyMol2_to_dockMol2.py {YOUR_MOL2} {DOCKABLE_MOL2} {LIG_NAME}
csh 1.run.rescore_prep.csh {DOCKABLE_MOL2}

# There should now be: poses.mol2.gz, amsol.txt.gz, vdw.txt.gz
# The dock binary included is for convenience only
# INDOCK has search_type and other fields that point to the built pose
./dock64 INDOCK > OUTDOCK
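
Before launching dock64 it can help to confirm that the prep step actually produced its files. A small sketch, assuming the three outputs named in the comments above plus INDOCK all sit in the working directory; check_rescore_inputs is an illustrative name, not part of the rescoring scripts.

```bash
#!/bin/bash
# Hypothetical pre-flight check: each expected input must exist and be non-empty.
check_rescore_inputs() {
    local dir="${1:-.}" f missing=0
    for f in poses.mol2.gz amsol.txt.gz vdw.txt.gz INDOCK; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

For example: check_rescore_inputs . && ./dock64 INDOCK > OUTDOCK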

Environment for AMSOL

conda deactivate
source /nfs/soft/dock/versions/dock385/env.sh
export DOCKBASE="/nfs/soft/dock/versions/dock38/DOCK/ucsfdock"
source /nfs/soft/dock/versions/dock37/DOCK-3.7.5.0/env.sh

Generating More Conformers in Ligand Building

A new Docker image, building_configurable_omega, was created to expose OMEGA tuning parameters as environment variables. The image exists only on epyc. Its build log also reports the number of conformers saved into the db2.

Example SLURM script:

#!/bin/bash
#SBATCH --output=active_building/logs/slurm-%A_%a.out
#SBATCH --array=1-2
#SBATCH --time=01:00:00
#SBATCH --mem=2500M
#SBATCH --nodelist=epyc

TMPDIR=$(mktemp -d /scratch/job_${SLURM_JOB_ID}_${SLURM_ARRAY_TASK_ID}_XXXXXX)
trap "rm -rf $TMPDIR" EXIT
export INDIR="/mnt/nfs/exk/work/zack/npsr1/UCSF_DOCK/setup_prep/round2/decoys/built_decoys/active_building/${SLURM_ARRAY_TASK_ID}"
newgrp docker << EOF
docker run --rm -u $(id -u):$(id -g) -v ${INDIR}:/data -v ${TMPDIR}:/tmp \
-e OMEGA_MAX_CONFS=2000 \
-e OMEGA_MAX_SEARCH_TIME=3600.0 \
building_configurable_omega bash /dock/ligand/submit/build-docker.sh
EOF

Available OMEGA tuning knobs:

Variable                      Default     Notes
OMEGA_MAX_CONFS               600         Set to 0 for rotor-dependent auto-scaling
OMEGA_ENERGY_WINDOW           12          kcal/mol; tighter = fewer confs
OMEGA_RMSD                    0.5         Å; larger = fewer confs (more pruning)
OMEGA_TORLIB                  GubaV21     Or "Original"
OMEGA_FF                      MMFF94Smod  Force field
OMEGA_HARD_CODED_TOR_PATTERN  1           Amide/guanidinium planarity constraints
OMEGA_MAX_SEARCH_TIME         120.0       Previously hardcoded; now configurable
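
Rather than hardcoding -e lines in the SLURM script, the flags can be generated from whichever knobs are set in the calling shell, so unset ones fall back to the image defaults. A minimal sketch: omega_env_flags is an illustrative name, and it assumes the values contain no whitespace (true for all the defaults listed above).

```bash
#!/bin/bash
# Hypothetical helper: print "-e NAME=VALUE" for each OMEGA_* knob that is set
# and non-empty in the current shell; unset knobs are left out entirely.
omega_env_flags() {
    local var val flags=""
    for var in OMEGA_MAX_CONFS OMEGA_ENERGY_WINDOW OMEGA_RMSD OMEGA_TORLIB \
               OMEGA_FF OMEGA_HARD_CODED_TOR_PATTERN OMEGA_MAX_SEARCH_TIME; do
        eval "val=\${$var:-}"        # value of the variable named by $var
        if [ -n "$val" ]; then
            flags="$flags -e $var=$val"
        fi
    done
    echo "${flags# }"                # drop the leading space
}
```

In the docker run line the output is used unquoted so that it splits into separate arguments, e.g. docker run --rm $(omega_env_flags) building_configurable_omega ...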


Useful DOCK3.8 binaries

On gimel / bks cluster:

(base) -sh-4.2$ pwd
/nfs/home/zack/software/dock_binaries
(base) -sh-4.2$ ls
dock64_light_mol2  dock64_no_maxnode_limit_divya  dock64_verbose_no_viable_poses  dock64_verbose_no_viable_poses_and_safe_db2_load_fail