Tutorial on running Molecular Dynamics for GIST grid generation

From DISI
Revision as of 16:34, 23 January 2017 by TBalius (talk | contribs)
Jump to navigation Jump to search

Tutorial written by Trent Balius (Jan. 9, 2017).

Disclaimer

This is foremost for training Shoichet lab members. But we hope that the community finds this useful.

To use this tutorial you will need the following:

  • AMBER12 or higher.
  • AMBERTOOLS.
  • GPUs to run AMBER on. In our case we uses an SGE queuing system to manage these jobs.
  • It is helpful if you have DOCK3.7 the latest version.

Introduction

Grid inhomogeneous Solvation Theory (GIST) is a method for calculating thermodynamic properties of water on GIST lattice.

Here, we will use these grids for DOCKing (although, there are many other uses).

Set up environment

you can add the following to your environment file (.cshrc) for just issue them on the command line.

source python with bio_python package.

source /nfs/soft/python/envs/complete/latest/env.csh

set DOCKBASE

setenv DOCKBASE "/nfs/home/tbalius/zzz.github/DOCK"

set AMBERHOME

 setenv AMBERHOME /nfs/soft/amber/amber14

Put msms in your path

set path = ( /nfs/home/tbalius/zzz.programs/msms $path )

Set up directories

cd /nfs/work/yourusername/
mkdir ambergisttutorial
cd ambergisttutorial 

Prepare for AMBER

  • download 4NVE form the PDB brake it into receptor and ligand pieces.
  • download pdb 4NVA form the PDB align it to 4NVE and out put waters that are close to the ligand.
  • here is one way to do it:

(1) Break up the protein from the ligand. run be_blasti.py

 $DOCKBASE/proteins/pdb_breaker/be_blasti.py --pdbcode 4NVE nocarbohydrate original_numbers | tee -a pdbinfo_using_biopython.log
 $DOCKBASE/proteins/pdb_breaker/be_blasti.py --pdbcode 4NVA nocarbohydrate original_numbers | tee -a pdbinfo_using_biopython.log
 cd 4NVE 
 sed -e "s/HETATM/ATOM  /g" lig.pdb > xtal-lig.pdb
 cd ../
 cd 4NVA
 grep HOH $mountdir/workingdir/4NVA/4NVA_A.pdb > $workdir/water.pdb
 cd ../

(2) Next align the two receptors (4NVE to 4NVA) and find those waters close to the ligand.

 mkdir aligned
 cd aligned

create file named "chimera.com" and write the following into the file:

 # Gist template #0
 open ../4NVA/rec.pdb
 # 4NVE rec
 open ../4NVE/rec.pdb
 # 4NVE lig
 open ../4NVE/lig.pdb
 # 4NVA waters
 open ../4NVA/water.pdb
 mmaker #0 #1
 matrixcopy #1 #2
 sel #3:HOH & #2 z<8
 del #3 & ~sel
 write format pdb  0 template.pdb
 write format pdb  1 aligned.rec.pdb
 write format pdb  2 aligned.lig.pdb
 write format pdb  3 close.waters.pdb
 /nfs/soft/chimera/current/bin/chimera --nogui chimera.com > & chimera.com.out
 cd ../

(3) Now lets run tleap.

(3.1) run leap to renumber residues.

mkdir tleap
cd tleap
cp ../aligned/template.pdb rec.pdb
cp ../aligned/close.waters.pdb wat.pdb

produces tleap input file named tleap.rec.1.in

set default PBradii mbondi2
# load the protein force field
source leaprc.ff14SB
loadamberparams frcmod_teb_mf_mod.hemall
loadamberprep heme_all_plus1.in
REC = loadpdb rec.pdb
# draw bond between HIS and HEME
# 5 coordenated iron.
bond REC.175.NE2 REC.293.FE
saveamberparm REC rec.1.leap.prm7 rec.1.leap.rst7
quit
$AMBERHOME/bin/tleap -s -f tleap.rec.1.in > ! tleap.rec.1.out
$AMBERHOME/bin/ambpdb -p rec.1.leap.prm7 < rec.1.leap.rst7 >! rec.1.leap.pdb

3.2 requires script 0000 to be run first and align/ dir to be present

reduce -HIS -FLIPs rec.1.leap.pdb >! rec.nowat.H_mod.pdb
sed -i 's/HETATM/ATOM  /g' rec.nowat.H_mod.pdb
grep "^ATOM  " rec.nowat.H_mod.pdb | sed -e 's/   new//g' | sed 's/   flip//g' | grep -v "OXT" | grep -v " 0......HEM" >! rec.nowat.H_mod1.pdb
curl docking.org/~tbalius/code/waterpaper2017/scripts/replace_his_with_hie_hid_hip.py > replace_his_with_hie_hid_hip.py
curl docking.org/~tbalius/code/waterpaper2017/scripts/replace_cys_to_cyx.py > replace_cys_to_cyx.py
curl docking.org/~tbalius/code/waterpaper2017/scripts/add.ters.py > add.ters.py
python replace_his_with_hie_hid_hip.py rec.nowat.H_mod1.pdb rec.nowat.H_mod2.pdb

The following script is useful if your protein has disulfides:

python replace_cys_to_cyx.py rec.nowat.H_mod2.pdb rec.nowat.H_mod3.pdb

output may be interted into the tleap input below to draw bonds. But this is not need here.

python add.ters.py rec.nowat.H_mod3.pdb rec.nowat.H_mod4.pdb
mv rec.nowat.H_mod4.pdb rec.nowat.H_final.pdb


(3.3) produces tleap input file (past on line at a time into your terminal, be sure not to have a space before the bottom EOF.

download HEME parameters:

curl docking.org/~tbalius/code/waterpaper2017/parms/frcmod_teb_mf_mod.hemall > frcmod_teb_mf_mod.hemall
curl docking.org/~tbalius/code/waterpaper2017/parms/heme_all_plus1.in > heme_all_plus1.in
cat << EOF >! tleap.rec.in
set default PBradii mbondi2
# load the protein force field
source leaprc.ff14SB
loadamberparams frcmod_teb_mf_mod.hemall
loadamberprep heme_all_plus1.in
REC = loadpdb rec.nowat.H_final.pdb
# draw bond between HIS and HEME
# 5 coordenated iron.
bond REC.175.NE2 REC.293.FE
XTALWAT = loadpdb wat.pdb
RECWAT  = combine {REC XTALWAT}
saveamberparm RECWAT rec.leap.prm7 rec.leap.rst7
solvateBox RECWAT TIP3PBOX 10.0
saveamberparm RECWAT rec.watbox.leap.prm7 rec.watbox.leap.rst7
quit
EOF


$AMBERHOME/bin/tleap -s -f tleap.rec.in > ! tleap.rec.out

Run AMBER

(1) Equilibrate

setenv LD_LIBRARY_PATH "/usr/local/cuda-6.0/lib64/:$LD_LIBRARY_PATH"

parmeters to add if writting in NetCDF

  • ioutfm=1, write out NetCDF trajectory
  • ntxo = 2, write out NetCDF restart
  • ntrx = 2, read in NetCDF restart
  • VMD does not read it in.
  • half the size
cat << EOF1 > ! 01mi.in
01mi.in: equil minimization with Cartesian restraints
&cntrl
   imin=1, maxcyc=3000, ncyc = 1500,
   ntpr=100,
   ntr=1,
   restraint_wt=100.0,
   restraintmask= ':1-300 & !@H'
/
EOF1
cat << EOF1 > ! 02mi.in
02mi.in: equil minimization with Cartesian restraints
&cntrl
   imin=1, maxcyc=3000, ncyc = 1500,
   ntpr=100,
   ntr=1,
   restraint_wt=5.0,
   restraintmask= ':1-292 & !@H'
/
EOF1


cat << EOF1 > ! 01md.in
01md.in: equilibration, constant volume, constant temp at 50.00 run for 20ps (10000 steps).
&cntrl
   imin=0, irest=0, ntx=1, nstlim = 10000,
   ntt=3, temp0=50.0, gamma_ln=2.0, ig = 1234,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1
cat << EOF1 > ! 02md.in
02md.in: equilibration, constant volume, constant temp at 100.00 run for 20ps (10000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 10000,
   ntt=3, temp0=100.0, gamma_ln=2.0, ig = 2345,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1
cat << EOF1 > ! 03md.in
03md.in: equilibration, constant volume, constant temp at 150.00 run for 20ps (10000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 10000,
   ntt=3, temp0=150.0, gamma_ln=2.0, ig = 3456,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1
cat << EOF1 > ! 04md.in
04md.in: equilibration, constant volume, constant temp at 200.00 run for 20ps (10000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 10000,
   ntt=3, temp0=200.0, gamma_ln=2.0, ig = 4567,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1
cat << EOF1 > ! 05md.in
05md.in: equilibration, constant volume, constant temp at 250.00 run for 20ps (10000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 10000,
   ntt=3, temp0=250.0, gamma_ln=2.0, ig = 5678,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1
cat << EOF1 > ! 06md.in
06md.in: equilibration, constant volume, constant temp at 298.15 run for 20ps (10000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 10000,
   ntt=3, temp0=298.15, gamma_ln=2.0, ig = 6789,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1
cat << EOF1 > ! 07md.in
07md.in: equilibration, constant presure, constant temp at 298.15 run for 5ns (2500000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 2500000,
   ntt=3, temp0=298.15, gamma_ln=2.0, ig = 12345,
   ntp=1, taup=2.0,
   ntb=2, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1
cat << EOF1 > ! 08md.in
08md.in: equilibration, constant volume, constant temp at 298.15 run for 5ns (2500000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 2500000,
   ntt=3, temp0=298.15, gamma_ln=2.0, ig = 23456,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1

Make the file qsub.amber.equil.csh using vim (or your favorite text editor).

#$ -S /bin/csh
#$ -cwd
#$ -q gpu.q
#$ -o stdout
#$ -e stderr

# on-one-gpu is a used in the Shoichet lab for managing our gpus and may be removed. 
# setenv CUDA_VISIBLE_DEVICES "0,1,2,3"
setenv AMBERHOME /nfs/soft/amber/amber14/
set amberexe = "/nfs/ge/bin/on-one-gpu - \$AMBERHOME/bin/pmemd.cuda"

set filedir = `pwd`

# make a local directory on the server to run calculations
set SCRATCH_DIR = /scratch
if ! (-d $SCRATCH_DIR ) then
    SCRATCH_DIR=/tmp
endif
set username = `whoami`

set TASK_DIR = "\$SCRATCH_DIR/\${username}/$JOB_ID"
echo $TASK_DIR

mkdir -p ${TASK_DIR}
cd ${TASK_DIR}
pwd

cp ${filedir}/rec.watbox.leap.* .
cp ${filedir}/*.in . 

 $amberexe -O -i 01mi.in -o 01mi.out -p rec.watbox.leap.prm7 \
 -c rec.watbox.leap.rst7 -ref rec.watbox.leap.rst7 \
 -x 01mi.mdcrd -inf 01mi.info -r 01mi.rst7

 $amberexe -O -i 02mi.in -o 02mi.out -p rec.watbox.leap.prm7 \
 -c 01mi.rst7 -ref 01mi.rst7 \
 -x 02mi.mdcrd -inf 02mi.info -r 02mi.rst7

 # we should consider runing a step of NPT here first to at low temp to ajust the box size.
 # then swiching to fixed volume to rase the tempature.
 
 $amberexe -O -i 01md.in -o 01md.out -p rec.watbox.leap.prm7 \
 -c 02mi.rst7 -ref 02mi.rst7 -x 01md.mdcrd -inf 01md.info -r 01md.rst7

 # gzip 01md.mdcrd

 $amberexe -O -i 02md.in -o 02md.out -p rec.watbox.leap.prm7 \
 -c 01md.rst7 -ref 01md.rst7 -x 02md.mdcrd -inf 02md.info -r 02md.rst7

 $amberexe -O -i 03md.in -o 03md.out -p rec.watbox.leap.prm7 \
 -c 02md.rst7 -ref 01md.rst7 -x 03md.mdcrd -inf 03md.info -r 03md.rst7

 $amberexe -O -i 04md.in -o 04md.out -p rec.watbox.leap.prm7 \
 -c 03md.rst7 -ref 01md.rst7 -x 04md.mdcrd -inf 04md.info -r 04md.rst7

 $amberexe -O -i 05md.in -o 05md.out -p rec.watbox.leap.prm7 \
 -c 04md.rst7 -ref 01md.rst7 -x 05md.mdcrd -inf 05md.info -r 05md.rst7

 $amberexe -O -i 06md.in -o 06md.out -p rec.watbox.leap.prm7 \
 -c 05md.rst7 -ref 01md.rst7 -x 06md.mdcrd -inf 06md.info -r 06md.rst7

 $amberexe -O -i 07md.in -o 07md.out -p rec.watbox.leap.prm7 \
 -c 06md.rst7 -ref 01md.rst7 -x 07md.mdcrd -inf 07md.info -r 07md.rst7

 $amberexe -O -i 08md.in -o 08md.out -p rec.watbox.leap.prm7 \
 -c 07md.rst7 -ref 07md.rst7 -x 08md.mdcrd -inf 08md.info -r 08md.rst7

mv $TASK_DIR $filedir


qsub qsub.amber.produc1.csh

Wait for simulations to finish and then check the results. Make sure that 07md.in finished (this is when the box adjusts.

If 07md.in run an additional step between 06md.in and 07md.in.

cat << EOF1 > ! 07md_before.in
07md.in: equilibration, constant presure, constant temp at 298.15 run for 20ps (10000 steps).
&cntrl
   imin=0, irest=1, ntx=5, nstlim = 10000,
   ntt=3, temp0=298.15, gamma_ln=2.0, ig = 12345,
   ntp=1, taup=2.0,
   ntb=2, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-292 & !@H', restraint_wt = 5.0,
/
EOF1

Modify the above scripted as follows.

 $amberexe -O -i 06md.in -o 06md.out -p rec.watbox.leap.prm7 \
 -c 05md.rst7 -ref 01md.rst7 -x 06md.mdcrd -inf 06md.info -r 06md.rst7

 $amberexe -O -i 07md_before.in -o 07md_before.out -p rec.watbox.leap.prm7 \
 -c 06md.rst7 -ref 01md.rst7 -x 07md_before.mdcrd -inf 07md_before.info -r 07md_before.rst7

 $amberexe -O -i 07md.in -o 07md.out -p rec.watbox.leap.prm7 \
 -c 07md_before.rst7 -ref 01md.rst7 -x 07md.mdcrd -inf 07md.info -r 07md.rst7

(2) Production

cat << EOF1 > ! 09md.in
09md.in: production. constant volume, constant temp at 298.15 run for 5ns (2500000 steps).
&cntrl
   ioutfm=1, # write out NetCDF trajectory
   ntwr=1000000000,
   imin=0, irest=1, ntx=5, nstlim = 2500000,
   ntt=3, temp0=298.15, gamma_ln=2.0, ig = 34567,
   ntp=0, taup=2.0,
   ntb=1, ntc=2, ntf=2,
   ntwx=500, ntpr=500, dt = 0.002,
   ntr = 1, iwrap=1,
   restraintmask = ':1-290 & !@H', restraint_wt = 5.0,
/
EOF1

Make the file qsub.amber.produc1.csh using vim (or your favorite text editor).

#$ -S /bin/csh
#$ -cwd
#$ -q gpu.q
#$ -o stdout
#$ -e stderr

# on-one-gpu is a used in the Shoichet lab for managing our gpus and may be removed. 
# setenv CUDA_VISIBLE_DEVICES "0,1,2,3"
setenv AMBERHOME /nfs/soft/amber/amber14/
set amberexe = "/nfs/ge/bin/on-one-gpu - \$AMBERHOME/bin/pmemd.cuda"

set filedir = `pwd`

# make a local directory on the server to run calculations
set SCRATCH_DIR = /scratch
if ! (-d $SCRATCH_DIR ) then
    SCRATCH_DIR=/tmp
endif
set username = `whoami`

set TASK_DIR = "\$SCRATCH_DIR/\${username}/$JOB_ID"
echo $TASK_DIR

mkdir -p ${TASK_DIR}
cd ${TASK_DIR}
pwd

cp ${filedir}/rec.watbox.leap.* .
cp ../09.in .
cp ../*/07md.rst7 . 

cat 09md.in | sed 's/09md/10md/g' | sed 's/34567/10101/g'   > 10md.in
cat 09md.in | sed 's/09md/11md/g' | sed 's/34567/111/g'     > 11md.in
cat 09md.in | sed 's/09md/12md/g' | sed 's/34567/2121212/g' > 12md.in
cat 09md.in | sed 's/09md/13md/g' | sed 's/34567/131313/g'  > 13md.in
cat 09md.in | sed 's/09md/14md/g' | sed 's/34567/4443/g'    > 14md.in

 # stat production
 
 $amberexe -O -i 09md.in -o 09md.out -p rec.watbox.leap.prm7 \
 -c 08md.rst7 -ref 07md.rst7 -x 09md.mdcrd -inf 09md.info -r 09md.rst7

 $amberexe -O -i 10md.in -o 10md.out -p rec.watbox.leap.prm7 \
 -c 09md.rst7 -ref 07md.rst7 -x 10md.mdcrd -inf 10md.info -r 10md.rst7

 $amberexe -O -i 11md.in -o 11md.out -p rec.watbox.leap.prm7 \
 -c 10md.rst7 -ref 07md.rst7 -x 11md.mdcrd -inf 11md.info -r 11md.rst7

 $amberexe -O -i 12md.in -o 12md.out -p rec.watbox.leap.prm7 \
 -c 11md.rst7 -ref 07md.rst7 -x 12md.mdcrd -inf 12md.info -r 12md.rst7

 $amberexe -O -i 13md.in -o 13md.out -p rec.watbox.leap.prm7 \
 -c 12md.rst7 -ref 07md.rst7 -x 13md.mdcrd -inf 13md.info -r 13md.rst7

 $amberexe -O -i 14md.in -o 14md.out -p rec.watbox.leap.prm7 \
 -c 13md.rst7 -ref 07md.rst7 -x 14md.mdcrd -inf 14md.info -r 14md.rst7

mv $TASK_DIR $filedir


qsub qsub.amber.produc1.csh

After the above script finishes running on the gpu queue. Make sure all is in order and extend the simulation.

Make the file qsub.amber.produc2.csh using vim (or your favorite text editor).

#$ -S /bin/csh
#$ -cwd
#$ -q gpu.q
#$ -o stdout
#$ -e stderr

# on-one-gpu is a used in the Shoichet lab for managing our gpus and may be removed. 
# setenv CUDA_VISIBLE_DEVICES "0,1,2,3"
setenv AMBERHOME /nfs/soft/amber/amber14/
set amberexe = "/nfs/ge/bin/on-one-gpu - \$AMBERHOME/bin/pmemd.cuda"

set filedir = `pwd`

# make a local directory on the server to run calculations
set SCRATCH_DIR = /scratch
if ! (-d $SCRATCH_DIR ) then
    SCRATCH_DIR=/tmp
endif
set username = `whoami`

set TASK_DIR = "\$SCRATCH_DIR/\${username}/$JOB_ID"
echo $TASK_DIR

mkdir -p ${TASK_DIR}
cd ${TASK_DIR}
pwd

cp ${filedir}/rec.watbox.leap.* .
cp ../09.in .
cp ../*/07md.rst7 . 

cat 09md.in | sed 's/09md/10md/g' | sed 's/34567/10101/g'   > 10md.in
cat 09md.in | sed 's/09md/11md/g' | sed 's/34567/111/g'     > 11md.in
cat 09md.in | sed 's/09md/12md/g' | sed 's/34567/2121212/g' > 12md.in
cat 09md.in | sed 's/09md/13md/g' | sed 's/34567/131313/g'  > 13md.in
cat 09md.in | sed 's/09md/14md/g' | sed 's/34567/4443/g'    > 14md.in

 # extend production
 
 \$amberexe -O -i 15md.in -o 15md.out -p rec.watbox.leap.prm7 \
 -c 14md.rst7 -ref 07md.rst7 -x 15md.mdcrd -inf 15md.info -r 15md.rst7

 \$amberexe -O -i 16md.in -o 16md.out -p rec.watbox.leap.prm7 \
 -c 15md.rst7 -ref 07md.rst7 -x 16md.mdcrd -inf 16md.info -r 16md.rst7

 \$amberexe -O -i 17md.in -o 17md.out -p rec.watbox.leap.prm7 \
 -c 16md.rst7 -ref 07md.rst7 -x 17md.mdcrd -inf 17md.info -r 17md.rst7

 \$amberexe -O -i 18md.in -o 18md.out -p rec.watbox.leap.prm7 \
 -c 17md.rst7 -ref 07md.rst7 -x 18md.mdcrd -inf 18md.info -r 18md.rst7

mv \$TASK_DIR $pwd 


(3) Restart

Some times the simulation will fail to run properly, so you may have to restart it.

Run GIST post processing

Combining GIST grids

Run convergence analysis