Membrane Modeling

From DISI
Revision as of 01:27, 21 April 2022 by Iamkaant (talk | contribs) (Added protocol for MD in Schrodinger)
Jump to navigation Jump to search

Written by Stefan Gahbauer, 2019/11/03

In order to account for ligand desolvation and electrostatic interactions in the low-dielectric environment of the hydrophobic membrane core, a lipid-bilayer is generated around the target receptor and included in the docking score grid generation. Aiming at a fast, robust and computationally effective equilibration of the lipid bilayer around the embedded transmembrane receptor, coarse-grained (CG) molecular dynamics (MD) simulations and (if needed) subsequent atomistic simulations are employed.


Required software and datasets

Gromacs (v5 or newer) - Molecular Dynamics software package (http://manual.gromacs.org/)

CHARMM36m force field (http://mackerell.umaryland.edu/charmm_ff.shtml)

MARTINI Coarse-grained force field parameters(http://cgmartini.nl/)

DSSP - Secondary Structure assignment (https://swift.cmbi.umcn.nl/gv/dssp/ , https://anaconda.org/salilab/dssp)

martinize.py - Coarse-graining atomistic protein structures (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

insane.py - INSerting proteins in coarse-grained MembrANE (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

initram.sh and backward.py - Conversion of coarse-grained system to atomistic resolution (http://cgmartini.nl/index.php/tools2/resolution-transformation)


1) Setting up the coarse-grained system

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0001-prepare-protein-CG-membrane.sh

1.1) Prepare your files

Copy the script above to your working directory.

Copy your rec.pdb to your working directory.

If your rec.pdb has gaps, e.g. unresolved loops between transmembrane helices in case of GPCRs, try to model missing residues.

One way is to use MODELLER following https://salilab.org/modeller/wiki/Missing%20residues.

Corresponding input scripts for modeller can be found in:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/modeller

1.2) Run the script

Login to gimel2.

./0001-prepare-protein-CG-membrane.sh

The script reads rec.pdb and copies all other required files from

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/gromacs

Workflow

a. Generate CHARMM36m force field parameters of your protein in a Gromacs-readable format.

Used tool: gmx pdb2gmx

Output files are stored in the generated pdb2gmx directory

-conf.gro / conf.pdb - Gromacs coordinate file

-topol.top / Protein-atomistic.itp - Gromacs topology file, i.e. force field description of your input structure

-posre.itp - Position restraints for heavy atoms of atomistic protein strucutre.


b. Build coarse-grained structure

Used tool: martinize.py

Output files are stored in the generated martini directory.

-chain_.ssd - Output from the DSSP program that is called by martinize.py

-prot-cg.pdb - Coarse-grained protein structure

-prot-cg.top - Coarse-grained Martini topology of system

-Protein.itp - Coarse-grained Martini description of Protein structure

-prot-rot.pdb - Coarse-grained protein structure aligned along z-axis of the simulation box according to the proteins first principal component axis. This ensures the correct placement of the protein during membrane preparation. You may have to adjust the orientation of your input structure prior to membrane modeling.


c. Build coarse-grained membrane

Used tool: insane.py

Here, a lipid bilayer will be created around the protein structure (in the x/y-plane) and water will be added to the system. The default box shape is rectangular and the size is set to x,y=10nm, z=11nm. This can be changed in the ./insane.py command line. The default lipid type is POPC, you can change that to arbitrary lipid compositions using the -l and -u options of of insane.py.

Output files are stored in the martini directory.

-out.top / topol-cg.top - Topology of coarse-grained system

-cg-membrane.gro/.pdb - Coarse-grained system coordinates. Carefully inspect and visualize the cg-membrane.pdb.

Use PyMOL to check if you're protein is embedded correctly in the lipid bilayer.


2) Simulating the coarse-grained system

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0002-run-CG-Minimization-and-MD.sh

2.1) Run the script

Copy the script above to your working directory.

./0002-run-CG-Minimization-and-MD.sh

Workflow

a. Minimize coarse-grained system

Used tools: gmx grompp , gmx mdrun

gmx grompp generates a single .tpr file that contains all information necessary for running a MD simulation or minimization using gmx mdrun.

Minimization parameters are provided in martini_new-rf_min.mdp. The system will be minimized in 500 steps using steepest descent. The protein structure will be frozen during minimization.

Output files are stored in the martini directory.

-min.tpr - MD run input file

-min.log - Output log file from minimization

-min.trr - Minimization trajectory

-min.gro - Minimized system coordinates


b. Simulate coarse-grained system

MD simulation parameters are provided in martini_v2.x_new-rf.mdp. Strong position restraints are applied on the protein structure during the simulation. The system will be simulated for 50ns.

Output files are stored in the martini directory.

-md.tpr - MD run input file

-md.log - Output log file from simulation

-md.trr - lossless trajectory of simulation

-md.xtc - coordinates of simulation trajectory

-md.gro - coordinates of final simulation snapshot

The simulation will run for roughly 3 hours.


3) Converting coarse-grained system to atomistic resolution and select lipid atoms for grid generation

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0003-backmap-and-lpd-selection.sh

3.1) Run the script

Copy the script above to your working directory.

./0003-backmap-and-lpd-selection.sh

Workflow

a. Backmapping from coarse-grained to atomistic

Used tool: gmx trjconv, initram.sh

gmx trjconv can perform a variety of conversions of MD trajectory, e.g. making molecules broken over the periodic boundary conditions whole again.

initram.sh calls the backward.py program which performs the backmapping of input coarse-grained to atomistic systems, and performs a small series of short minimizations and simulations to relax the backmapped system.

Output files are stored in the generated backmap directory.

-0-backmapped.gro / projected.gro - initial backmapped coordinates

-backmapped.top - Topology of atomistic system

-1-EM*/2-EM* - Output from minimizations

-3-mdpr*/4-mdpr*/5-mdpr*/6-mdpr* - Output from simulations

-backmapped.gro - Coordinates of final backmapped and relaxed system

This may need a few attempts to work all the way through. There is a while-loop that only stops until all relaxation steps have finished.


b. Replacing backmapped protein with initial atomistic protein structure

Used tool: PyMOL script align.pml

The pymol script will align the initial "Gromacs"-protein structure (conf.pdb) onto the backmapped structure and combine the fitted protein coordinates with the coordinates of the lipid and solvent environment.

Output files are stored in the generated backmap/prepare_AA_system directory.

-conf-fitted.pdb - Fitted initial protein structure

-backmapped-environment.pdb - All membrane and water coordinates

-fitted_system.pdb - Complete system containing fitted protein and environment coordinates

Be sure that fitted_system.pdb has the same number of coordinates as backmapped-mol.pdb. If there is a discrepancy there might be an issue with the PyMOL version you're using to run align.pml. Using PyMOL v2 or newer seems to avoid any issues. You can also generate backmapped-environment.pdb manually by taking all POPC and Water cooridnates from backmapped-mol.pdb.

c. Run minimizations of atomistic system

Used tools: gmx grompp, gmx mdrun

Output files are stored in the backmap/prepare_AA_system directory.

Tow minimizations will be calculated.

1) Minimization with frozen protein coordinates: 1,500 steps steepest descent (min_freeze.mdp).

-min_freeze* - Output files of first minimization

2) Minimization of full system: 500 steps (min.mdp).

-min* - Output files of second minimization


d. Select lipid atoms for DOCK grid generation

Used tools: PyMOL script prepare.pml

This will select carbon and hydrogen atoms of the hydrophobic lipid tail segments in a radius of 1.7 nm around the protein and assign them to the atom type "LPD"

You need to provide the rec.pdb to want to use for docking (potentially with missing loops) at that step as xtal-prot.pdb

Output files are stored in the generated backmap/prepare_AA_system/prepare_min directory.

-shell-LPD.pdb - all LPD atoms selected for grid generation

Add these coordinates to your docking protein structure and provide a amb.crg.oxt file adding

C lpd 0.000 LIPID SPHERE

Now you can run blastermaster.

Membrane modelling in Schrodinger

Written by Andrii Kyrylchuk, 2022/04/20

MD of protein and membrane

Import structure without ligand, use Preparation wizard as described in "Code for Controls..." to model missing loops and capping.

Then open System Builder, click Setup membrane.

Go to the website https://opm.phar.umich.edu/proteins/ and find your protein. Copy residue numbers from the bottom of the page to the field Transmembrane atoms.... The format is as follows:

res.num 76-97,112-136,141,...

Click Place Automatically, OK. Then click Run. Examine lipids and solvent after run completes.

Then use Molecular Dynamics menu to set up the calculation. Select prepared system, click Load on top of the menu. Put simulation time of 5 ns, Advanced Options -- Restraints -- Add. Select protein, and put Force Constant of 100, click Apply and OK. Then click down arrow left of the Run button in the parent window and click Write.

Copy the project folder (desmond_md_job_X) to gimel, login to gimel5, edit desmond_md_job_X.sh: delete -lic DESMOND_GPGPU:16 and insert -HOST gimel5.gpu (or gimel5.heavygpu). Run .sh file, and your task will be submitted to a queue. For my system it took 1.5 hr to complete.

Download the project folder to your PC, open Maestro, click Import structure and open -out.cms, click on T icon at the new entry in project table and click Display Trajectory Snapshots. Select the last one, click Display and check if the protein did not change position during MD run, then click Export, to Project Table, Frames Selected only. You will get a new entry in the Project Table. Export it to a .pdb file.

Preparation of the structure for Blastermaster

Use prepare.pml script. You need to provide the rec.pdb to want to use for docking (potentially with missing loops) at that step as xtal-prot.pdb. Rename MD system as last-mol.pdb.

pymol -qc last-mol.pdb xtal-prot.pdb prepare.pml

align last-mol, xtal-prot

remove ////SPC

create MEM, ////POPC

remove /MEM////HS
remove /MEM////HX
remove /MEM////HY
remove /MEM////H*B
remove /MEM////H*A
remove /MEM////H*C

remove /MEM////C1
remove /MEM////C2
remove /MEM////C3
remove /MEM////C11
remove /MEM////C12
remove /MEM////C13
remove /MEM////C14
remove /MEM////C15

create MEM_C, /MEM////*C* | /MEM////H*R | /MEM////H*S | /MEM////H*T | /MEM////H*X | /MEM////H*Y | /MEM////H*Z

create shell, MEM_C within 20 of xtal-prot

set retain_order, 1

save shell.pdb, shell

alter shell, resn='LPD'
alter shell, name='C'
alter shell, chain=''

rebuild

save shell-LPD.pdb, shell

Preparation of grids with thinspheres

Prepare grids with thinspheres for the protein without lipid as described in https://wiki.docking.org/index.php/How_to_do_parameter_scanning

Create an empty directory and put shell-LPD.pdb there. Then run the following script:

sh blast-membrane-thinsph-scan.sh {path to the collection of "es_ld_thin_sph_rad_X.X" directories} {path to the dir with original working and dockfiles directories}

This script runs qnifft and solvmap for each es_ld_thin_sph_rad_X.X directory, and then uses the second script of parameter scanning protocol to combine files into dockfiles directories.

Important! LOOK AT YOUR GRIDS! Desolvation -- larger solvation where water is. Electrostatics -- no electrostatics in the lipid region. vdW -- no vdW in the lipid region.

#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK with thinsphere scan
# PREREQ -- run first step of https://wiki.docking.org/index.php/How_to_do_parameter_scanning (new_0001_generate_ES_LD_generation.py )
# first argument -- path to the directory where dirs "es_ld_thin_sph_rad_X.X" are stored. 
# second argument -- path to the dir with original working and dockfiles
# run in a new directory with shell-LPD.pdb

run_once () {
# mkdir  add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
cp $curr_dir/shell-LPD.pdb .
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a  parameter scan, be sure to use the correct spheres!)
#cp -r $blastermaster_Prot/dockfiles .
#cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb  rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb  > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n  SPH  $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> receptor.crg.lowdielectric.pdb
need_files="amb.crg.oxt
qnifft.parm
vdw.siz"
for file in $need_files
do
	if [ -e $blastermaster_Prot/working/$file ]
	then
		cp $blastermaster_Prot/working/$file .
	else
		cp $blast_orig/working/$file .
	fi
done
echo "C     lpd        0.000     LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

#cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
#echo "Check if the grid size changed, compare this with INDOCK"
#python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
#head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids if they are present in the folder
if [ -e $blastermaster_Prot/working/heavy ]
then
	mkdir heavy
	cd heavy || exit
	cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
	cp ../box .
	cp $blastermaster_Prot/working/heavy/INSEV .
	$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
	#cp ligand.desolv.heavy ../dockfiles/.
	cd ../ || exit

	mkdir hydrogen
	cd hydrogen || exit
	cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
	cp ../box .
	cp $blastermaster_Prot/working/hydrogen/INSEV .
	$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
	#cp ligand.desolv.heavy ../dockfiles/.
fi
echo $dir " DONE!"
}

curr_dir=$(pwd)
workdirs=$1
blast_orig=$2
dirs=$(ls -d $workdirs/es_ld_thin_sph_rad_*)
for dir in $dirs
do
	blastermaster_Prot=$dir
	local_dir=$(echo $dir | awk -F"\/" '{print $NF}')
	mkdir -p $local_dir/working
	cd $local_dir/working || exit
	run_once
	cd $curr_dir || exit
done


python ~rstein/zzz.scripts/DOCK_prep_scripts/new_0002_combine_es_ld_grids_into_combos.py -p $blast_orig

Running blastermaster with default parameters

Warning! Do not use these grids, as the default grids with lipid spheres give incorrect solvation energies. Use the ones with thinspheres instead!

#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK 
# first argument -- path to blastermaster files of the protein without membrane
# run in a new directory with shell-LPD.pdb
blastermaster_Prot=$1
# mkdir  add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a  parameter scan, be sure to use the correct spheres!)
cp -r $blastermaster_Prot/dockfiles .
cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb  rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb  > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n  SPH  $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> rec.crg.lowdielectric.pdb
cp $blastermaster_Prot/working/amb.crg.oxt .
echo "C     lpd        0.000     LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
echo "Check if the grid size changed, compare this with INDOCK"
python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids
mkdir heavy
cd heavy || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/heavy/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
cd ../ || exit

mkdir hydrogen
cd hydroger || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/hydrogen/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
echo "DONE!"