Written by Stefan Gahbauer, 2019/11/03

In order to account for ligand desolvation and electrostatic interactions in the low-dielectric environment of the hydrophobic membrane core, a lipid-bilayer is generated around the target receptor and included in the docking score grid generation.
Aiming at a fast, robust and computationally effective equilibration of the lipid bilayer around the embedded transmembrane receptor, coarse-grained (CG) molecular dynamics (MD) simulations and (if needed) subsequent atomistic simulations are employed.

= Required software and datasets =

'''Gromacs''' (v5 or newer) - Molecular Dynamics software package (http://manual.gromacs.org/)

'''CHARMM36m force field''' (http://mackerell.umaryland.edu/charmm_ff.shtml)

'''MARTINI''' Coarse-grained force field parameters(http://cgmartini.nl/)

'''DSSP''' - Secondary Structure assignment (https://swift.cmbi.umcn.nl/gv/dssp/ , https://anaconda.org/salilab/dssp)

'''martinize.py''' - Coarse-graining atomistic protein structures (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

'''insane.py''' - INSerting proteins in coarse-grained MembrANE (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

'''initram.sh''' and '''backward.py''' - Conversion of coarse-grained system to atomistic resolution (http://cgmartini.nl/index.php/tools2/resolution-transformation)

= 1) Setting up the coarse-grained system =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0001-prepare-protein-CG-membrane.sh

== 1.1) Prepare your files ==

Copy the script above to your working directory.

Copy your rec.pdb to your working directory.

If your rec.pdb has gaps, e.g. unresolved loops between transmembrane helices in case of GPCRs, try to model missing residues.

One way is to use MODELLER following https://salilab.org/modeller/wiki/Missing%20residues.

Corresponding input scripts for modeller can be found in:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/modeller

== 1.2) Run the script ==

Login to gimel2.

./0001-prepare-protein-CG-membrane.sh

The script reads rec.pdb and copies all other required files from

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/gromacs

=== Workflow ===

'''a.''' Generate CHARMM36m force field parameters of your protein in a Gromacs-readable format.

Used tool: gmx pdb2gmx

Output files are stored in the generated ''pdb2gmx'' directory

-conf.gro / conf.pdb - Gromacs coordinate file

-topol.top / Protein-atomistic.itp - Gromacs topology file, i.e. force field description of your input structure

-posre.itp - Position restraints for heavy atoms of atomistic protein strucutre.

'''b.''' Build coarse-grained structure

Used tool: martinize.py

Output files are stored in the generated ''martini'' directory.

-chain_.ssd - Output from the DSSP program that is called by martinize.py

-prot-cg.pdb - Coarse-grained protein structure

-prot-cg.top - Coarse-grained Martini topology of system

-Protein.itp - Coarse-grained Martini description of Protein structure

-prot-rot.pdb - Coarse-grained protein structure aligned along z-axis of the simulation box according to the proteins first principal component axis. This ensures the correct placement of the protein during membrane preparation. '''You may have to adjust the orientation of your input structure prior to membrane modeling.'''

'''c.''' Build coarse-grained membrane

Used tool: insane.py

Here, a lipid bilayer will be created around the protein structure (in the x/y-plane) and water will be added to the system. The default box shape is rectangular and the size is set to x,y=10nm, z=11nm. This can be changed in the ./insane.py command line. The default lipid type is POPC, you can change that to arbitrary lipid compositions using the -l and -u options of of insane.py.

Output files are stored in the ''martini'' directory.

-out.top / topol-cg.top - Topology of coarse-grained system

-cg-membrane.gro/.pdb - Coarse-grained system coordinates. '''Carefully inspect and visualize the cg-membrane.pdb.'''

Use PyMOL to check if you're protein is embedded correctly in the lipid bilayer.

= 2) Simulating the coarse-grained system =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0002-run-CG-Minimization-and-MD.sh

== 2.1) Run the script ==

Copy the script above to your working directory.

./0002-run-CG-Minimization-and-MD.sh

=== Workflow ===

'''a.''' Minimize coarse-grained system

Used tools: gmx grompp , gmx mdrun

gmx grompp generates a single .tpr file that contains all information necessary for running a MD simulation or minimization using gmx mdrun.

Minimization parameters are provided in martini_new-rf_min.mdp. The system will be minimized in 500 steps using steepest descent. The protein structure will be frozen during minimization.

Output files are stored in the ''martini'' directory.

-min.tpr - MD run input file

-min.log - Output log file from minimization

-min.trr - Minimization trajectory

-min.gro - Minimized system coordinates

'''b.''' Simulate coarse-grained system

MD simulation parameters are provided in martini_v2.x_new-rf.mdp. Strong position restraints are applied on the protein structure during the simulation. The system will be simulated for 50ns.

Output files are stored in the ''martini'' directory.

-md.tpr - MD run input file

-md.log - Output log file from simulation

-md.trr - lossless trajectory of simulation

-md.xtc - coordinates of simulation trajectory

-md.gro - coordinates of final simulation snapshot

'''The simulation will run for roughly 3 hours.'''

= 3) Converting coarse-grained system to atomistic resolution and select lipid atoms for grid generation =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0003-backmap-and-lpd-selection.sh

== 3.1) Run the script ==

Copy the script above to your working directory.

./0003-backmap-and-lpd-selection.sh

=== Workflow ===

'''a.''' Backmapping from coarse-grained to atomistic

Used tool: gmx trjconv, initram.sh

gmx trjconv can perform a variety of conversions of MD trajectory, e.g. making molecules broken over the periodic boundary conditions whole again.

initram.sh calls the backward.py program which performs the backmapping of input coarse-grained to atomistic systems, and performs a small series of short minimizations and simulations to relax the backmapped system.

Output files are stored in the generated ''backmap'' directory.

-0-backmapped.gro / projected.gro - initial backmapped coordinates

-backmapped.top - Topology of atomistic system

-1-EM*/2-EM* - Output from minimizations

-3-mdpr*/4-mdpr*/5-mdpr*/6-mdpr* - Output from simulations

-backmapped.gro - Coordinates of final backmapped and relaxed system

'''This may need a few attempts to work all the way through. There is a while-loop that only stops until all relaxation steps have finished.'''

'''b.''' Replacing backmapped protein with initial atomistic protein structure

Used tool: PyMOL script align.pml

The pymol script will align the initial "Gromacs"-protein structure (conf.pdb) onto the backmapped structure and combine the fitted protein coordinates with the coordinates of the lipid and solvent environment.

Output files are stored in the generated ''backmap/prepare_AA_system'' directory.

-conf-fitted.pdb - Fitted initial protein structure

-backmapped-environment.pdb - All membrane and water coordinates

-fitted_system.pdb - Complete system containing fitted protein and environment coordinates

'''Be sure that fitted_system.pdb has the same number of coordinates as backmapped-mol.pdb.''' If there is a discrepancy there might be an issue with the PyMOL version you're using to run align.pml. Using PyMOL v2 or newer seems to avoid any issues. You can also generate backmapped-environment.pdb manually by taking all POPC and Water cooridnates from backmapped-mol.pdb.

'''c.''' Run minimizations of atomistic system

Used tools: gmx grompp, gmx mdrun

Output files are stored in the ''backmap/prepare_AA_system'' directory.

Tow minimizations will be calculated.

1) Minimization with frozen protein coordinates: 1,500 steps steepest descent (min_freeze.mdp).

-min_freeze* - Output files of first minimization

2) Minimization of full system: 500 steps (min.mdp).

-min* - Output files of second minimization

'''d.''' Select lipid atoms for DOCK grid generation

Used tools: PyMOL script prepare.pml

This will select carbon and hydrogen atoms of the hydrophobic lipid tail segments in a radius of 1.7 nm around the protein and assign them to the atom type "LPD"

You need to provide the rec.pdb to want to use for docking (potentially with missing loops) at that step as xtal-prot.pdb

Output files are stored in the generated ''backmap/prepare_AA_system/prepare_min'' directory.

-shell-LPD.pdb - all LPD atoms selected for grid generation

Add these coordinates to your docking protein structure and provide a amb.crg.oxt file adding

C lpd 0.000 LIPID SPHERE

Now you can run blastermaster.

= Membrane modelling in Schrodinger =
Written by Andrii Kyrylchuk, 2022/04/20

== MD of protein and membrane ==
Import structure '''without''' ligand, use Preparation wizard as described in "Code for Controls..." to model missing loops and capping.

Then open System Builder, click <code>Setup membrane</code>.

Go to the website https://opm.phar.umich.edu/proteins/ and find your protein. Copy residue numbers from the bottom of the page to the field <code>Transmembrane atoms...</code>. The format is as follows:

<code>res.num 76-97,112-136,141,...</code>

Click <code>Place Automatically</code>, <code>OK</code>. Then click <code>Run</code>. Examine lipids and solvent after run completes.

Then use Molecular Dynamics menu to set up the calculation. Select prepared system, click <code>Load</code> on top of the menu. Put simulation time of 5 ns, <code>Advanced Options</code> -- <code>Restraints</code> -- <code>Add.</code> Select protein, and put <code>Force Constant</code> of 100, click <code>Apply</code> and <code>OK.</code> Then click down arrow left of the <code>Run</code> button in the parent window and click <code>Write</code>.

'''''NOTE''': In my experience, restrained MD runs with NgPT in Schrodinger often fail with the following message:''

<code>''Allowed momentum exceeded on 17 particles.''</code>

''This seems to be related to the interference of restraints and the ensemble, as no such error is observed if no restraints are imposed, or if NVT ensemble is used. Still, NgPT ensemble is important for correct membrane sampling, because simulations at NVT long enough to permit membrane relaxation, lead to the smearing of the lipid bilayer and the formation of empty space.''

''If the mentioned problem occurs, import the last coordinates of the run (*-out.cms) into Maestro, open Minimization menu, load the structure from the workspace, and apply restraints on lipid and protein (force restraint of 10 is usually enough). Then run a minimization for 100 ps, and try to run NgPT MD starting from the optimized structure.''

'''''TO CHECK''': NPT ensemble.''

Copy the project folder (desmond_md_job_X) to gimel, login to gimel5, edit desmond_md_job_X.sh: delete <code>-lic DESMOND_GPGPU:16</code> and insert <code>-HOST gimel5.gpu</code> (or <code>gimel5.heavygpu</code>). Export <code>$SCHRODINGER</code> variable:

<code>SCHRODINGER=/nfs/soft2/schrodinger/2023-3/</code> (or a newer version)

Run .sh file, and your task will be submitted to a queue. For my system it took 1.5 hr to complete.

Download the project folder to your PC, open Maestro, click <code>Import structure</code> and open <code>-out.cms</code>, click on <code>T</code> icon at the new entry in project table and click <code>Display Trajectory Snapshots</code>. Select the last one, click <code>Display</code> and check if the protein did not change position during MD run, then click <code>Export</code>, to Project Table, Frames Selected only. You will get a new entry in the Project Table. Export it to a <code>.pdb</code> file.

== Preparation of the structure for Blastermaster ==
Use <code>prepare.pml</code> script. You need to provide the <code>rec.pdb</code> to want to use for docking (potentially with missing loops) at that step as <code>xtal-prot.pdb</code>. Rename MD system as <code>last-mol.pdb</code>.

<code>pymol -qc last-mol.pdb xtal-prot.pdb prepare.pml</code>

<syntaxhighlight lang="python">
align last-mol, xtal-prot

remove ////SPC

create MEM, ////POPC

# these atom numbers do not exist in POPC or DPPC
# therefore, we do not remove protons from the lipid
# structure to make more spheres. Uncomment these
# lines and change to proper H numbers if needed.
#remove /MEM////HS
#remove /MEM////HX
#remove /MEM////HY
#remove /MEM////H*B
#remove /MEM////H*A
#remove /MEM////H*C

remove /MEM////C1
remove /MEM////C2
remove /MEM////C3
remove /MEM////C11
remove /MEM////C12
remove /MEM////C13
remove /MEM////C14
remove /MEM////C15

create MEM_C, /MEM////*C* | /MEM////H*R | /MEM////H*S | /MEM////H*T | /MEM////H*X | /MEM////H*Y | /MEM////H*Z

create shell, MEM_C within 20 of xtal-prot

set retain_order, 1

save shell.pdb, shell

alter shell, resn='LPD'
alter shell, name='C'
alter shell, chain=''

rebuild

save shell-LPD.pdb, shell
</syntaxhighlight>

== Preparation of grids with thinspheres ==
Prepare grids with thinspheres for the protein without lipid as described in https://wiki.docking.org/index.php/How_to_do_parameter_scanning

Create an empty directory and put <code>shell-LPD.pdb</code> there. Then run the following script:

<code>sh blast-membrane-thinsph-scan.sh {path to the collection of "es_ld_thin_sph_rad_X.X" directories} {path to the dir with original working and dockfiles directories}</code>

This script runs qnifft and solvmap for each <code>es_ld_thin_sph_rad_X.X</code> directory, and then uses the second script of parameter scanning protocol to combine files into <code>dockfiles</code> directories.

'''Important! LOOK AT YOUR GRIDS! Desolvation -- larger solvation where water is. Electrostatics -- no electrostatics in the lipid region. vdW -- no vdW in the lipid region.'''

<syntaxhighlight lang="bash">
#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK with thinsphere scan
# PREREQ -- run first step of https://wiki.docking.org/index.php/How_to_do_parameter_scanning (new_0001_generate_ES_LD_generation.py )
# first argument -- path to the directory where dirs "es_ld_thin_sph_rad_X.X" are stored.
# second argument -- path to the dir with original working and dockfiles
# run in a new directory with shell-LPD.pdb

run_once () {
# mkdir add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
cp $curr_dir/shell-LPD.pdb .
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a parameter scan, be sure to use the correct spheres!)
#cp -r $blastermaster_Prot/dockfiles .
#cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n SPH $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> receptor.crg.lowdielectric.pdb
need_files="amb.crg.oxt
qnifft.parm
vdw.siz"
for file in $need_files
do
if [ -e $blastermaster_Prot/working/$file ]
then
cp $blastermaster_Prot/working/$file .
else
cp $blast_orig/working/$file .
fi
done
echo "C lpd 0.000 LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

#cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
#echo "Check if the grid size changed, compare this with INDOCK"
#python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
#head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids if they are present in the folder
if [ -e $blastermaster_Prot/working/heavy ]
then
mkdir heavy
cd heavy || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/heavy/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
#cp ligand.desolv.heavy ../dockfiles/.
cd ../ || exit

mkdir hydrogen
cd hydrogen || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/hydrogen/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
#cp ligand.desolv.heavy ../dockfiles/.
fi
echo $dir " DONE!"
}

curr_dir=$(pwd)
workdirs=$1
blast_orig=$2
dirs=$(ls -d $workdirs/es_ld_thin_sph_rad_*)
for dir in $dirs
do
blastermaster_Prot=$dir
local_dir=$(echo $dir | awk -F"\/" '{print $NF}')
mkdir -p $local_dir/working
cd $local_dir/working || exit
run_once
cd $curr_dir || exit
done

python ~rstein/zzz.scripts/DOCK_prep_scripts/new_0002_combine_es_ld_grids_into_combos.py -p $blast_orig

</syntaxhighlight>

== Running blastermaster with default parameters ==
'''Warning! Do not use these grids, as the default grids with lipid spheres give incorrect solvation energies. Use the ones with thinspheres instead!'''
<syntaxhighlight lang="bash">
#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK
# first argument -- path to blastermaster files of the protein without membrane
# run in a new directory with shell-LPD.pdb
blastermaster_Prot=$1
# mkdir add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a parameter scan, be sure to use the correct spheres!)
cp -r $blastermaster_Prot/dockfiles .
cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n SPH $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> rec.crg.lowdielectric.pdb
cp $blastermaster_Prot/working/amb.crg.oxt .
echo "C lpd 0.000 LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
echo "Check if the grid size changed, compare this with INDOCK"
python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids
mkdir heavy
cd heavy || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/heavy/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
cd ../ || exit

mkdir hydrogen
cd hydroger || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/hydrogen/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
echo "DONE!"
</syntaxhighlight>

== Lipid membrane models from MemProtMD ==
If a protein-membrane complex was already modeled for your system and deposited at [http://memprotmd.bioch.ox.ac.uk/home/ MemProtMD] website, you can use it and skip doing MD in Schrodinger. The steps are very similar to the ones after Schrodinger run.

Download the <code>*_default_dppc.mpmd.finalframe.atomistic.pdb</code> file from the bottom of the page. Rename it to <code>last-mol.pdb</code>.

Use prepare.pml script (below). You need to provide the <code>rec.pdb</code> to want to use for docking (potentially with missing loops) at that step as <code>xtal-prot.pdb</code>.

<code>pymol -qc last-mol.pdb xtal-prot.pdb prepare.pml</code>

The script below differs from the one for processing Schrodinger results in two points: solvent residue is <code>SOL</code> instead of <code>SPC</code>, and lipids are called <code>DPPC</code> instead of <code>POPC</code>.<syntaxhighlight lang="python">
align last-mol, xtal-prot

remove ////SOL

create MEM, ////DPPC

remove /MEM////HS
remove /MEM////HX
remove /MEM////HY
remove /MEM////H*B
remove /MEM////H*A
remove /MEM////H*C

remove /MEM////C1
remove /MEM////C2
remove /MEM////C3
remove /MEM////C11
remove /MEM////C12
remove /MEM////C13
remove /MEM////C14
remove /MEM////C15

create MEM_C, /MEM////*C* | /MEM////H*R | /MEM////H*S | /MEM////H*T | /MEM////H*X | /MEM////H*Y | /MEM////H*Z

create shell, MEM_C within 20 of xtal-prot

set retain_order, 1

save shell.pdb, shell

alter shell, resn='LPD'
alter shell, name='C'
alter shell, chain=''

rebuild

save shell-LPD.pdb, shell
</syntaxhighlight>

== Positioning of the membrane ==
If there is no precomputed membrane position from OPM or MemProtMD, you can model it using PPM 3.0 webserver (https://opm.phar.umich.edu/ppm_server3) or standalone software. The software is installed in <code>/nfs/home/ak87/exa/PROGRAM/OPM-MEMBRANE</code>. Copy <code>res.lib</code> file from the program directory to the directory with your .pdb file. Create input file like this:<syntaxhighlight lang="shell">
1
0 PMm out rec.pdb
</syntaxhighlight><blockquote>0 or 1 -“do not use” or “use” heteroatoms in the input PDB file, respectively (solvent molecules are always excluded).

MOM - type of membrane (see list of 3-letter codes for membranes below)

“in” or “out” means topology of N-terminus of first subunit included in the corresponding input PDB file

With this option, for every input pdb file, the program will selected automatically the flat or curved membrane boundaries, whichever had the lower calculated transfer energy.</blockquote>Then run the program:

<code>~ak87/exa/PROGRAM/OPM-MEMBRANE/immers<1membrane.inp>rec-opm.out</code>

Use <code>datasub1</code> file to extract residue numbers for the membrane placement in Maestro.
See ppm3_instructions.docx file in the program directory for more detail.

Membrane Modeling

2023-01-10T17:59:39Z

Iamkaant:

Written by Stefan Gahbauer, 2019/11/03

In order to account for ligand desolvation and electrostatic interactions in the low-dielectric environment of the hydrophobic membrane core, a lipid-bilayer is generated around the target receptor and included in the docking score grid generation.
Aiming at a fast, robust and computationally effective equilibration of the lipid bilayer around the embedded transmembrane receptor, coarse-grained (CG) molecular dynamics (MD) simulations and (if needed) subsequent atomistic simulations are employed.

= Required software and datasets =

'''Gromacs''' (v5 or newer) - Molecular Dynamics software package (http://manual.gromacs.org/)

'''CHARMM36m force field''' (http://mackerell.umaryland.edu/charmm_ff.shtml)

'''MARTINI''' Coarse-grained force field parameters(http://cgmartini.nl/)

'''DSSP''' - Secondary Structure assignment (https://swift.cmbi.umcn.nl/gv/dssp/ , https://anaconda.org/salilab/dssp)

'''martinize.py''' - Coarse-graining atomistic protein structures (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

'''insane.py''' - INSerting proteins in coarse-grained MembrANE (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

'''initram.sh''' and '''backward.py''' - Conversion of coarse-grained system to atomistic resolution (http://cgmartini.nl/index.php/tools2/resolution-transformation)

= 1) Setting up the coarse-grained system =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0001-prepare-protein-CG-membrane.sh

== 1.1) Prepare your files ==

Copy the script above to your working directory.

Copy your rec.pdb to your working directory.

If your rec.pdb has gaps, e.g. unresolved loops between transmembrane helices in case of GPCRs, try to model missing residues.

One way is to use MODELLER following https://salilab.org/modeller/wiki/Missing%20residues.

Corresponding input scripts for modeller can be found in:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/modeller

== 1.2) Run the script ==

Login to gimel2.

./0001-prepare-protein-CG-membrane.sh

The script reads rec.pdb and copies all other required files from

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/gromacs

=== Workflow ===

'''a.''' Generate CHARMM36m force field parameters of your protein in a Gromacs-readable format.

Used tool: gmx pdb2gmx

Output files are stored in the generated ''pdb2gmx'' directory

-conf.gro / conf.pdb - Gromacs coordinate file

-topol.top / Protein-atomistic.itp - Gromacs topology file, i.e. force field description of your input structure

-posre.itp - Position restraints for heavy atoms of atomistic protein strucutre.

'''b.''' Build coarse-grained structure

Used tool: martinize.py

Output files are stored in the generated ''martini'' directory.

-chain_.ssd - Output from the DSSP program that is called by martinize.py

-prot-cg.pdb - Coarse-grained protein structure

-prot-cg.top - Coarse-grained Martini topology of system

-Protein.itp - Coarse-grained Martini description of Protein structure

-prot-rot.pdb - Coarse-grained protein structure aligned along z-axis of the simulation box according to the proteins first principal component axis. This ensures the correct placement of the protein during membrane preparation. '''You may have to adjust the orientation of your input structure prior to membrane modeling.'''

'''c.''' Build coarse-grained membrane

Used tool: insane.py

Here, a lipid bilayer will be created around the protein structure (in the x/y-plane) and water will be added to the system. The default box shape is rectangular and the size is set to x,y=10nm, z=11nm. This can be changed in the ./insane.py command line. The default lipid type is POPC, you can change that to arbitrary lipid compositions using the -l and -u options of of insane.py.

Output files are stored in the ''martini'' directory.

-out.top / topol-cg.top - Topology of coarse-grained system

-cg-membrane.gro/.pdb - Coarse-grained system coordinates. '''Carefully inspect and visualize the cg-membrane.pdb.'''

Use PyMOL to check if you're protein is embedded correctly in the lipid bilayer.

= 2) Simulating the coarse-grained system =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0002-run-CG-Minimization-and-MD.sh

== 2.1) Run the script ==

Copy the script above to your working directory.

./0002-run-CG-Minimization-and-MD.sh

=== Workflow ===

'''a.''' Minimize coarse-grained system

Used tools: gmx grompp , gmx mdrun

gmx grompp generates a single .tpr file that contains all information necessary for running a MD simulation or minimization using gmx mdrun.

Minimization parameters are provided in martini_new-rf_min.mdp. The system will be minimized in 500 steps using steepest descent. The protein structure will be frozen during minimization.

Output files are stored in the ''martini'' directory.

-min.tpr - MD run input file

-min.log - Output log file from minimization

-min.trr - Minimization trajectory

-min.gro - Minimized system coordinates

'''b.''' Simulate coarse-grained system

MD simulation parameters are provided in martini_v2.x_new-rf.mdp. Strong position restraints are applied on the protein structure during the simulation. The system will be simulated for 50ns.

Output files are stored in the ''martini'' directory.

-md.tpr - MD run input file

-md.log - Output log file from simulation

-md.trr - lossless trajectory of simulation

-md.xtc - coordinates of simulation trajectory

-md.gro - coordinates of final simulation snapshot

'''The simulation will run for roughly 3 hours.'''

= 3) Converting coarse-grained system to atomistic resolution and select lipid atoms for grid generation =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0003-backmap-and-lpd-selection.sh

== 3.1) Run the script ==

Copy the script above to your working directory.

./0003-backmap-and-lpd-selection.sh

=== Workflow ===

'''a.''' Backmapping from coarse-grained to atomistic

Used tool: gmx trjconv, initram.sh

gmx trjconv can perform a variety of conversions of MD trajectory, e.g. making molecules broken over the periodic boundary conditions whole again.

initram.sh calls the backward.py program which performs the backmapping of input coarse-grained to atomistic systems, and performs a small series of short minimizations and simulations to relax the backmapped system.

Output files are stored in the generated ''backmap'' directory.

-0-backmapped.gro / projected.gro - initial backmapped coordinates

-backmapped.top - Topology of atomistic system

-1-EM*/2-EM* - Output from minimizations

-3-mdpr*/4-mdpr*/5-mdpr*/6-mdpr* - Output from simulations

-backmapped.gro - Coordinates of final backmapped and relaxed system

'''This may need a few attempts to work all the way through. There is a while-loop that only stops until all relaxation steps have finished.'''

'''b.''' Replacing backmapped protein with initial atomistic protein structure

Used tool: PyMOL script align.pml

The pymol script will align the initial "Gromacs"-protein structure (conf.pdb) onto the backmapped structure and combine the fitted protein coordinates with the coordinates of the lipid and solvent environment.

Output files are stored in the generated ''backmap/prepare_AA_system'' directory.

-conf-fitted.pdb - Fitted initial protein structure

-backmapped-environment.pdb - All membrane and water coordinates

-fitted_system.pdb - Complete system containing fitted protein and environment coordinates

'''Be sure that fitted_system.pdb has the same number of coordinates as backmapped-mol.pdb.''' If there is a discrepancy there might be an issue with the PyMOL version you're using to run align.pml. Using PyMOL v2 or newer seems to avoid any issues. You can also generate backmapped-environment.pdb manually by taking all POPC and Water cooridnates from backmapped-mol.pdb.

'''c.''' Run minimizations of atomistic system

Used tools: gmx grompp, gmx mdrun

Output files are stored in the ''backmap/prepare_AA_system'' directory.

Tow minimizations will be calculated.

1) Minimization with frozen protein coordinates: 1,500 steps steepest descent (min_freeze.mdp).

-min_freeze* - Output files of first minimization

2) Minimization of full system: 500 steps (min.mdp).

-min* - Output files of second minimization

'''d.''' Select lipid atoms for DOCK grid generation

Used tools: PyMOL script prepare.pml

This will select carbon and hydrogen atoms of the hydrophobic lipid tail segments in a radius of 1.7 nm around the protein and assign them to the atom type "LPD"

You need to provide the rec.pdb to want to use for docking (potentially with missing loops) at that step as xtal-prot.pdb

Output files are stored in the generated ''backmap/prepare_AA_system/prepare_min'' directory.

-shell-LPD.pdb - all LPD atoms selected for grid generation

Add these coordinates to your docking protein structure and provide a amb.crg.oxt file adding

C lpd 0.000 LIPID SPHERE

Now you can run blastermaster.

= Membrane modelling in Schrodinger =
Written by Andrii Kyrylchuk, 2022/04/20

== MD of protein and membrane ==
Import structure '''without''' ligand, use Preparation wizard as described in "Code for Controls..." to model missing loops and capping.

Then open System Builder, click <code>Setup membrane</code>.

Go to the website https://opm.phar.umich.edu/proteins/ and find your protein. Copy residue numbers from the bottom of the page to the field <code>Transmembrane atoms...</code>. The format is as follows:

<code>res.num 76-97,112-136,141,...</code>

Click <code>Place Automatically</code>, <code>OK</code>. Then click <code>Run</code>. Examine lipids and solvent after run completes.

Then use Molecular Dynamics menu to set up the calculation. Select prepared system, click <code>Load</code> on top of the menu. Put simulation time of 5 ns, <code>Advanced Options</code> -- <code>Restraints</code> -- <code>Add.</code> Select protein, and put <code>Force Constant</code> of 100, click <code>Apply</code> and <code>OK.</code> Then click down arrow left of the <code>Run</code> button in the parent window and click <code>Write</code>.

'''''NOTE''': In my experience, restrained MD runs with NgPT in Schrodinger often fail with the following message:''

<code>''Allowed momentum exceeded on 17 particles.''</code>

''This seems to be related to the interference of restraints and the ensemble, as no such error is observed if no restraints are imposed, or if NVT ensemble is used. Still, NgPT ensemble is important for correct membrane sampling, because simulations at NVT long enough to permit membrane relaxation, lead to the smearing of the lipid bilayer and the formation of empty space.''

''If the mentioned problem occurs, import the last coordinates of the run (*-out.cms) into Maestro, open Minimization menu, load the structure from the workspace, and apply restraints on lipid and protein (force restraint of 10 is usually enough). Then run a minimization for 100 ps, and try to run NgPT MD starting from the optimized structure.''

'''''TO CHECK''': NPT ensemble.''

Copy the project folder (desmond_md_job_X) to gimel, login to gimel5, edit desmond_md_job_X.sh: delete <code>-lic DESMOND_GPGPU:16</code> and insert <code>-HOST gimel5.gpu</code> (or <code>gimel5.heavygpu</code>). Export <code>$SCHRODINGER</code> variable:

<code>SCHRODINGER=/nfs/soft2/schrodinger/2023-3/</code> (or a newer version)

Run .sh file, and your task will be submitted to a queue. For my system it took 1.5 hr to complete.

Download the project folder to your PC, open Maestro, click <code>Import structure</code> and open <code>-out.cms</code>, click on <code>T</code> icon at the new entry in project table and click <code>Display Trajectory Snapshots</code>. Select the last one, click <code>Display</code> and check if the protein did not change position during MD run, then click <code>Export</code>, to Project Table, Frames Selected only. You will get a new entry in the Project Table. Export it to a <code>.pdb</code> file.

== Preparation of the structure for Blastermaster ==
Use <code>prepare.pml</code> script. You need to provide the <code>rec.pdb</code> to want to use for docking (potentially with missing loops) at that step as <code>xtal-prot.pdb</code>. Rename MD system as <code>last-mol.pdb</code>.

<code>pymol -qc last-mol.pdb xtal-prot.pdb prepare.pml</code>

<syntaxhighlight lang="python">
align last-mol, xtal-prot

remove ////SPC

create MEM, ////POPC

# these atom numbers do not exist in POPC or DPPC
# therefore, we do not remove protons from the lipid
# structure to make more spheres. Uncomment these
# lines and change to proper H numbers if needed.
#remove /MEM////HS
#remove /MEM////HX
#remove /MEM////HY
#remove /MEM////H*B
#remove /MEM////H*A
#remove /MEM////H*C

remove /MEM////C1
remove /MEM////C2
remove /MEM////C3
remove /MEM////C11
remove /MEM////C12
remove /MEM////C13
remove /MEM////C14
remove /MEM////C15

create MEM_C, /MEM////*C* | /MEM////H*R | /MEM////H*S | /MEM////H*T | /MEM////H*X | /MEM////H*Y | /MEM////H*Z

create shell, MEM_C within 20 of xtal-prot

set retain_order, 1

save shell.pdb, shell

alter shell, resn='LPD'
alter shell, name='C'
alter shell, chain=''

rebuild

save shell-LPD.pdb, shell
</syntaxhighlight>

== Preparation of grids with thinspheres ==
Prepare grids with thinspheres for the protein without lipid as described in https://wiki.docking.org/index.php/How_to_do_parameter_scanning

Create an empty directory and put <code>shell-LPD.pdb</code> there. Then run the following script:

<code>sh blast-membrane-thinsph-scan.sh {path to the collection of "es_ld_thin_sph_rad_X.X" directories} {path to the dir with original working and dockfiles directories}</code>

This script runs qnifft and solvmap for each <code>es_ld_thin_sph_rad_X.X</code> directory, and then uses the second script of parameter scanning protocol to combine files into <code>dockfiles</code> directories.

'''Important! LOOK AT YOUR GRIDS! Desolvation -- larger solvation where water is. Electrostatics -- no electrostatics in the lipid region. vdW -- no vdW in the lipid region.'''

<syntaxhighlight lang="bash">
#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK with thinsphere scan
# PREREQ -- run first step of https://wiki.docking.org/index.php/How_to_do_parameter_scanning (new_0001_generate_ES_LD_generation.py )
# first argument -- path to the directory where dirs "es_ld_thin_sph_rad_X.X" are stored.
# second argument -- path to the dir with original working and dockfiles
# run in a new directory with shell-LPD.pdb

run_once () {
# mkdir add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
cp $curr_dir/shell-LPD.pdb .
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a parameter scan, be sure to use the correct spheres!)
#cp -r $blastermaster_Prot/dockfiles .
#cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n SPH $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> receptor.crg.lowdielectric.pdb
need_files="amb.crg.oxt
qnifft.parm
vdw.siz"
for file in $need_files
do
if [ -e $blastermaster_Prot/working/$file ]
then
cp $blastermaster_Prot/working/$file .
else
cp $blast_orig/working/$file .
fi
done
echo "C lpd 0.000 LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

#cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
#echo "Check if the grid size changed, compare this with INDOCK"
#python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
#head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids if they are present in the folder
if [ -e $blastermaster_Prot/working/heavy ]
then
mkdir heavy
cd heavy || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/heavy/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
#cp ligand.desolv.heavy ../dockfiles/.
cd ../ || exit

mkdir hydrogen
cd hydrogen || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/hydrogen/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
#cp ligand.desolv.heavy ../dockfiles/.
fi
echo $dir " DONE!"
}

curr_dir=$(pwd)
workdirs=$1
blast_orig=$2
dirs=$(ls -d $workdirs/es_ld_thin_sph_rad_*)
for dir in $dirs
do
blastermaster_Prot=$dir
local_dir=$(echo $dir | awk -F"\/" '{print $NF}')
mkdir -p $local_dir/working
cd $local_dir/working || exit
run_once
cd $curr_dir || exit
done

python ~rstein/zzz.scripts/DOCK_prep_scripts/new_0002_combine_es_ld_grids_into_combos.py -p $blast_orig

</syntaxhighlight>

== Running blastermaster with default parameters ==
'''Warning! Do not use these grids, as the default grids with lipid spheres give incorrect solvation energies. Use the ones with thinspheres instead!'''
<syntaxhighlight lang="bash">
#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK
# first argument -- path to blastermaster files of the protein without membrane
# run in a new directory with shell-LPD.pdb
blastermaster_Prot=$1
# mkdir add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a parameter scan, be sure to use the correct spheres!)
cp -r $blastermaster_Prot/dockfiles .
cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n SPH $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> rec.crg.lowdielectric.pdb
cp $blastermaster_Prot/working/amb.crg.oxt .
echo "C lpd 0.000 LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
echo "Check if the grid size changed, compare this with INDOCK"
python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids
mkdir heavy
cd heavy || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/heavy/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
cd ../ || exit

mkdir hydrogen
cd hydroger || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/hydrogen/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
echo "DONE!"
</syntaxhighlight>

== Lipid membrane models from MemProtMD ==
If a protein-membrane complex was already modeled for your system and deposited at [http://memprotmd.bioch.ox.ac.uk/home/ MemProtMD] website, you can use it and skip doing MD in Schrodinger. The steps are very similar to the ones after Schrodinger run.

Download the <code>*_default_dppc.mpmd.finalframe.atomistic.pdb</code> file from the bottom of the page. Rename it to <code>last-mol.pdb</code>.

Use prepare.pml script (below). You need to provide the <code>rec.pdb</code> to want to use for docking (potentially with missing loops) at that step as <code>xtal-prot.pdb</code>.

<code>pymol -qc last-mol.pdb xtal-prot.pdb prepare.pml</code>

The script below differs from the one for processing Schrodinger results in two points: solvent residue is <code>SOL</code> instead of <code>SPC</code>, and lipids are called <code>DPPC</code> instead of <code>POPC</code>.<syntaxhighlight lang="python">
align last-mol, xtal-prot

remove ////SOL

create MEM, ////DPPC

remove /MEM////HS
remove /MEM////HX
remove /MEM////HY
remove /MEM////H*B
remove /MEM////H*A
remove /MEM////H*C

remove /MEM////C1
remove /MEM////C2
remove /MEM////C3
remove /MEM////C11
remove /MEM////C12
remove /MEM////C13
remove /MEM////C14
remove /MEM////C15

create MEM_C, /MEM////*C* | /MEM////H*R | /MEM////H*S | /MEM////H*T | /MEM////H*X | /MEM////H*Y | /MEM////H*Z

create shell, MEM_C within 20 of xtal-prot

set retain_order, 1

save shell.pdb, shell

alter shell, resn='LPD'
alter shell, name='C'
alter shell, chain=''

rebuild

save shell-LPD.pdb, shell
</syntaxhighlight>

== Positioning of the membrane ==
If there is no precomputed membrane position from OPM or MemProtMD, you can model it using PPM 3.0 webserver (https://opm.phar.umich.edu/ppm_server3) or standalone software. The software is installed in <code>/nfs/home/ak87/exa/PROGRAM/OPM-MEMBRANE</code>. Copy <code>res.lib</code> file from the program directory to the directory with your .pdb file. Create input file like this:<syntaxhighlight lang="shell">
1
0 PMm out rec.pdb
</syntaxhighlight><blockquote>0 or 1 -“do not use” or “use” heteroatoms in the input PDB file, respectively (solvent molecules are always excluded).

MOM - type of membrane (see list of 3-letter codes for membranes below)

“in” or “out” means topology of N-terminus of first subunit included in the corresponding input PDB file

With this option, for every input pdb file, the program will selected automatically the flat or curved membrane boundaries, whichever had the lower calculated transfer energy.</blockquote>Then run the program:

<code>~ak87/exa/PROGRAM/OPM-MEMBRANE/immers<1membrane.inp>rec-opm.out</code>

Use <code>datasub1</code> file to extract residue numbers for the membrane placement in Maestro.
See ppm3_instructions.docx file in the program directory for more detail. Test

User:Iamkaant

2023-01-10T17:58:54Z

Iamkaant:

My name is Andrii Kyrylchuk and I’m a PhD in organic chemistry. My research work began with quantum chemistry; then I’ve devoted almost 7 years to organic synthesis and then returned to computations. Now my work is mainly focused on quantum chemistry & molecular docking, but I’m also interested in the different topics of organic chemistry, general chemistry, NMR, biochemistry, IT, astronomy etc.

2022-06-14T02:51:31Z

Iamkaant: added troubleshooting of MD runs

Written by Stefan Gahbauer, 2019/11/03

In order to account for ligand desolvation and electrostatic interactions in the low-dielectric environment of the hydrophobic membrane core, a lipid-bilayer is generated around the target receptor and included in the docking score grid generation.
Aiming at a fast, robust and computationally effective equilibration of the lipid bilayer around the embedded transmembrane receptor, coarse-grained (CG) molecular dynamics (MD) simulations and (if needed) subsequent atomistic simulations are employed.

= Required software and datasets =

'''Gromacs''' (v5 or newer) - Molecular Dynamics software package (http://manual.gromacs.org/)

'''CHARMM36m force field''' (http://mackerell.umaryland.edu/charmm_ff.shtml)

'''MARTINI''' Coarse-grained force field parameters(http://cgmartini.nl/)

'''DSSP''' - Secondary Structure assignment (https://swift.cmbi.umcn.nl/gv/dssp/ , https://anaconda.org/salilab/dssp)

'''martinize.py''' - Coarse-graining atomistic protein structures (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

'''insane.py''' - INSerting proteins in coarse-grained MembrANE (http://cgmartini.nl/index.php/tools2/proteins-and-bilayers)

'''initram.sh''' and '''backward.py''' - Conversion of coarse-grained system to atomistic resolution (http://cgmartini.nl/index.php/tools2/resolution-transformation)

= 1) Setting up the coarse-grained system =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0001-prepare-protein-CG-membrane.sh

== 1.1) Prepare your files ==

Copy the script above to your working directory.

Copy your rec.pdb to your working directory.

If your rec.pdb has gaps, e.g. unresolved loops between transmembrane helices in case of GPCRs, try to model missing residues.

One way is to use MODELLER following https://salilab.org/modeller/wiki/Missing%20residues.

Corresponding input scripts for modeller can be found in:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/modeller

== 1.2) Run the script ==

Login to gimel2.

./0001-prepare-protein-CG-membrane.sh

The script reads rec.pdb and copies all other required files from

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/gromacs

=== Workflow ===

'''a.''' Generate CHARMM36m force field parameters of your protein in a Gromacs-readable format.

Used tool: gmx pdb2gmx

Output files are stored in the generated ''pdb2gmx'' directory

-conf.gro / conf.pdb - Gromacs coordinate file

-topol.top / Protein-atomistic.itp - Gromacs topology file, i.e. force field description of your input structure

-posre.itp - Position restraints for heavy atoms of atomistic protein strucutre.

'''b.''' Build coarse-grained structure

Used tool: martinize.py

Output files are stored in the generated ''martini'' directory.

-chain_.ssd - Output from the DSSP program that is called by martinize.py

-prot-cg.pdb - Coarse-grained protein structure

-prot-cg.top - Coarse-grained Martini topology of system

-Protein.itp - Coarse-grained Martini description of Protein structure

-prot-rot.pdb - Coarse-grained protein structure aligned along z-axis of the simulation box according to the proteins first principal component axis. This ensures the correct placement of the protein during membrane preparation. '''You may have to adjust the orientation of your input structure prior to membrane modeling.'''

'''c.''' Build coarse-grained membrane

Used tool: insane.py

Here, a lipid bilayer will be created around the protein structure (in the x/y-plane) and water will be added to the system. The default box shape is rectangular and the size is set to x,y=10nm, z=11nm. This can be changed in the ./insane.py command line. The default lipid type is POPC, you can change that to arbitrary lipid compositions using the -l and -u options of of insane.py.

Output files are stored in the ''martini'' directory.

-out.top / topol-cg.top - Topology of coarse-grained system

-cg-membrane.gro/.pdb - Coarse-grained system coordinates. '''Carefully inspect and visualize the cg-membrane.pdb.'''

Use PyMOL to check if you're protein is embedded correctly in the lipid bilayer.

= 2) Simulating the coarse-grained system =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0002-run-CG-Minimization-and-MD.sh

== 2.1) Run the script ==

Copy the script above to your working directory.

./0002-run-CG-Minimization-and-MD.sh

=== Workflow ===

'''a.''' Minimize coarse-grained system

Used tools: gmx grompp , gmx mdrun

gmx grompp generates a single .tpr file that contains all information necessary for running a MD simulation or minimization using gmx mdrun.

Minimization parameters are provided in martini_new-rf_min.mdp. The system will be minimized in 500 steps using steepest descent. The protein structure will be frozen during minimization.

Output files are stored in the ''martini'' directory.

-min.tpr - MD run input file

-min.log - Output log file from minimization

-min.trr - Minimization trajectory

-min.gro - Minimized system coordinates

'''b.''' Simulate coarse-grained system

MD simulation parameters are provided in martini_v2.x_new-rf.mdp. Strong position restraints are applied on the protein structure during the simulation. The system will be simulated for 50ns.

Output files are stored in the ''martini'' directory.

-md.tpr - MD run input file

-md.log - Output log file from simulation

-md.trr - lossless trajectory of simulation

-md.xtc - coordinates of simulation trajectory

-md.gro - coordinates of final simulation snapshot

'''The simulation will run for roughly 3 hours.'''

= 3) Converting coarse-grained system to atomistic resolution and select lipid atoms for grid generation =

This is automated in the script:

/mnt/nfs/home/stefan/zzz.scripts/INSERT-MEMBRANE/FILES/0003-backmap-and-lpd-selection.sh

== 3.1) Run the script ==

Copy the script above to your working directory.

./0003-backmap-and-lpd-selection.sh

=== Workflow ===

'''a.''' Backmapping from coarse-grained to atomistic

Used tool: gmx trjconv, initram.sh

gmx trjconv can perform a variety of conversions of MD trajectory, e.g. making molecules broken over the periodic boundary conditions whole again.

initram.sh calls the backward.py program which performs the backmapping of input coarse-grained to atomistic systems, and performs a small series of short minimizations and simulations to relax the backmapped system.

Output files are stored in the generated ''backmap'' directory.

-0-backmapped.gro / projected.gro - initial backmapped coordinates

-backmapped.top - Topology of atomistic system

-1-EM*/2-EM* - Output from minimizations

-3-mdpr*/4-mdpr*/5-mdpr*/6-mdpr* - Output from simulations

-backmapped.gro - Coordinates of final backmapped and relaxed system

'''This may need a few attempts to work all the way through. There is a while-loop that only stops until all relaxation steps have finished.'''

'''b.''' Replacing backmapped protein with initial atomistic protein structure

Used tool: PyMOL script align.pml

The pymol script will align the initial "Gromacs"-protein structure (conf.pdb) onto the backmapped structure and combine the fitted protein coordinates with the coordinates of the lipid and solvent environment.

Output files are stored in the generated ''backmap/prepare_AA_system'' directory.

-conf-fitted.pdb - Fitted initial protein structure

-backmapped-environment.pdb - All membrane and water coordinates

-fitted_system.pdb - Complete system containing fitted protein and environment coordinates

'''Be sure that fitted_system.pdb has the same number of coordinates as backmapped-mol.pdb.''' If there is a discrepancy there might be an issue with the PyMOL version you're using to run align.pml. Using PyMOL v2 or newer seems to avoid any issues. You can also generate backmapped-environment.pdb manually by taking all POPC and Water cooridnates from backmapped-mol.pdb.

'''c.''' Run minimizations of atomistic system

Used tools: gmx grompp, gmx mdrun

Output files are stored in the ''backmap/prepare_AA_system'' directory.

Tow minimizations will be calculated.

1) Minimization with frozen protein coordinates: 1,500 steps steepest descent (min_freeze.mdp).

-min_freeze* - Output files of first minimization

2) Minimization of full system: 500 steps (min.mdp).

-min* - Output files of second minimization

'''d.''' Select lipid atoms for DOCK grid generation

Used tools: PyMOL script prepare.pml

This will select carbon and hydrogen atoms of the hydrophobic lipid tail segments in a radius of 1.7 nm around the protein and assign them to the atom type "LPD"

You need to provide the rec.pdb to want to use for docking (potentially with missing loops) at that step as xtal-prot.pdb

Output files are stored in the generated ''backmap/prepare_AA_system/prepare_min'' directory.

-shell-LPD.pdb - all LPD atoms selected for grid generation

Add these coordinates to your docking protein structure and provide a amb.crg.oxt file adding

C lpd 0.000 LIPID SPHERE

Now you can run blastermaster.

= Membrane modelling in Schrodinger =
Written by Andrii Kyrylchuk, 2022/04/20

== MD of protein and membrane ==
Import structure '''without''' ligand, use Preparation wizard as described in "Code for Controls..." to model missing loops and capping.

Then open System Builder, click <code>Setup membrane</code>.

Go to the website https://opm.phar.umich.edu/proteins/ and find your protein. Copy residue numbers from the bottom of the page to the field <code>Transmembrane atoms...</code>. The format is as follows:

<code>res.num 76-97,112-136,141,...</code>

Click <code>Place Automatically</code>, <code>OK</code>. Then click <code>Run</code>. Examine lipids and solvent after run completes.

Then use Molecular Dynamics menu to set up the calculation. Select prepared system, click <code>Load</code> on top of the menu. Put simulation time of 5 ns, <code>Advanced Options</code> -- <code>Restraints</code> -- <code>Add.</code> Select protein, and put <code>Force Constant</code> of 100, click <code>Apply</code> and <code>OK.</code> Then click down arrow left of the <code>Run</code> button in the parent window and click <code>Write</code>.

'''''NOTE''': In my experience, restrained MD runs with NgPT in Schrodinger often fail with the following message:''

<code>''Allowed momentum exceeded on 17 particles.''</code>

''This seems to be related to the interference of restraints and the ensemble, as no such error is observed if no restraints are imposed, or if NVT ensemble is used. Still, NgPT ensemble is important for correct membrane sampling, because simulations at NVT long enough to permit membrane relaxation, lead to the smearing of the lipid bilayer and the formation of empty space.''

''If the mentioned problem occurs, import the last coordinates of the run (*-out.cms) into Maestro, open Minimization menu, load the structure from the workspace, and apply restraints on lipid and protein (force restraint of 10 is usually enough). Then run a minimization for 100 ps, and try to run NgPT MD starting from the optimized structure.''

'''''TO CHECK''': NPT ensemble.''

Copy the project folder (desmond_md_job_X) to gimel, login to gimel5, edit desmond_md_job_X.sh: delete <code>-lic DESMOND_GPGPU:16</code> and insert <code>-HOST gimel5.gpu</code> (or <code>gimel5.heavygpu</code>). Run .sh file, and your task will be submitted to a queue. For my system it took 1.5 hr to complete.

Download the project folder to your PC, open Maestro, click <code>Import structure</code> and open <code>-out.cms</code>, click on <code>T</code> icon at the new entry in project table and click <code>Display Trajectory Snapshots</code>. Select the last one, click <code>Display</code> and check if the protein did not change position during MD run, then click <code>Export</code>, to Project Table, Frames Selected only. You will get a new entry in the Project Table. Export it to a <code>.pdb</code> file.

== Preparation of the structure for Blastermaster ==
Use <code>prepare.pml</code> script. You need to provide the <code>rec.pdb</code> to want to use for docking (potentially with missing loops) at that step as <code>xtal-prot.pdb</code>. Rename MD system as <code>last-mol.pdb</code>.

<code>pymol -qc last-mol.pdb xtal-prot.pdb prepare.pml</code>

<syntaxhighlight lang="python">
align last-mol, xtal-prot

remove ////SPC

create MEM, ////POPC

# these atom numbers do not exist in POPC or DPPC
# therefore, we do not remove protons from the lipid
# structure to make more spheres. Uncomment these
# lines and change to proper H numbers if needed.
#remove /MEM////HS
#remove /MEM////HX
#remove /MEM////HY
#remove /MEM////H*B
#remove /MEM////H*A
#remove /MEM////H*C

remove /MEM////C1
remove /MEM////C2
remove /MEM////C3
remove /MEM////C11
remove /MEM////C12
remove /MEM////C13
remove /MEM////C14
remove /MEM////C15

create MEM_C, /MEM////*C* | /MEM////H*R | /MEM////H*S | /MEM////H*T | /MEM////H*X | /MEM////H*Y | /MEM////H*Z

create shell, MEM_C within 20 of xtal-prot

set retain_order, 1

save shell.pdb, shell

alter shell, resn='LPD'
alter shell, name='C'
alter shell, chain=''

rebuild

save shell-LPD.pdb, shell
</syntaxhighlight>

== Preparation of grids with thinspheres ==
Prepare grids with thinspheres for the protein without lipid as described in https://wiki.docking.org/index.php/How_to_do_parameter_scanning

Create an empty directory and put <code>shell-LPD.pdb</code> there. Then run the following script:

<code>sh blast-membrane-thinsph-scan.sh {path to the collection of "es_ld_thin_sph_rad_X.X" directories} {path to the dir with original working and dockfiles directories}</code>

This script runs qnifft and solvmap for each <code>es_ld_thin_sph_rad_X.X</code> directory, and then uses the second script of parameter scanning protocol to combine files into <code>dockfiles</code> directories.

'''Important! LOOK AT YOUR GRIDS! Desolvation -- larger solvation where water is. Electrostatics -- no electrostatics in the lipid region. vdW -- no vdW in the lipid region.'''

<syntaxhighlight lang="bash">
#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK with thinsphere scan
# PREREQ -- run first step of https://wiki.docking.org/index.php/How_to_do_parameter_scanning (new_0001_generate_ES_LD_generation.py )
# first argument -- path to the directory where dirs "es_ld_thin_sph_rad_X.X" are stored.
# second argument -- path to the dir with original working and dockfiles
# run in a new directory with shell-LPD.pdb

run_once () {
# mkdir add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
cp $curr_dir/shell-LPD.pdb .
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a parameter scan, be sure to use the correct spheres!)
#cp -r $blastermaster_Prot/dockfiles .
#cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n SPH $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> receptor.crg.lowdielectric.pdb
need_files="amb.crg.oxt
qnifft.parm
vdw.siz"
for file in $need_files
do
if [ -e $blastermaster_Prot/working/$file ]
then
cp $blastermaster_Prot/working/$file .
else
cp $blast_orig/working/$file .
fi
done
echo "C lpd 0.000 LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

#cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
#echo "Check if the grid size changed, compare this with INDOCK"
#python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
#head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids if they are present in the folder
if [ -e $blastermaster_Prot/working/heavy ]
then
mkdir heavy
cd heavy || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/heavy/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
#cp ligand.desolv.heavy ../dockfiles/.
cd ../ || exit

mkdir hydrogen
cd hydrogen || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/hydrogen/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
#cp ligand.desolv.heavy ../dockfiles/.
fi
echo $dir " DONE!"
}

curr_dir=$(pwd)
workdirs=$1
blast_orig=$2
dirs=$(ls -d $workdirs/es_ld_thin_sph_rad_*)
for dir in $dirs
do
blastermaster_Prot=$dir
local_dir=$(echo $dir | awk -F"\/" '{print $NF}')
mkdir -p $local_dir/working
cd $local_dir/working || exit
run_once
cd $curr_dir || exit
done

python ~rstein/zzz.scripts/DOCK_prep_scripts/new_0002_combine_es_ld_grids_into_combos.py -p $blast_orig

</syntaxhighlight>

== Running blastermaster with default parameters ==
'''Warning! Do not use these grids, as the default grids with lipid spheres give incorrect solvation energies. Use the ones with thinspheres instead!'''
<syntaxhighlight lang="bash">
#!/bin/bash
# a script to prepare a protein with the lipid membrane for DOCK
# first argument -- path to blastermaster files of the protein without membrane
# run in a new directory with shell-LPD.pdb
blastermaster_Prot=$1
# mkdir add_membrane
# Run qnifft for electrostatic grids
# cp $blastfiles_membrane/shell-LPD.pdb
# delete all fields except HETATM from shell-LPD.pdb
sed -ir '/^CRYST1/d' shell-LPD.pdb
sed -ir '/^CONECT/d' shell-LPD.pdb
sed -ir '/^END/d' shell-LPD.pdb
cp $blastermaster_Prot/working/rec.crg.pdb .
#cp $blastermaster_Prot/working/lowdielectric.sph.pdb # (if wanted, if you run a parameter scan, be sure to use the correct spheres!)
cp -r $blastermaster_Prot/dockfiles .
cp -r $blastermaster_Prot/INDOCK .
mv rec.crg.pdb rec.crg.solo.pdb
cat rec.crg.solo.pdb shell-LPD.pdb > receptor.crg.lowdielectric.pdb
# extracting SPH lines from receptor
number=$(grep -n SPH $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb | head -n 1 | awk -F ":" '{print $1}')
tail --lines=+$number $blastermaster_Prot/working/receptor.crg.lowdielectric.pdb >> rec.crg.lowdielectric.pdb
cp $blastermaster_Prot/working/amb.crg.oxt .
echo "C lpd 0.000 LIPID SPHERE" >> amb.crg.oxt
cp $blastermaster_Prot/working/box .
cp $blastermaster_Prot/working/qnifft.parm .
cp $blastermaster_Prot/working/vdw.siz .
$DOCKBASE/proteins/qnifft/bin/qnifft22_193_pgf_32 qnifft.parm >& qnifft.log
$DOCKBASE/proteins/blastermaster/phiTrim.py qnifft.electrostatics.phi box trim.electrostatics.phi >& trim.log

cp trim.electrostatics.phi dockfiles/.

# Check if the grid size changed:
echo "Check if the grid size changed, compare this with INDOCK"
python ~jklyu/zzz.script/pymol_movie/phi_to_dx.py trim.electrostatics.phi trim.electrostatics.phi.dx > gridsize
head -n 1 gridsize
# compare with delphi_nsize in INDOCK

# Run solvmap for Ligand Desolvation Grids
mkdir heavy
cd heavy || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/heavy/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
cd ../ || exit

mkdir hydrogen
cd hydroger || exit
cp ../receptor.crg.lowdielectric.pdb rec.crg.lds.pdb # (with thin spheres [if needed] and membrane)
cp ../box .
cp $blastermaster_Prot/working/hydrogen/INSEV .
$DOCKBASE/proteins/solvmap/bin/solvmap >& solvmap.log
cp ligand.desolv.heavy ../dockfiles/.
echo "DONE!"
</syntaxhighlight>

== Lipid membrane models from MemProtMD ==
If a protein-membrane complex was already modeled for your system and deposited at [http://memprotmd.bioch.ox.ac.uk/home/ MemProtMD] website, you can use it and skip doing MD in Schrodinger. The steps are very similar to the ones after Schrodinger run.

Download the <code>*_default_dppc.mpmd.finalframe.atomistic.pdb</code> file from the bottom of the page. Rename it to <code>last-mol.pdb</code>.

Use prepare.pml script (below). You need to provide the <code>rec.pdb</code> to want to use for docking (potentially with missing loops) at that step as <code>xtal-prot.pdb</code>.

<code>pymol -qc last-mol.pdb xtal-prot.pdb prepare.pml</code>

The script below differs from the one for processing Schrodinger results in two points: solvent residue is <code>SOL</code> instead of <code>SPC</code>, and lipids are called <code>DPPC</code> instead of <code>POPC</code>.<syntaxhighlight lang="python">
align last-mol, xtal-prot

remove ////SOL

create MEM, ////DPPC

remove /MEM////HS
remove /MEM////HX
remove /MEM////HY
remove /MEM////H*B
remove /MEM////H*A
remove /MEM////H*C

remove /MEM////C1
remove /MEM////C2
remove /MEM////C3
remove /MEM////C11
remove /MEM////C12
remove /MEM////C13
remove /MEM////C14
remove /MEM////C15

create MEM_C, /MEM////*C* | /MEM////H*R | /MEM////H*S | /MEM////H*T | /MEM////H*X | /MEM////H*Y | /MEM////H*Z

create shell, MEM_C within 20 of xtal-prot

set retain_order, 1

save shell.pdb, shell

alter shell, resn='LPD'
alter shell, name='C'
alter shell, chain=''

rebuild

save shell-LPD.pdb, shell
</syntaxhighlight>

Conversion of .rxn files to reaction SMARTS

2022-06-06T21:14:39Z

Iamkaant: Created page with "DataWarrior folks have implemented enumeration protocol in their soft. It is published in 10.1021/acs.jcim.1c01041 They store reactions in .rxn format, which is documented in..."

DataWarrior folks have implemented enumeration protocol in their soft. It is published in 10.1021/acs.jcim.1c01041

They store reactions in .rxn format, which is documented in https://docs.chemaxon.com/display/docs/mdl-molfiles-rgfiles-sdfiles-rxnfiles-rdfiles-formats.md. The files can be found in "reactions" directory in https://github.com/joewah/Virtual-Fragment-Spaces

The .rxn files can be opened in ChemAxon MarvinSketch. Then just select the reaction, right click, Copy as -- Daylight SMARTS. And you get the correct SMARTS with all atom lists. For example, for amidation reaction I got

[#7,#8]-[#7H1:1](-[#6:2]=[#7,#8,#16])-[#6,#16]=[#7,#8,#16].[#6:3]-[#6:4](-[#8,#17;D1])=[O:5]>>[#6:2]-[#7:1]-[#6:4](-[#6:3])=[O:5]