IFP Filtering on Wynton

From DISI
Revision as of 21:13, 8 October 2024 by Sevigneron

Seth Vigneron Oct 2024

IFP filtering on Wynton proceeds in almost exactly the same way as on our gimel cluster. Note that the conda environment sourced here uses the same older versions of Python and LUNA as JK's original scripts on gimel, so Wynton does not use updated versions of any software and still runs the same IFP protocol.

1. Run the getposes script after your screen.

2. Make split mol2 files to be run in parallel

  ls $PWD/poses_extract_for_getposes_parallel_*mol2 > ifp_mol2_dirlist
  mkdir ifp
  cd ifp
  cp -r /wynton/group/bks/work/shared/svigneron/IFP_wynton_scrips/scripts .
  vim scripts/ifp_interactions.py 

Add your desired interaction filters to the filters list. An example of the formatting for the interaction and the residue name/number:

  filters = [['Hydrogen bond','GLY-333'],['Hydrogen bond','ALA-353'],['Hydrogen bond','TYR-368']]

Save the file, then create a working directory and split each mol2 file into smaller files:

  mkdir working
  cd working
  while read line; do python ../scripts/lc_blazing_fast_separate_mol2_into_smaller_files_called_filter-XXX.py $line 2000 ; done <../../ifp_mol2_dirlist
  ls *.mol2 > dirlist
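
Going back to the filters list edited in scripts/ifp_interactions.py: a typo in an entry is easy to miss before submitting jobs. The following is a minimal sketch of a format check, assuming every entry should follow the [interaction, 'RESNAME-RESNUM'] pattern of the example above. The function name check_filters and the residue-label pattern are assumptions made for illustration and are not part of the lab scripts. It does not check whether LUNA recognizes the interaction name.

```python
import re

# Assumed label format, inferred from the example entries: a three-character
# residue name, a hyphen, then the residue number (e.g. 'GLY-333').
RESIDUE_LABEL = re.compile(r"^[A-Z0-9]{3}-\d+$")


def check_filters(filters):
    """Return a list of formatting problems found in a filters list (empty if none)."""
    problems = []
    for i, entry in enumerate(filters):
        # Each entry should be a pair: [interaction type, residue label].
        if not isinstance(entry, (list, tuple)) or len(entry) != 2:
            problems.append(f"entry {i}: expected [interaction, residue], got {entry!r}")
            continue
        interaction, residue = entry
        if not isinstance(interaction, str) or not interaction.strip():
            problems.append(f"entry {i}: interaction name {interaction!r} is empty or not a string")
        if not isinstance(residue, str) or not RESIDUE_LABEL.match(residue):
            problems.append(f"entry {i}: residue label {residue!r} does not look like 'GLY-333'")
    return problems


filters = [['Hydrogen bond', 'GLY-333'], ['Hydrogen bond', 'ALA-353'], ['Hydrogen bond', 'TYR-368']]
for problem in check_filters(filters):
    print(problem)
```

An empty result only means the entries are well-formed; it does not confirm that the residues exist in your receptor.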

Copy your receptor file into the working directory and edit it:

  cp /path/to/rec.crg.pdb .
  vim rec.crg.pdb

Be sure to change any HIE or HID to HIS, and revert the names of any tarted residues to their standard names, as LUNA will not recognize them.
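
For receptors with many histidines, the renaming can be scripted instead of done by hand. The following is a minimal sketch, assuming a standard fixed-column PDB file in which the residue name occupies columns 18-20 of ATOM, HETATM, TER and ANISOU records. The function name rename_residues and the RENAME table are hypothetical. Names for tarted residues are specific to your setup, so they are not included and would need to be added to the table yourself.

```python
# Residue names to rewrite. Only the HIE/HID -> HIS cases from the text are
# listed; add your own tarted-residue names here (not known in advance).
RENAME = {"HIE": "HIS", "HID": "HIS"}

# PDB record types that carry a residue name in columns 18-20.
RECORDS = ("ATOM", "HETATM", "TER", "ANISOU")


def rename_residues(lines, rename=RENAME):
    """Return a copy of the PDB lines with residue names replaced per `rename`."""
    out = []
    for line in lines:
        if line.startswith(RECORDS) and len(line) >= 20:
            resname = line[17:20]  # columns 18-20, zero-based slice
            if resname in rename:
                line = line[:17] + rename[resname] + line[20:]
        out.append(line)
    return out


# Small in-memory demonstration with made-up coordinates.
example = [
    "ATOM      1  N   HIE A  57      11.104   6.134  -6.504  1.00  0.00\n",
    "ATOM      2  CA  HID A  58      11.639   6.071  -5.147  1.00  0.00\n",
    "ATOM      3  N   GLY A  59      12.000   7.000  -4.000  1.00  0.00\n",
]
for line in rename_residues(example):
    print(line, end="")
```

To apply it to a real file, read rec.crg.pdb with readlines(), pass the lines through rename_residues, and write the result to a new file so the original stays available for comparison.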

Then submit the jobs:

  csh ../scripts/submit.csh /path/to/working /path/to/scripts <name of receptor.pdb without .pdb at the end>

3. Combine parallel IFP runs

  ls -d --color=never [0-9]* > dirlist_combine
  python ../scripts/check_finished_notfinished_ifp.py

If any jobs failed to run, they are listed in NOT-FINISHED_dirlist. To re-run them:

  csh ../scripts/resubmit.csh /path/to/working /path/to/scripts <name of receptor.pdb without .pdb at the end> NOT-FINISHED_dirlist

Once all jobs have finished, combine the results:

  python ../scripts/combine_ifp.py dirlist_combine combined

4. Collect filtered molecules

The combined.interactions.csv file lists each molecule's interactions in the following columns:

  $1 : ZINC ID
  $2 : # of H-bond donors
  $3 : # of H-bond acceptors
  $4 : # of unsatisfied H-bond donors
  $5 : # of unsatisfied H-bond acceptors
  $6 onwards : the additional interactions specified in ifp_interactions.py

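Typical protocol for the lab is to remove any compound with unsatisfied H-bond donors or with too many unsatisfied H-bond acceptors. As written, the command below keeps compounds with no unsatisfied donors ($4==0) and at most 2 unsatisfied acceptors ($5<=2); change the $5 threshold if your cutoff differs (for example, $5<=3 removes only compounds with more than 3).

  awk -F "," '$4==0 && $5<=2 && $6==1' combined.interactions.csv > ifp_filtered.interactions.csv

where $6 and so on are your additional filters; add one condition per filter column (for example, && $7==1).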

Then extract the ZINC IDs of the filtered molecules:

  awk -F "," '{print $1}' ifp_filtered.interactions.csv > ifp_filtered.interactions.zincid
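
If you prefer to keep the thresholds as named parameters rather than editing the awk expression, the same filtering can be sketched in Python. This is an illustrative equivalent, not one of the lab scripts: the function name filter_ifp and its parameters are hypothetical, the column layout is taken from the description above, and the defaults mirror the awk command as written. Rows whose count columns are not numeric (such as a header line, if the file has one) are skipped.

```python
import csv
import io


def filter_ifp(rows, max_unsat_donors=0, max_unsat_acceptors=2, required_columns=(6,)):
    """Yield rows that pass the thresholds. Column numbers are 1-indexed, as in awk."""
    for row in rows:
        try:
            unsat_donors = float(row[3])      # $4
            unsat_acceptors = float(row[4])   # $5
            required = [float(row[c - 1]) for c in required_columns]
        except (ValueError, IndexError):
            continue  # header line or malformed row
        if (unsat_donors <= max_unsat_donors
                and unsat_acceptors <= max_unsat_acceptors
                and all(value == 1 for value in required)):
            yield row


# In-memory demonstration with made-up IDs and counts.
demo = io.StringIO(
    "ZINC000000000001,2,3,0,1,1\n"
    "ZINC000000000002,1,4,1,0,1\n"
    "ZINC000000000003,3,5,0,3,1\n"
    "ZINC000000000004,2,2,0,2,0\n"
)
kept = list(filter_ifp(csv.reader(demo)))
zinc_ids = [row[0] for row in kept]
print(zinc_ids)
```

For a real run, replace the in-memory demo with open("combined.interactions.csv", newline="") and write the kept rows and their IDs to the two output files named above.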