IFP Filtering on Wynton
Seth Vigneron Oct 2024
IFP filtering on Wynton proceeds almost exactly as it does on our gimel cluster. Note that the conda environment being sourced uses the same older versions of Python and LUNA as JK's original scripts on gimel, so Wynton runs the same IFP protocol and does not use updated versions of any software.
1. Run the getposes script after your screen has finished.
2. Make split mol2 files to be run in parallel
ls $PWD/poses_extract_for_getposes_parallel_*mol2 > ifp_mol2_dirlist
mkdir ifp
cd ifp
cp -r /wynton/group/bks/work/shared/svigneron/IFP_wynton_scrips/scripts .
vim scripts/ifp_interactions.py
Add your desired interaction filters to the filters list. An example of how to format the interaction and the residue name/number:
filters = [['Hydrogen bond','GLY-333'],['Hydrogen bond','ALA-353'],['Hydrogen bond','TYR-368']]
mkdir working
cd working
while read line; do python ../scripts/lc_blazing_fast_separate_mol2_into_smaller_files_called_filter-XXX.py $line 2000 ; done <../../ifp_mol2_dirlist
ls *.mol2 > dirlist
cp /path/to/rec.crg.pdb .
vim rec.crg.pdb
Be sure to change any HIE or HID to HIS, and revert the names of any tarted residues to their original residue names, as the tarted names will not be recognized by LUNA.
csh ../scripts/submit.csh /path/to/working /path/to/scripts <name of receptor.pdb without .pdb at the end>
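The histidine renaming in the receptor cleanup above can also be scripted instead of edited by hand in vim. The following is a minimal sketch, not part of the shared scripts directory: the function name normalize_histidines is hypothetical, and it assumes rec.crg.pdb follows the standard fixed-column PDB layout, where the residue name sits in columns 18-20 of ATOM/HETATM records. It does not touch tarted residues, since their names vary from project to project and still need to be reverted manually.

```python
# Hypothetical helper (not one of the lab scripts): rename HIE/HID to HIS.
HIS_VARIANTS = {"HIE", "HID"}


def normalize_histidines(pdb_lines):
    """Return a copy of the PDB lines with HIE/HID residue names set to HIS."""
    fixed = []
    for line in pdb_lines:
        # Assumption: standard PDB columns, residue name in columns 18-20
        # (0-based slice 17:20) of ATOM and HETATM records.
        if line.startswith(("ATOM", "HETATM")) and line[17:20] in HIS_VARIANTS:
            line = line[:17] + "HIS" + line[20:]
        fixed.append(line)
    return fixed
```

To use it, read rec.crg.pdb with readlines(), pass the list through normalize_histidines, and write the result back with writelines(). It is still worth opening the file afterwards to confirm the residue names look right before submitting.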
3. Combine parallel IFP runs
ls -d --color=never [0-9]* > dirlist_combine
python ../scripts/check_finished_notfinished_ifp.py
Any jobs that failed to run will be listed in NOT-FINISHED_dirlist. To re-run these:
csh ../scripts/resubmit.csh /path/to/working /path/to/scripts <name of receptor.pdb without .pdb at the end> NOT-FINISHED_dirlist
python ../scripts/combine_ifp.py dirlist_combine combined
4. Collect Filtered Molecules
The combined.interactions.csv file lists each molecule's interactions in the following columns:
$1 : ZINC ID
$2 : # of H-bond donors
$3 : # of H-bond acceptors
$4 : # of unsatisfied H-bond donors
$5 : # of unsatisfied H-bond acceptors
$6 and onwards : the additional interactions specified in ifp_interactions.py
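Typical protocol for the lab is to remove any compound with unsatisfied H-bond donors or with too many unsatisfied H-bond acceptors. The command below keeps only compounds with no unsatisfied donors ($4==0) and at most 2 unsatisfied acceptors ($5<=2); change the $5 cutoff if your project uses a different threshold.
awk -F "," '$4==0 && $5<=2 && $6==1' combined.interactions.csv > ifp_filtered.interactions.csv
where $6 and so on are your additional filters. Add one condition for each additional interaction column you want to require (for example, && $7==1).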
awk -F "," '{print $1}' ifp_filtered.interactions.csv > ifp_filtered.interactions.zincid
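For projects with several additional interaction columns, the same filter can be written in Python, where the thresholds are easier to adjust. This is a rough sketch that mirrors the awk commands above under the same column layout; it is not part of the lab scripts. The function names and the max_unsat_acceptors and required_columns parameters are hypothetical. Like the awk command, it treats a value of 1 in an interaction column as "interaction present", and it drops any row whose fields are not numeric (such as a header line).

```python
import csv


def filter_ifp_rows(rows, max_unsat_acceptors=2, required_columns=(6,)):
    """Keep rows with no unsatisfied donors, at most max_unsat_acceptors
    unsatisfied acceptors, and a value of 1 in every required column.

    Column numbers are 1-based so they match the awk $N notation.
    """
    kept = []
    for row in rows:
        try:
            unsat_donors = float(row[3])     # awk $4
            unsat_acceptors = float(row[4])  # awk $5
            required = [float(row[c - 1]) for c in required_columns]
        except (ValueError, IndexError):
            # Header lines or malformed rows never pass the filter.
            continue
        if (unsat_donors == 0
                and unsat_acceptors <= max_unsat_acceptors
                and all(value == 1 for value in required)):
            kept.append(row)
    return kept


def filter_ifp_csv(in_path, out_csv, out_ids, **kwargs):
    """Write the filtered rows and their ZINC IDs; return the number kept."""
    with open(in_path, newline="") as fh:
        kept = filter_ifp_rows(csv.reader(fh), **kwargs)
    with open(out_csv, "w", newline="") as fh:
        csv.writer(fh, lineterminator="\n").writerows(kept)
    with open(out_ids, "w") as fh:
        fh.writelines(row[0] + "\n" for row in kept)
    return len(kept)
```

For example, filter_ifp_csv("combined.interactions.csv", "ifp_filtered.interactions.csv", "ifp_filtered.interactions.zincid", required_columns=(6, 7)) would require both the first and second additional interactions.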