Strain Filtering

Revision as of 21:55, 17 June 2020

This is Strain Filtering version 1.1 (20200218). Copy the code to your current directory:

$ cp -r /mnt/nfs/home/sgu/code/strainfilter .

If you do not want to include the strain from hydrogens, copy this version instead:

$ cp -r /mnt/nfs/home/sgu/code/noh .


The code requires RDKit. To install it, follow the instructions at https://www.rdkit.org/docs/Install.html

On our cluster, you can instead activate my conda environment (the setup script below is the csh/tcsh one):

$ source /nfs/home/sgu/anaconda3/etc/profile.d/conda.csh
$ conda activate my-rdkit-env
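
Before running the script, it can help to confirm that the activated environment really provides RDKit. The following is a minimal sketch, not part of the Strain Filtering code; the function name `rdkit_available` is an illustrative choice.

```python
import importlib.util


def rdkit_available():
    """Return True if the current Python interpreter can locate the rdkit package."""
    return importlib.util.find_spec("rdkit") is not None


if __name__ == "__main__":
    if rdkit_available():
        # rdBase.rdkitVersion holds the installed RDKit version string.
        from rdkit import rdBase
        print("RDKit version:", rdBase.rdkitVersion)
    else:
        print("RDKit not found; activate the conda environment first.")
```

If the check reports that RDKit is missing, the conda environment is most likely not active in the current shell.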

The code currently accepts db2, db2.gz, and mol2 input files. For example:

$ python Torsion_Strain.py test1.db2.gz
$ python Torsion_Strain.py test2.mol2
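
To process many files, the same command can be repeated once per input. The sketch below is an assumption-laden helper, not part of the distributed code: it assumes the script takes one input file per invocation (as in the examples above) and that file names end in exactly `.db2`, `.db2.gz`, or `.mol2`. The names `is_supported`, `build_commands`, and `run_all` are hypothetical.

```python
import subprocess
from pathlib import Path

# Input formats named in the text above (matched case-sensitively).
SUPPORTED_SUFFIXES = (".db2", ".db2.gz", ".mol2")


def is_supported(path):
    """True if the file name ends with one of the supported input suffixes."""
    return str(path).endswith(SUPPORTED_SUFFIXES)


def build_commands(paths, script="Torsion_Strain.py"):
    """Build one 'python Torsion_Strain.py <file>' command per supported input."""
    return [["python", script, str(p)] for p in paths if is_supported(p)]


def run_all(directory="."):
    """Run the script on every supported file in a directory, one at a time."""
    for cmd in build_commands(sorted(Path(directory).iterdir())):
        # check=True stops at the first failing run instead of continuing silently.
        subprocess.run(cmd, check=True)
```

Calling `run_all()` from the directory that contains `Torsion_Strain.py` and the input files would run the script on each supported file in name order.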

The output is a CSV file containing the total strain energy and detailed information for each dihedral, sorted by torsion energy.

The most useful columns are Column 2 (total strain energy) and Column 6 (the maximum dihedral torsion energy). You can choose different thresholds on these to filter compounds, e.g.

$ awk -F"," '$2>0 && $2<6.5 && $6<1.8' test2_Torsion_Strain.csv > filtered.csv

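This example uses awk to write to filtered.csv all compounds whose total strain energy is greater than 0 and less than 6.5 and whose maximum dihedral torsion energy is less than 1.8 (so every torsion energy is below 1.8).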
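
The awk filter can also be written in Python, which makes the thresholds easier to adjust. This is a sketch under assumptions: it reads the column layout from the description above (Column 2 and Column 6, 1-based) and skips rows where those fields are missing or non-numeric, whereas awk would treat an empty field as 0. The names `passes` and `filter_csv` are illustrative.

```python
import csv


def passes(row, total_max=6.5, torsion_max=1.8):
    """Mirror the awk test: 0 < column 2 < total_max and column 6 < torsion_max."""
    try:
        total = float(row[1])        # column 2: total strain energy
        max_torsion = float(row[5])  # column 6: maximum dihedral torsion energy
    except (IndexError, ValueError):
        return False  # short or non-numeric rows (e.g. a header) are skipped
    return 0 < total < total_max and max_torsion < torsion_max


def filter_csv(in_path, out_path, total_max=6.5, torsion_max=1.8):
    """Copy rows that pass the thresholds to out_path; return how many were kept."""
    kept = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, lineterminator="\n")
        for row in csv.reader(src):
            if passes(row, total_max, torsion_max):
                writer.writerow(row)
                kept += 1
    return kept
```

For example, `filter_csv("test2_Torsion_Strain.csv", "filtered.csv")` applies the same thresholds as the awk command, and stricter or looser cutoffs can be passed as `total_max` and `torsion_max`.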