UCSF Strain Filtering: Difference between revisions

No edit summary
(Undo revision 12260 by Shuogu (talk))
Line 1: Line 1:
This is Strain Filtering version 1.1. Please copy the code to your current directory.
This is the 1st version of UCSF strain filtering. Please copy the code to your current directory.
  $ cp -r /mnt/nfs/home/sgu/code/ucsf .
  $ cp -r /mnt/nfs/home/sgu/code/ucsf .


To run the code, you need RDKit. You can follow the instruction to install RDKit: https://www.rdkit.org/docs/Install.html
To run the code, you need RDKit. You can follow the instruction to install RDKit: https://www.rdkit.org/docs/Install.html
On our cluster, you only need to type:
  $ conda create -c rdkit -n my-rdkit-env rdkit
  $ conda create -c rdkit -n my-rdkit-env rdkit
  $ conda activate my-rdkit-env
  $ conda activate my-rdkit-env
Line 10: Line 11:
  $ python3 Torsion_Strain.py test2.mol2
  $ python3 Torsion_Strain.py test2.mol2


The output is a csv file, containing the total strain energy and detailed information of each dihedral sorted by its torsion energy.
The output is a csv file, containing the strain energy and detailed information for each compound in mol2 or each conformation in db2.
 
You may be most interested in the first two columns of the output, otherwise please refer to the detailed comments in the Torsion_Strain.py.
You may be interested in column 2 (total strain energy) and column 11 (the maximum dihedral torsion energy), from which you can choose different levels to filter compounds, e.g.
$ csvcut -c 1,2,11 test2_Torsion_Strain.csv | awk -F"," '$2<6.5&&$3<1.8 {print $1" "$2" "$3}'
This example uses csvkit and awk to output all the compounds with total strain energy <6.5 and every torsion energy <1.8.

Revision as of 04:37, 12 February 2020

This is the first version of the UCSF strain filtering code. Copy the code to your current directory:

$ cp -r /mnt/nfs/home/sgu/code/ucsf .

The code requires RDKit. To install it, follow the instructions at https://www.rdkit.org/docs/Install.html. On our cluster, you only need to type:

$ conda create -c rdkit -n my-rdkit-env rdkit
$ conda activate my-rdkit-env
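
Before running the strain code, it can help to confirm that RDKit is importable from the active environment. The following is a minimal sketch, not part of the distributed code; the function name `rdkit_version` is hypothetical.

```python
import importlib


def rdkit_version():
    """Return the installed RDKit version string, or None if RDKit cannot be imported.

    Hypothetical helper for illustration; it is not part of Torsion_Strain.py.
    """
    try:
        rdkit = importlib.import_module("rdkit")
    except ImportError:
        return None
    # Fall back to "unknown" if the package does not expose a version attribute.
    return getattr(rdkit, "__version__", "unknown")


if __name__ == "__main__":
    version = rdkit_version()
    if version is None:
        print("RDKit not found; activate the conda environment first.")
    else:
        print(f"RDKit {version} is available.")
```

If the script reports that RDKit is missing, the conda environment is probably not activated in the current shell.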

Run the code with python3. It currently accepts db2, db2.gz, and mol2 inputs. For example:

$ python3 Torsion_Strain.py test1.db2.gz
$ python3 Torsion_Strain.py test2.mol2
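
To process several input files in one go, the same command can be issued once per file. The sketch below assumes that Torsion_Strain.py takes exactly one input path per invocation, as in the examples above; the helper `build_commands` and the suffix list are illustrative and not part of the distributed code.

```python
import subprocess
import sys
from pathlib import Path

# Input types the page says the code can handle.
SUPPORTED_SUFFIXES = (".db2", ".db2.gz", ".mol2")


def build_commands(paths, script="Torsion_Strain.py"):
    """Return one python3 command (as an argument list) per supported input file.

    Files with other extensions are skipped. Hypothetical helper for illustration.
    """
    commands = []
    for path in paths:
        if Path(path).name.endswith(SUPPORTED_SUFFIXES):
            commands.append(["python3", script, str(path)])
    return commands


if __name__ == "__main__":
    # Usage (assumed): python3 run_all.py test1.db2.gz test2.mol2 ...
    for command in build_commands(sys.argv[1:]):
        subprocess.run(command, check=True)  # stop at the first failing file
```

Passing `check=True` makes the loop stop at the first input that fails instead of silently continuing.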

The output is a CSV file containing the strain energy and detailed information for each compound in a mol2 file or each conformation in a db2 file. The first two columns are likely to be the most useful; for the remaining columns, see the detailed comments in Torsion_Strain.py.
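
One way to use the first two columns is to keep only the rows whose energy falls below a chosen cutoff. The sketch below assumes that column 1 holds the compound or conformation identifier and column 2 holds a numeric strain energy; check this against the comments in Torsion_Strain.py before relying on it. The function name, the cutoff of 6.5, and the sample rows are made up for illustration.

```python
import csv
import io


def filter_by_strain(rows, max_total):
    """Yield (identifier, energy) for rows whose second column is below max_total.

    Assumes column 1 is an identifier and column 2 is a numeric strain energy.
    Rows that are too short or whose second column is not a number are skipped.
    """
    for row in rows:
        if len(row) < 2:
            continue
        try:
            energy = float(row[1])
        except ValueError:
            continue  # header line or non-numeric entry
        if energy < max_total:
            yield row[0], energy


if __name__ == "__main__":
    # Made-up rows standing in for the real CSV output.
    sample = io.StringIO("mol_A,3.2,extra\nmol_B,9.7,extra\nmol_C,NA,extra\n")
    for name, energy in filter_by_strain(csv.reader(sample), max_total=6.5):
        print(name, energy)
```

To filter a real output file, open it with `open(path, newline="")` and pass `csv.reader(handle)` in place of the sample rows.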