Latest revision as of 18:39, 14 March 2022

Written by Jennifer Young on April 14, 2020. Updated by Khanh Tang on March 14, 2022

ZINC21-Tools

These scripts perform fine tranching with RDKit: for each molecule they compute the heavy atom count and logP, then place the molecule in a bucket named HxxPyyy for positive logP (i.e. 0 < logP) or HxxMyyy for negative logP (i.e. logP < 0).

See the GitHub repository: https://github.com/docking-org/ZINC21-Tools
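The bucketing described above can be sketched as follows. This is a minimal illustration, not the code in rdkit_hlogp_batch_mp_2.py. It assumes that xx is the zero-padded heavy atom count, that yyy is |logP| × 100 rounded to an integer, that logP is RDKit's Crippen estimate, and that a logP of exactly 0 goes in the P bucket. None of these details are stated on this page, and the function names are hypothetical.

```python
def tranche_name(heavy_atoms: int, logp: float) -> str:
    """Build a bucket name such as H25P350 from precomputed values.

    Assumptions (not confirmed by the source): xx is the zero-padded heavy
    atom count, yyy is |logP| * 100 rounded to an integer, and logP == 0
    is placed in the P bucket.
    """
    sign = "M" if logp < 0 else "P"
    scaled = int(round(abs(logp) * 100))
    return f"H{heavy_atoms:02d}{sign}{scaled:03d}"


def tranche_for_smiles(smiles: str) -> str:
    """Compute the bucket name for one SMILES string with RDKit."""
    # Imported here so tranche_name() stays usable without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import Crippen

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    # Assumption: the Crippen logP estimate is the logP being bucketed.
    return tranche_name(mol.GetNumHeavyAtoms(), Crippen.MolLogP(mol))
```

Under these assumptions, a molecule with 25 heavy atoms and a logP of 3.5 would land in H25P350, and one with 4 heavy atoms and a logP of -0.25 in H04M025.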

How to run

(If you are using our cluster) Source conda environment for RDKit

If you are using our cluster, a conda environment with RDKit is already available; you only need to source it with the following commands. You must use bash.

    bash

    source /mnt/nfs/home/devtest/anaconda3/bin/activate my-rdkit-env

If you need to create a conda environment, follow the instructions at https://rdkit.org/docs/Install.html

Read the section "How to install RDKit with Conda". Once you have created the environment, activate it and install tqdm:

    conda activate my-rdkit-env

    conda install -c conda-forge tqdm

You are then ready to run the Python script.

Run Python script with the desired arguments

The input SMILES file should have the following two columns:

smiles
ID

    python rdkit_hlogp_batch_mp_2.py <smiles_file>
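Before running the script on a large file, it may help to confirm that every line has the two expected columns. The sketch below is a hypothetical helper, not part of ZINC21-Tools, and it assumes the columns are separated by whitespace, which this page does not specify.

```python
def find_bad_lines(lines):
    """Return (line_number, text) for lines that do not have exactly 2 columns.

    Assumption: columns are whitespace-separated (SMILES first, then ID).
    Blank lines are ignored.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue
        if len(text.split()) != 2:
            bad.append((number, text))
    return bad
```

To check a file, pass an open file handle, for example `find_bad_lines(open(path))`, where `path` is your input SMILES file.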

The output file is named <smiles_file>_hlogp and has the following three columns:

original SMILES
original ID
tranche name (HxxPyyy or HxxMyyy)
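As a quick check on the output, the three-column file described above can be summarized by counting molecules per tranche. This is a minimal sketch with a hypothetical helper name. It assumes whitespace-separated columns with the tranche name in the third column, and it skips any line that does not have exactly three fields.

```python
from collections import Counter


def count_tranches(lines):
    """Count molecules per tranche name in <smiles_file>_hlogp-style lines.

    Assumption: each line is 'smiles id tranche', separated by whitespace.
    Lines without exactly three fields are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        counts[fields[2]] += 1
    return counts
```

Calling `count_tranches(open(path))` on an output file and then `.most_common(10)` on the result lists the ten most populated tranches.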

@@ Line 1: / Line 1: @@
-Written by Jennifer Young on April 14, 2020
+Written by Jennifer Young on April 14, 2020. Updated by Khanh Tang on March 14, 2022
-=Introduction=
+=Introduction https://github.com/docking-org/ZINC21-Tools=
-These scripts perform fine tranching with RDKit to compute the heavy atom count and logP for each molecule and put it in a bucket of the form HxxPyyy for positive valued logp (i.e. 0 < logp) and HxxMyyy for negative valued logp (i.e. logp < 0).  The scripts are located in
+These scripts perform fine tranching with RDKit to compute the heavy atom count and logP for each molecule and put it in a bucket of the form HxxPyyy for positive valued logp (i.e. 0 < logp) and HxxMyyy for negative valued logp (i.e. logp < 0).
-    /nfs/home/jyoung/code/fine_tranche_hlogp_scripts
+See github repo https://github.com/docking-org/ZINC21-Tools
 =How to run=
-==Create or source conda environment for RDKit==
+==(If you are using our cluster) Source conda environment for RDKit==
 If you are using our cluster, there is already a conda environment with RDKit available and you just need to source it using the following command.  You need to use bash.
       bash
@@ Line 13: / Line 14: @@
      source /mnt/nfs/home/devtest/anaconda3/bin/activate my-rdkit-env
-If you need to create a conda environment, follow the instructions at https://rdkit.org/docs/Install.html
+==If you need to create a conda environment, follow the instructions at https://rdkit.org/docs/Install.html==
 Read the section : How to install RDKit with Conda.  Once you do
      conda activate my-rdkit-env
+    conda install -c conda-forge tqdm
 You are ready to run the Python script.
 ==Run Python script with the desired arguments==
-The smiles file and batch size are command line arguments.  If you choose a batch size of 10,000, the output file will be written to after each batch of 10,000 molecules is processed.
+The input smiles file should have the following 2 columns
-     python /nfs/home/jyoung/code/fine_tranche_hlogp_scripts/rdkit_hlogp_batch.py <smiles_file> <batch_size>
+*smiles
+*ID
-==Sample Bash script for running on many smiles files==
+     python rdkit_hlogp_batch_mp_2.py <smiles>
-If your smiles file is large, split into chunks of 1 million (or whatever your desired size).
-    split -l 1000000 <your_smiles>
-Then run the following script which is reproduced below.
-    /nfs/home/jyoung/code/fine_tranche_hlogp_scripts/runall.sh
-Change the x?? to the desired pattern and change the batch size to the desired value.
-    #!/usr/bin/env bash
+The output file will be a file with the name <smiles_file>_hlogp and will have the following 3 columns
-    for i in x??;
+* original smiles
-    do
+* original ID
-       source /mnt/nfs/home/devtest/anaconda3/bin/activate my-rdkit-env
+* HxxPyyy HxxMyyy
-       python /nfs/home/jyoung/code/fine_tranche_hlogp_scripts/rdkit_hlogp_batch.py $i 10000
+[[ Category:ZINC22 ]]
-    done

ZINC22:Fine Tranching with RDKit using Heavy Atom Count and LogP: Difference between revisions
