Revision as of 22:29, 17 April 2020

Written by Jennifer Young on April 14, 2020

Introduction

These scripts perform fine tranching with RDKit: they compute the heavy atom count and logP of each molecule and assign the molecule to a bucket of the form HxxPyyy for positive logP (i.e. 0 < logP) or HxxMyyy for negative logP (i.e. logP < 0).
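The bucket assignment described above can be sketched as follows. This is a minimal illustration, not the code from the repository: the function names `tranche_name` and `tranche_for_smiles` are hypothetical, and the layout of the label is an assumption (xx as the zero-padded heavy atom count, yyy as |logP| multiplied by 100 and rounded, and logP equal to 0 placed in the P bucket, which the text above does not specify). The use of RDKit's Crippen logP is also an assumption about which logP estimate is meant.

```python
def tranche_name(heavy_atoms: int, logp: float) -> str:
    """Format a bucket label such as H17P200 (assumed layout, see text)."""
    # Assumption: logP == 0 is placed in the P bucket.
    sign = "M" if logp < 0 else "P"
    # Assumption: yyy is |logP| * 100, rounded to the nearest integer.
    scaled = round(abs(logp) * 100)
    return f"H{heavy_atoms:02d}{sign}{scaled:03d}"


def tranche_for_smiles(smiles: str) -> str:
    """Compute the bucket label for one SMILES string with RDKit."""
    # RDKit is imported here so tranche_name stays usable without it.
    from rdkit import Chem
    from rdkit.Chem import Crippen

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    # Heavy atom count and Crippen logP estimate for the molecule.
    return tranche_name(mol.GetNumHeavyAtoms(), Crippen.MolLogP(mol))
```

Keeping the label formatting separate from the RDKit calls makes it easy to check the naming rule on its own, for example with `tranche_name(17, 2.0)`.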

See the GitHub repository: https://github.com/docking-org/ZINC21-Tools

How to run

(If you are using our cluster) Source conda environment for RDKit

If you are using our cluster, a conda environment with RDKit is already available, and you only need to activate it with the following commands. You must use bash.

    bash

    source /mnt/nfs/home/devtest/anaconda3/bin/activate my-rdkit-env

If you need to create a conda environment, follow the instructions at https://rdkit.org/docs/Install.html

Read the section "How to install RDKit with Conda". Once the environment exists, activate it and install tqdm:

    conda activate my-rdkit-env

    conda install -c conda-forge tqdm

You are then ready to run the Python script.
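Before running the script, it can help to confirm that the active environment actually provides both packages. The sketch below is a hypothetical helper (the name `missing_packages` is not part of the repository) that uses only the Python standard library to report which of the named packages cannot be found.

```python
import importlib.util


def missing_packages(names):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # rdkit and tqdm are the two packages installed in the steps above.
    missing = missing_packages(["rdkit", "tqdm"])
    print("missing:", ", ".join(missing) if missing else "none")
```

If either name is reported as missing, the conda environment is probably not activated in the current shell.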

Run Python script with the desired arguments

The input SMILES file should have the following 2 columns

smiles
ID

Run the script with the SMILES file as its argument:

    python rdkit_hlogp_batch_mp.py <smiles_file>

The output file is named <smiles_file>_hlogp and has the following 3 columns

original smiles
original ID
bucket (HxxPyyy or HxxMyyy)
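As a quick check on the result, the number of molecules per bucket can be tallied from the third column of the output file. The sketch below assumes the columns are separated by whitespace, which the description above does not state. The function name `count_tranches` is hypothetical.

```python
from collections import Counter


def count_tranches(lines):
    """Count molecules per bucket from lines of '<smiles> <ID> <bucket>'."""
    counts = Counter()
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        counts[fields[2]] += 1
    return counts
```

To use it, open the output file and pass the file object directly, for example `count_tranches(open(path))`, where `path` points to your own <smiles_file>_hlogp file.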

@@ Line 2: / Line 2: @@
 =Introduction=
-These scripts perform fine tranching with RDKit to compute the heavy atom count and logP for each molecule and put it in a bucket of the form HxxPyyy for positive valued logp (i.e. 0 < logp) and HxxMyyy for negative valued logp (i.e. logp < 0).  The scripts are located in
+These scripts perform fine tranching with RDKit to compute the heavy atom count and logP for each molecule and put it in a bucket of the form HxxPyyy for positive valued logp (i.e. 0 < logp) and HxxMyyy for negative valued logp (i.e. logp < 0).
-    /nfs/home/jyoung/code/fine_tranche_hlogp_scripts
+See github repo https://github.com/docking-org/ZINC21-Tools
 =How to run=
@@ Line 16: / Line 17: @@
 Read the section : How to install RDKit with Conda.  Once you do
      conda activate my-rdkit-env
+    conda install -c conda-forge tqdm
 You are ready to run the Python script.
 ==Run Python script with the desired arguments==
-The smiles file and batch size are command line arguments.  If you choose a batch size of 10,000, the output file will be written to after each batch of 10,000 molecules is processed.
 The input smiles file should have the following 2 columns
 *smiles
 *ID
-See python script http://wiki.docking.org/index.php/Rdkit_hlogp_batch.py for reference
+     python rdkit_hlogp_batch_mp.py <smiles>
-     python /nfs/home/jyoung/code/fine_tranche_hlogp_scripts/rdkit_hlogp_batch.py <smiles_file> <batch_size>
 The output file will be a file with the name <smiles_file>_hlogp and will have the following 3 columns
@@ Line 33: / Line 33: @@
 * original ID
 * HxxPyyy HxxMyyy
-=Sample Bash script for running on many smiles files=
-If your smiles file is large, split into chunks of 1 million (or whatever your desired size).
-    split -l 1000000 <your_smiles>
-Then run the following script which is reproduced below.
-    /nfs/home/jyoung/code/fine_tranche_hlogp_scripts/[[runall.sh]]
-Change the x?? to the desired pattern and change the batch size to the desired value.
-    #!/usr/bin/env bash
-    for i in x??;
-    do
-       source /mnt/nfs/home/devtest/anaconda3/bin/activate my-rdkit-env
-       python /nfs/home/jyoung/code/fine_tranche_hlogp_scripts/[[rdkit_hlogp_batch.py]] $i 10000
-    done

ZINC22:Fine Tranching with RDKit using Heavy Atom Count and LogP: Difference between revisions
