Calculate NPR values & Generate Heatmap


Revision as of 07:29, 5 November 2020

Calculate NPR

Setup Python environment

- Download the Anaconda3 installer and install it by following the instructions (https://www.anaconda.com/products/individual)

- Create an Anaconda environment and install the packages

(base)$ conda create -c rdkit --name npr-py3 rdkit
(base)$ conda activate npr-py3
(npr-py3)$ conda update ipython
# Install jupyter notebook 
(npr-py3)$ conda install -c conda-forge notebook
# Install vaex - dataframe library for huge libraries
(npr-py3)$ conda install -c conda-forge vaex 

Run NPR calculation

Your SMILES file should have no header and use this format: <smiles> <cid>
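Before starting a long calculation, it can help to confirm that every line has the expected two fields. The following is a minimal sketch and not part of the original scripts. It assumes the two columns are separated by whitespace (the page only specifies the column order), and `check_smiles_file` is a hypothetical helper name.

```python
from pathlib import Path


def check_smiles_file(path):
    """Return (line_number, line) pairs for lines that are not '<smiles> <cid>'.

    Assumption: the two fields are whitespace-separated. Only the field
    count is checked; the SMILES string itself is not parsed.
    """
    bad_lines = []
    with Path(path).open() as handle:
        for line_number, line in enumerate(handle, start=1):
            stripped = line.strip()
            if not stripped:
                continue  # blank lines are ignored
            if len(stripped.split()) != 2:
                bad_lines.append((line_number, stripped))
    return bad_lines
```

An empty list means every non-blank line has exactly two fields. This check cannot detect a header row that happens to have two columns, so that still needs a manual look.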

(npr-py3)$ python extra_newprops.py {smiles_file}

Notes:

- The script outputs both the molecules that succeeded and those that failed.

- The calculation may be slow. It is recommended that you split the file into chunks and run them in parallel.
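One way to chunk the input file is sketched below. This is an illustration under stated assumptions, not the authors' tooling: `split_smiles_file` and the `_chunkNNNN` naming scheme are hypothetical, and the sketch only splits the file by line count.

```python
from pathlib import Path


def split_smiles_file(path, lines_per_chunk, out_dir):
    """Split a SMILES file into numbered chunk files and return their paths."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    source = Path(path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunk_paths = []
    chunk_handle = None
    try:
        with source.open() as handle:
            for index, line in enumerate(handle):
                # Start a new chunk file every `lines_per_chunk` lines.
                if index % lines_per_chunk == 0:
                    if chunk_handle is not None:
                        chunk_handle.close()
                    name = f"{source.stem}_chunk{len(chunk_paths):04d}{source.suffix}"
                    chunk_path = out_dir / name
                    chunk_handle = chunk_path.open("w")
                    chunk_paths.append(chunk_path)
                chunk_handle.write(line)
    finally:
        if chunk_handle is not None:
            chunk_handle.close()
    return chunk_paths
```

Each chunk can then be passed to extra_newprops.py in its own process or cluster job. How the per-chunk outputs are named and merged depends on that script and is not covered by this sketch.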

Make Heatmap

Generate h5py binary file

(npr-py3)$ python py_csv2hdf5.py {output_smiles_file}

This script outputs an h5py (HDF5) file that can then be read by the vaex library, which is useful for reading a huge library into a dataframe.

Plot

# Run Jupyter-Notebook
(npr-py3)$ jupyter-notebook

From the Jupyter Notebook interface:

- Select 'single_plot.ipynb'

- Change the path to point to your h5py file and run the kernel