Interactive ligands visualizer

From DISI
Revision as of 00:43, 20 January 2023

I (Olivier) put together this interactive visualizer to make sure I don't miss any chemotypes when coming up with actives at the start of a retrospective campaign. Starting from a ChEMBL CSV file downloaded for a list of ligands, an image of each molecule is generated with RDKit, along with a text file of filtered SMILES. You then need to compute ECFP fingerprints on Gimel from that file (see below). A generated script then shows an interactive visualization (t-SNE) of the chemical space spanned by the ligands, with each molecule displayed on mouse hover.

Chemspace vis example1.png


Step 1: install chemspace_vis package

Make sure you are using Python 3, then run:

pip install chemspace_vis

N.B. This only works on Mac and Linux, sorry Windows users (if you exist).
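
To check both requirements before installing, a small sketch using only the standard library could look like this. The helper name environment_supported is hypothetical and not part of chemspace_vis.

```python
import platform
import sys


def environment_supported(version_info=None, system=None):
    """Return True for Python 3 on macOS ("Darwin") or Linux.

    Hypothetical helper, not part of chemspace_vis. Arguments default to
    the running interpreter and OS; pass values explicitly to test other cases.
    """
    major = (version_info or sys.version_info)[0]
    system = system or platform.system()
    return major >= 3 and system in {"Darwin", "Linux"}


if __name__ == "__main__":
    print("OK to install" if environment_supported() else "Unsupported setup")
```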


Step 2: obtain ChEMBL CSV file (or use provided example)

Any ChEMBL CSV export covering a single activity type for a single target will do.
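
Before preprocessing, it can help to check which activity types a CSV actually contains. The sketch below uses only the standard library. It assumes a semicolon-delimited file with columns named "Smiles" and "Standard Type"; check the header of your own download, since these names are assumptions. The sample rows are made up for illustration, and summarize_activities is a hypothetical helper, not part of chemspace_vis.

```python
import csv
import io
from collections import Counter

# Toy stand-in for a ChEMBL export; the rows are made up for illustration.
# ASSUMPTION: ';' delimiter and the column names "Smiles" / "Standard Type".
SAMPLE = '''"Molecule ID";"Smiles";"Standard Type";"Standard Value"
"MOL_A";"CCO";"IC50";"120"
"MOL_B";"c1ccccc1";"IC50";"45"
"MOL_C";"";"IC50";"300"
"MOL_D";"CCN";"Ki";"10"
'''


def summarize_activities(handle, delimiter=";", smiles_col="Smiles",
                         type_col="Standard Type"):
    """Count rows per activity type, skipping rows without a SMILES string."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter=delimiter):
        if (row.get(smiles_col) or "").strip():
            counts[row.get(type_col, "")] += 1
    return counts


counts = summarize_activities(io.StringIO(SAMPLE))
print(dict(counts))
```

To run it on a real file, pass an open file handle instead of the StringIO object, for example one opened with open("c5a_ic50_chembl.csv", newline="").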

You can also clone the example repository, which contains the CSV for .... and an example script:

git clone https://github.com/gregorpatof/chemspace_vis_example


Step 3: extract SMILES and activity for given heavy atom count (HAC) and molecular weight (MW) filters

This is done by the preprocess_part1() function in the example script, which makes a single call:

from chemspace_vis.preprocess import preprocess_chembl

chembl_csv = "c5a_ic50_chembl.csv"

activity_name = "IC50" # The text name of the activity (in this case, IC50)
preprocess_chembl(chembl_csv, activity_name, max_hac=35, max_mw=600, img_folder="mol_images")
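
To make the two thresholds concrete, here is a sketch of the kind of filter that max_hac and max_mw suggest. It is not the package's implementation: the Ligand record, the passes_filters helper, and the choice of inclusive upper bounds are all assumptions, and the property values are taken as already computed (for instance with RDKit).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ligand:
    name: str
    hac: int    # heavy (non-hydrogen) atom count
    mw: float   # molecular weight in g/mol


def passes_filters(lig, max_hac=35, max_mw=600.0):
    # ASSUMPTION: both bounds are inclusive; the real package may differ.
    return lig.hac <= max_hac and lig.mw <= max_mw


ligands = [
    Ligand("ethanol", 3, 46.07),
    Ligand("benzene", 6, 78.11),
    Ligand("cyclosporin A", 85, 1202.6),
]
# Keep only the ligands within both limits.
kept = [lig.name for lig in ligands if passes_filters(lig)]
print(kept)
```

With the defaults shown in the call above (max_hac=35, max_mw=600), small molecules such as ethanol and benzene pass, while a large macrocycle such as cyclosporin A is dropped on both criteria.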