ZINC Biogenic Libraries

From DISI
Latest revision as of 15:28, 11 March 2014

Biogenic and Biogenic-like libraries in ZINC.

We have created screening libraries based on molecules of biological origin. To be clear, our database of biogenic molecules includes both primary metabolites (often just called metabolites) and secondary metabolites (often called natural products). These libraries are inspired by the argument in Hert et al., NCB 2008 (http://zinc.docking.org/browse/subsets/special). For the biogenic-like libraries, we then find all compounds that are similar to these biogenic molecules. We are also inspired by the Dortmund and Broad/Harvard groups working in the areas of natural products and nature-inspired compounds.

Assembly

  • 1. All biogenic compounds from public sources. The purchasable version of this is subset 98. These are Zbc, the ZINC Biogenic compounds.
  • 2. Compounds with Tanimoto similarity of 80% or better to any biogenic compound, based on RDKit path-based fingerprints, 2048 bits.
  • 3. We fragment biogenic compounds into Murcko scaffolds and ring systems (Ertl, via Molinspiration, type 2 and 3 fragmentation). We retain only ring systems of 10 or more atoms and then accept compounds having Tanimoto similarity of 80% or better (RDKit path-based fingerprints, 2048 bits) to any biogenic 10+ atom fragment thus calculated.
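
The selection logic in steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the pipeline actually used to build the libraries. It assumes each fingerprint is represented as the set of its on-bit indices, and the function names (tanimoto, keep_large_fragments, is_biogenic_like) are hypothetical. In practice the fingerprints would come from RDKit's path-based fingerprinter at 2048 bits, and the fragments from Molinspiration.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Assumption: a fingerprint is stored as the set of indices of its on bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on bits / on bits in either fingerprint."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def keep_large_fragments(
    fragments: Iterable[Tuple[int, Fingerprint]], min_atoms: int = 10
) -> List[Fingerprint]:
    """Keep fingerprints of fragments with at least min_atoms atoms (step 3).

    Each input item is an (atom_count, fingerprint) pair.
    """
    return [fp for n_atoms, fp in fragments if n_atoms >= min_atoms]


def is_biogenic_like(
    candidate: Fingerprint,
    references: Iterable[Fingerprint],
    threshold: float = 0.8,
) -> bool:
    """True if the candidate reaches the threshold against any reference.

    The references may be whole biogenic compounds (step 2) or retained
    biogenic fragments (step 3).
    """
    return any(tanimoto(candidate, ref) >= threshold for ref in references)
```

A candidate would be accepted if is_biogenic_like returns True against either the biogenic compounds or the retained 10+ atom fragments.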

Results

Subsets are organized into lead-like, fragment-like, drug-like, all, and shard-like subsets as usual, for both biogenic and biogenic-like compounds. These are called Zbc (ZINC Biogenic compounds) and Zni (ZINC Nature Inspired). We made these names deliberately different for clarity. Zbc compounds are produced by nature, and nature has been seeing them for evolutionary time. Zni compounds include both compounds from nature and synthetic compounds that look natural when you have your Tanimoto 80% glasses on.

Inspiration

Hert, Dortmund Group, Broad/Harvard Group.

Argument

The argument is that nature-like biased screening libraries should provide far richer and denser hits than one would expect by screening synthetic compounds alone. Indeed, the only reason HTS and virtual screening work as well as they do is that they are already heavily biased towards biogenic-like molecules. These libraries may also be good for protein function identification via docking, because if a site recognizes a molecule from nature, then perhaps that molecule or a similar one is the endogenous ligand for that site.