ZINC Novelty Score
Latest revision as of 04:20, 1 October 2015
The ZINC Novelty Score (ZNS) is a statistic to express how unusual a molecule is compared to what is in ZINC. It is calculated automatically following a ZINC search in the new interface. The score is calculated as follows:
ZNS = (1.0 - (Tc(ecfp4) + Tc(path))/2) * 100 %
where Tc(ecfp4) and Tc(path) are the Tanimoto coefficients of the most similar molecule in ZINC, computed with ECFP4 and path-based fingerprints respectively, as implemented in RDKit.
Thus molecules that are in ZINC have a Tc of 1.0 and a ZNS of 0%. Molecules that are related to, but different from, molecules in ZINC will have small ZNS values, and the ZNS approaches 100% as a molecule has fewer features in common with any molecule in ZINC.
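As a rough illustration of the formula, the following Python sketch computes a ZNS from two sets of fingerprint features. It is not the ZINC implementation: fingerprints are stood in for by plain sets of integer feature identifiers rather than RDKit fingerprints, the function names (tanimoto, max_tc, zns) are hypothetical, and it assumes that the most similar molecule is found independently for each fingerprint type.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


def max_tc(query: frozenset, library: list) -> float:
    """Tc of the most similar library entry (0.0 for an empty library)."""
    return max((tanimoto(query, ref) for ref in library), default=0.0)


def zns(tc_ecfp4: float, tc_path: float) -> float:
    """ZNS in percent: (1 - mean of the two best Tc values) * 100."""
    return (1.0 - (tc_ecfp4 + tc_path) / 2.0) * 100.0


# Toy feature sets standing in for ECFP4 and path-based fingerprints.
query_ecfp4 = frozenset({1, 2, 3})
library_ecfp4 = [frozenset({2, 3, 4}), frozenset({7, 8})]

query_path = frozenset({10, 11, 12, 13})
library_path = [frozenset({10, 11, 12}), frozenset({20})]

best_ecfp4 = max_tc(query_ecfp4, library_ecfp4)  # 2 shared of 4 total = 0.5
best_path = max_tc(query_path, library_path)     # 3 shared of 4 total = 0.75
score = zns(best_ecfp4, best_path)               # (1 - 0.625) * 100 = 37.5
```

A query that is itself in the library gives Tc values of 1.0 for both fingerprint types and therefore a score of 0, matching the behaviour described above.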
There are several variants:
- ZNS(target) : The novelty of the compound with respect to known (annotated) compounds for that target.
- ZNS(target-pattern) : The novelty of the compound with respect to known (annotated) compounds matching a particular target pattern.
- ZNS(*) : A special case of the above. This statistic expresses how novel the compound is compared to any compound with any ChEMBL annotation (10 µM or better).
- ZNS(): Novelty compared to all molecules in ZINC, whether they are commercially available or not.
- ZPNS(): Commercially available novelty, or ZINC Purchasable Novelty Score. How novel is this compound compared to what is on the market, as reflected in ZINC? If a molecule is commercially available, its ZPNS() is 0%. A compound that is known, and that has even been for sale in the past, may still have a high ZPNS() if nothing like it is currently on the market, as reflected in ZINC.
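One way to picture the variants is that they share the same formula and differ only in which reference compounds are considered. The sketch below illustrates this under stated assumptions: the Reference record, its fields, the function names and the target labels are all hypothetical, the Tc values to the query are taken as precomputed, and an empty reference set is treated as giving a novelty of 100%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """A reference compound with precomputed Tc values to the query."""
    name: str
    tc_ecfp4: float
    tc_path: float
    purchasable: bool
    targets: frozenset = frozenset()  # hypothetical target annotations


def novelty(refs) -> float:
    """Novelty in percent against whichever reference subset is passed in."""
    refs = list(refs)
    best_ecfp4 = max((r.tc_ecfp4 for r in refs), default=0.0)
    best_path = max((r.tc_path for r in refs), default=0.0)
    return (1.0 - (best_ecfp4 + best_path) / 2.0) * 100.0


def zns_all(refs) -> float:
    """ZNS(): every compound in the reference set, purchasable or not."""
    return novelty(refs)


def zpns(refs) -> float:
    """ZPNS(): only compounds currently flagged as purchasable."""
    return novelty(r for r in refs if r.purchasable)


def zns_target(refs, target: str) -> float:
    """ZNS(target): only compounds annotated for the given target."""
    return novelty(r for r in refs if target in r.targets)


refs = [
    # The query itself: known and annotated, but not currently for sale.
    Reference("known_not_sold", 1.0, 1.0, False, frozenset({"TARGET_A"})),
    # A loosely related compound that is on the market.
    Reference("on_market", 0.25, 0.5, True),
]

all_score = zns_all(refs)                    # 0.0: the query is in the set
market_score = zpns(refs)                    # (1 - 0.375) * 100 = 62.5
target_score = zns_target(refs, "TARGET_A")  # 0.0: annotated for this target
```

The example reproduces the case described for ZPNS(): a known compound scores 0 against all of ZINC but keeps a high purchasable novelty because nothing similar is currently for sale.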