THC

From DISI
Revision as of 23:40, 7 September 2010 by Frodo (talk | contribs)

The top hits collection (THC) links indications, targets, and molecules. It contains information ranging from experimentally confirmed to speculative. The goal is to suggest to biologists small molecules that may be useful for modulating biology. THC is a community-curated database, and you are invited to contribute. Please see THC:community participation to learn more.

How to Use

THC is designed to be simple. You may search using indications, targets or molecules. You can display linked indications, targets, or molecules. You may also simply browse the database, or pick molecules at random.

(more detail needed here)

Sources of Molecule-Target Associations

THC condenses information from a variety of sources. By default, it presents results in decreasing order of confidence. Thus compounds with measured Ki values are followed by compounds from patents (where a Ki may not have been measured or reported), then by analogs (which may not, in fact, bind), and finally by predictions. THC features two different kinds of predictions, and more may be added in the future. The first is SEA, the Similarity Ensemble Approach, a chemical informatics method that predicts targets for molecules based on their statistical resemblance to precedented bioactives. The second source of predictions is DOCK Blaster, the automatic molecular docking pipeline. Here, each prediction is substantiated by a three-dimensional model of the molecule in the binding site. The confidence in the prediction may be assessed by docking precedented bioactives, when available. We take the data sources in turn.
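The default ordering described above can be illustrated with a minimal Python sketch. The tier labels, the `Association` record, and the tie-break on Ki within the measured tier are all assumptions made for illustration; they are not THC's actual schema or ranking code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tier labels, listed in decreasing order of confidence
# as described in the text: measured Ki, patent, analog, prediction.
EVIDENCE_RANK = {"measured_ki": 0, "patent": 1, "analog": 2, "prediction": 3}


@dataclass
class Association:
    molecule: str
    target: str
    evidence: str                  # one of the keys of EVIDENCE_RANK
    ki_nm: Optional[float] = None  # Ki in nM, when one was measured


def sort_by_confidence(associations):
    """Order associations by evidence tier, most confident first.

    Within a tier, entries with a lower Ki come first (an assumption);
    entries without a Ki keep their input order, since sorted() is stable.
    """
    def key(a):
        ki = a.ki_nm if a.ki_nm is not None else float("inf")
        return (EVIDENCE_RANK[a.evidence], ki)

    return sorted(associations, key=key)


# Placeholder molecules and target, not real THC records.
hits = [
    Association("mol_C", "target_X", "prediction"),
    Association("mol_A", "target_X", "measured_ki", ki_nm=12.0),
    Association("mol_B", "target_X", "patent"),
    Association("mol_D", "target_X", "analog"),
    Association("mol_E", "target_X", "measured_ki", ki_nm=3.5),
]
ordered = sort_by_confidence(hits)
for a in ordered:
    print(a.molecule, a.evidence, a.ki_nm)
```

Keeping the tier order in a single lookup table means a new kind of prediction could be added by extending `EVIDENCE_RANK` without touching the sort itself.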

ChEMBL

This is the premier source of freely available medicinal chemistry information. We are currently using ChEMBL 05 (August 2010).

Drugbank.ca

This is another freely available source of information. It has molecule-target and molecule-indication mappings, and thus implicitly target-indication mappings.
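To make "implicitly" concrete: a target-indication link can be inferred whenever a single molecule appears in both mappings. The Python sketch below shows that join under the assumption that each mapping is available as a list of pairs; the function name and the sample records are placeholders, not DrugBank's data format or real DrugBank entries.

```python
from collections import defaultdict


def infer_target_indications(molecule_targets, molecule_indications):
    """Join (molecule, target) and (molecule, indication) pairs on molecule.

    Returns a dict mapping (target, indication) to the set of molecules
    that support the inferred link.
    """
    indications_by_molecule = defaultdict(set)
    for molecule, indication in molecule_indications:
        indications_by_molecule[molecule].add(indication)

    inferred = defaultdict(set)
    for molecule, target in molecule_targets:
        # .get() avoids creating empty entries for unknown molecules.
        for indication in indications_by_molecule.get(molecule, ()):
            inferred[(target, indication)].add(molecule)
    return dict(inferred)


# Placeholder records for illustration only.
molecule_targets = [
    ("mol_A", "target_X"),
    ("mol_B", "target_X"),
    ("mol_B", "target_Y"),
    ("mol_C", "target_Z"),  # no indication known, so no link is inferred
]
molecule_indications = [
    ("mol_A", "indication_1"),
    ("mol_B", "indication_1"),
    ("mol_B", "indication_2"),
]

links = infer_target_indications(molecule_targets, molecule_indications)
for (target, indication), molecules in sorted(links.items()):
    print(target, indication, sorted(molecules))
```

Recording the supporting molecules alongside each inferred link keeps the evidence traceable, since a link inferred this way is only as reliable as the two mappings behind it.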

bindingdb.org

Is this completely subsumed in ChEMBL? Not quite...

FDA Orange Book

EPA MDD is the SDF file we used. It contains mappings between drugs and indications.

PubChem Assay

On the one hand, there is a lot of data in here. On the other hand, the data are very noisy, perhaps so noisy as to be useless. Still, we can tell our users about these data and let them decide, by linking back to the original source.

SEA

Formerly called WINC, these are predictions made by SEA against all purchasable molecules (ZINC) based on publicly available data (ChEMBL).

DOCK Blaster

DOCK Blaster predicts small-molecule binding to protein targets. The protein models are drawn from the PDB, and later from ChEMBL. Crucially, DOCK Blaster assesses the quality of the docking based on retrospective analysis, when ligands are available. When no control ligands are available, DOCK Blaster will attempt to assess docking reliability in other ways.
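The text does not say which retrospective measure DOCK Blaster uses. As one commonly used stand-in, the Python sketch below computes the area under the ROC curve: the probability that a known ligand receives a better docking score than a decoy. The function name, the convention that lower scores are better, and the sample scores are all assumptions for illustration, not DOCK Blaster's actual procedure or output.

```python
def retrospective_auc(ligand_scores, decoy_scores):
    """Estimate how well docking separates known ligands from decoys.

    Returns the fraction of (ligand, decoy) pairs in which the ligand
    has the better score, counting ties as half. Lower (more negative)
    scores are assumed to be better. 1.0 means perfect separation and
    0.5 is what random ranking would give.
    """
    if not ligand_scores or not decoy_scores:
        raise ValueError("need at least one ligand and one decoy score")

    wins = 0.0
    for ligand in ligand_scores:
        for decoy in decoy_scores:
            if ligand < decoy:
                wins += 1.0
            elif ligand == decoy:
                wins += 0.5
    return wins / (len(ligand_scores) * len(decoy_scores))


# Made-up docking scores for illustration only.
ligand_scores = [-45.0, -38.0, -30.0]
decoy_scores = [-40.0, -25.0, -20.0, -10.0]

auc = retrospective_auc(ligand_scores, decoy_scores)
print(f"retrospective AUC: {auc:.3f}")
```

A value near 0.5 would suggest that predictions against that target deserve little confidence, which is the kind of judgement the control ligands are meant to support.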


Third-Party User Annotations

You may register at this site and upload your own data to THC. You may also comment on data that have been uploaded by other users. If at all possible, please support your data with links to publications (PMID #) or upload relevant spectra or other data to this wiki.

Sources of Indication-Target Associations

TTD

Sources of Indication-Molecule Associations

FDA Orange Book

Drugbank