THC
The top hits collection (THC) links indications, targets, and molecules. It contains information ranging from experimentally confirmed to speculative. The goal is to provide biologists with suggestions of small molecules that may be useful to modulate biology. THC is a community-curated database, and you are invited to contribute. Please see THC:community participation to learn more.
Sources of Molecule-Target Associations
THC condenses information from a variety of sources. By default, it presents results in decreasing order of confidence. Thus compounds with measured Ki values come first, followed by compounds from patents (where a Ki may not have been measured or reported), followed by analogs (which may not, in fact, bind), followed finally by predictions. THC features two different kinds of predictions, and more may be added in the future. The first is SEA, the Similarity Ensemble Approach, a chemical informatics method that predicts targets for molecules based on their statistical resemblance to precedented bioactives. The second source of predictions is DOCK Blaster, the automatic molecular docking pipeline. Here, each prediction is substantiated by a three-dimensional model of the molecule in the binding site. Confidence in a prediction may be assessed by docking precedented bioactives, when available. We take the data sources in turn.
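The default ordering described above can be illustrated with a small sketch. This is not THC's actual implementation: the tier names, the `Association` record, and the `by_decreasing_confidence` function are hypothetical, and breaking ties within a tier by lower Ki is our own assumption rather than something stated here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical evidence tiers, most confident first, following the order
# given in the text: measured Ki, patent, analog, prediction.
TIER_RANK = {"measured_ki": 0, "patent": 1, "analog": 2, "prediction": 3}


@dataclass
class Association:
    molecule: str
    target: str
    evidence: str                  # one of the TIER_RANK keys
    ki_nm: Optional[float] = None  # measured Ki in nM, if one exists


def by_decreasing_confidence(records: List[Association]) -> List[Association]:
    """Sort by evidence tier; within a tier, lower Ki first (an assumption)."""
    def key(record: Association):
        # Records without a Ki sort after those with one inside the same tier.
        ki = record.ki_nm if record.ki_nm is not None else float("inf")
        return (TIER_RANK[record.evidence], ki)
    return sorted(records, key=key)


# Placeholder identifiers only; these are not real THC entries.
hits = [
    Association("mol_C", "target_X", "prediction"),
    Association("mol_A", "target_X", "measured_ki", ki_nm=12.0),
    Association("mol_D", "target_X", "analog"),
    Association("mol_B", "target_X", "patent"),
    Association("mol_E", "target_X", "measured_ki", ki_nm=3.5),
]
ordered = by_decreasing_confidence(hits)
```

Sorting on a (tier, Ki) tuple keeps the four tiers strictly separated, so a prediction can never outrank a measured value.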
ChEMBL
This is the premier source of freely available medicinal chemistry information. We are currently using ChEMBL 05 (August 2010).
Drugbank.ca
This is another freely available source of information. It has molecule-target and molecule-indication mappings, and thus implicitly target-indication mappings.
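To make the "implicit" target-indication mapping concrete, here is a minimal sketch of the join on the shared molecule. The function name and the input layout (lists of pairs) are assumptions made for illustration, not the Drugbank file format, and the identifiers are placeholders.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple


def implied_target_indications(
    molecule_targets: Iterable[Tuple[str, str]],
    molecule_indications: Iterable[Tuple[str, str]],
) -> Set[Tuple[str, str]]:
    """Pair each target with every indication that shares a molecule with it."""
    indications_by_molecule = defaultdict(set)
    for molecule, indication in molecule_indications:
        indications_by_molecule[molecule].add(indication)

    pairs = set()
    for molecule, target in molecule_targets:
        # Molecules with no known indication contribute nothing.
        for indication in indications_by_molecule.get(molecule, ()):
            pairs.add((target, indication))
    return pairs


# Placeholder data only.
mol_targets = [("mol_A", "target_X"), ("mol_A", "target_Y"), ("mol_B", "target_X")]
mol_indications = [("mol_A", "indication_1"), ("mol_C", "indication_2")]
implied = implied_target_indications(mol_targets, mol_indications)
```

A pair derived this way is only as reliable as the two mappings behind it, so it is a suggestion rather than a confirmed target-indication link.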
bindingdb.org
Is this completely subsumed in ChEMBL? Not quite...
FDA Orange Book
EPA MDD is the SDF file we used. It contains mappings between drugs and indications.
PubChem Assay
On the one hand, there is a lot of data here. On the other hand, the data are very noisy, perhaps so noisy as to be useless. Still, we can tell users about these data and let them decide for themselves by linking back to the original source.
SEA
Formerly called WINC, these are predictions made by SEA against all purchasable molecules (ZINC) based on publicly available data (ChEMBL).