Chembl2pdb


CURRENT DATA

Updated 02/24/2011

The current data mapping ChEMBL09 protein targets to structures in the PDB can be found at:

/raid3/people/mysinger/pxc/pdb_to_chembl/chembl09

There are 3 subfolders:

 - uniprot: entries organized by target UniProt ID
        
 - pdb_ligand: all PDB codes that have a bound ligand (as defined by the be_blasti.csh script from DOCKBlaster),
                   with the corresponding activity data from ChEMBL (actives.smi)
         
 - pdb_other: all PDB codes that do NOT have a bound crystal ligand (as defined by the be_blasti.csh script from DOCKBlaster),
                     with the corresponding actives from ChEMBL (actives.smi)

To get basic statistics, such as how many PDB codes or how many targets have ChEMBL ligands, simply count the number of subfolders in the corresponding folder.

 e.g.: How many UniProt targets have ChEMBL ligands?
       % cd uniprot/
       % ls | wc -l
        
 e.g.: How many PDB structures have ChEMBL actives and a bound crystal ligand?
       % cd pdb_ligand/
       % ls -d ???? | wc -l
 
 e.g.: How many PDB structures have ChEMBL actives BUT NO bound crystal ligand?
       % cd pdb_other/
       % ls -d ???? | wc -l
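
The three counts above can also be collected in one pass. The following is a minimal sketch, not part of the original tooling: the function names count_subdirs and report_counts are hypothetical, and it assumes each target or PDB code is stored as one subfolder directly under uniprot, pdb_ligand, and pdb_other.

```shell
#!/bin/sh
# Count the immediate subdirectories of the directory given as $1.
# Assumption: every target / PDB code is one subfolder.
count_subdirs() {
    n=0
    for d in "$1"/*/; do
        # If the glob matches nothing it stays literal and fails the -d test.
        [ -d "$d" ] && n=$((n + 1))
    done
    echo "$n"
}

# Print one tab-separated line per expected subfolder of the base directory $1.
report_counts() {
    base=$1
    for sub in uniprot pdb_ligand pdb_other; do
        if [ -d "$base/$sub" ]; then
            printf '%s\t%s\n' "$sub" "$(count_subdirs "$base/$sub")"
        else
            printf '%s\tmissing\n' "$sub"
        fi
    done
}
```

After sourcing the functions in a Bourne-compatible shell, `report_counts /raid3/people/mysinger/pxc/pdb_to_chembl/chembl09` would print the three counts for the current data.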

GENERATION PROCEDURE

In the future, to generate the data again, do the following:

  • Step I: Load the new ChEMBL SQL database into zincdb1 (do this only if there is a new ChEMBL release)
  • Step II: Make a new directory, run the script with the new SQL database name as its argument, and wait a day or two for it to finish
         mkdir chembl10
         cd chembl10
         /raid3/people/mysinger/pxc/pdb_to_chembl/generate_chembl_map.csh chembl10
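
Because the run takes a day or two, it may help to start it in the background so that it survives a logout. The wrapper below is only a sketch under assumptions: the function name run_chembl_map and the file names generate.log and generate.pid are hypothetical and not part of the original procedure. It performs the same mkdir, cd, and script call as Step II.

```shell
#!/bin/sh
# Start the mapping script for a release (e.g. chembl10) in the background.
# $1: release / SQL database name, also used as the new directory name
# $2: optional path to the generation script (defaults to the path used above)
run_chembl_map() {
    release=$1
    script=${2:-/raid3/people/mysinger/pxc/pdb_to_chembl/generate_chembl_map.csh}

    # Refuse to reuse an existing directory so earlier results are not overwritten.
    if [ -e "$release" ]; then
        echo "error: $release already exists" >&2
        return 1
    fi
    mkdir "$release" || return 1

    (
        cd "$release" || exit 1
        # generate.log and generate.pid are hypothetical names chosen for this sketch.
        nohup "$script" "$release" > generate.log 2>&1 &
        echo $! > generate.pid
    )
}
```

Calling `run_chembl_map chembl10` from the parent directory would create chembl10, launch the script inside it, and leave the process ID in chembl10/generate.pid for later checks.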