Oracle scraps

From DISI

Trigger for calculating fields upon structure insert


Basic BIOACTIVITY call based on current ZINC infrastructure (example):
 select ID_STRUC_FK from BIOACTIVITY where TARGET_NAME='Anandamide amidohydrolase' AND ACTIVITY_DESC='Binding, 1uM';
 *Free flag as well?*
 AND FREE_FLAG=1

Populating PROTOMER with chemical terms:

ZINC descriptors as Oracle calls. Put all of these on PROTOMER for searching (they are also on STRUCTURES, but pull from PROTOMER for benchmarking).
update PROTOMER with:
MW
LogP
Net Charge
Rotatable bonds
PSA
H Donors
H Acceptors
PolDesolv
ApolarDesolv
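
A minimal sketch of how these descriptor columns might be filled, reusing the jc_evaluate_x call form that appears later in these notes. The PROTOMER column names (MW, LOGP, NET_CHARGE, and so on) and the presence of a STRUCTURE column on PROTOMER are assumptions, and the Chemical Terms names should be checked against the installed JChem version. PolDesolv and ApolarDesolv are left out because the notes do not say how they are computed.

```sql
-- Sketch only: the descriptor column names on PROTOMER are hypothetical.
-- TO_NUMBER makes the conversion explicit in case jc_evaluate_x returns a string.
UPDATE PROTOMER p
SET p.MW         = TO_NUMBER(jc_evaluate_x(p.STRUCTURE, 'chemTerms:mass()')),
    p.LOGP       = TO_NUMBER(jc_evaluate_x(p.STRUCTURE, 'chemTerms:logP()')),
    p.NET_CHARGE = TO_NUMBER(jc_evaluate_x(p.STRUCTURE, 'chemTerms:formalCharge()')),
    p.ROT_BONDS  = TO_NUMBER(jc_evaluate_x(p.STRUCTURE, 'chemTerms:rotatableBondCount()')),
    p.PSA        = TO_NUMBER(jc_evaluate_x(p.STRUCTURE, 'chemTerms:PSA()')),
    p.H_DONORS   = TO_NUMBER(jc_evaluate_x(p.STRUCTURE, 'chemTerms:donorCount()')),
    p.H_ACCEPT   = TO_NUMBER(jc_evaluate_x(p.STRUCTURE, 'chemTerms:acceptorCount()'))
WHERE p.MW IS NULL;  -- only touch rows that have not been populated yet

COMMIT;
```

Restricting the update to unpopulated rows keeps reruns cheap; on a large table it would probably need to be batched.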


PL/SQL calls for various Oracle/Cartridge components based on current ZINC architecture:

Structure searching - feed in SMILES
	select STRUCTURES_ID from STRUCTURES/PROTOMER where [call] =1
	jc_compare(STRUCTURE, 'smiles', options); options are separated by spaces
Similarity
't:i simThreshold:0.9' (or dissimilarityThreshold)
Similarity metric
dissimilarity
Query by 't:t dissimilarityThreshold:0.6 dissimilarityMetric:tversky;0.3,0.7'
duplicate/identity (flag stereo matching)
't:d' duplicate
index only (fingerprint, no atom matching). Speed difference?
't:na'
substructure
't:s' (substructure search is default, doesn't need to be specified)
superstructure
't:u'
fragment (query and target must have same heavy atom network for matching, all features otherwise treated as substructure).
't:f'
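
For reference, here is how a few of the option strings above might look when plugged into full jc_compare calls. This is only a sketch: the table and column names follow the notes, and the query SMILES (phenol) is an arbitrary placeholder.

```sql
-- Substructure search (t:s is the default; spelled out here for clarity)
SELECT STRUCTURES_ID FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O', 't:s') = 1;

-- Similarity search with a dissimilarity cutoff, using the option form from the notes
SELECT STRUCTURES_ID FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O', 't:t dissimilarityThreshold:0.6') = 1;

-- Duplicate search
SELECT STRUCTURES_ID FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O', 't:d') = 1;

-- Superstructure search
SELECT STRUCTURES_ID FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O', 't:u') = 1;
```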
toggle options to consider:	
max hit count
maxHitCount:x
max time 
maxTime:x (x in milliseconds, default is unlimited)
early results (returned block size; gives fast results because the first 100 hits are returned for the page, and the search can continue while visualization calls render the webpage)
fingerprint type 'descriptorName' - requires additional structure index fields
Different fingerprints can be generated through the addMd ALTER INDEX parameter and can be selected for use with descriptorName option
alter index jcxnci parameters('addDfltMdConf=ECFP') [added]
CF (default?), PF, BCUT, ECFP
absolute or exact stereo, stereo ignored, required, etc.
absoluteStereo:x
radicals, isotopes, charges (ignored, exact, default)
double bond stereo
doubleBondStereo:A
matching of implicit H
polymer options (if Marvin applet input provided)
tautomer searching: table level? Ability to specify yes or no?
To be used together with duplicate search (t:d). For jchem index operations, tdf:y option has to be set for the jchem index or the underlying JChem table.
If y, the generic tautomers of the query and target are used in the search.
chemical terms filter with substructure search to apply things like MW, donor count, ring count, or any of the bajillion chemical terms offered.
Comparison: chemical terms as part of search, or part of INTERSECT from another table?
Combine with filterquery to limit number of compounds searched? Compare
R-atoms/homology groups if Marvin applets used to input structure
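
A sketch of how some of the toggle options above might be combined in one options string. The limits (100 hits, 5000 ms) are arbitrary, and the second query assumes the ECFP descriptor has already been added to the index via the ALTER INDEX call noted above. Whether descriptorName takes the value in exactly this form should be confirmed against the cartridge documentation.

```sql
-- Cap a substructure search at 100 hits and 5000 ms
SELECT STRUCTURES_ID FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O',
                 't:s maxHitCount:100 maxTime:5000') = 1;

-- Similarity search against a non-default fingerprint
-- (assumption: descriptorName:ECFP selects the descriptor added to the index)
SELECT STRUCTURES_ID FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O',
                 't:t dissimilarityThreshold:0.6 descriptorName:ECFP') = 1;
```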
Bioactivity: Materialized view of biological activity. Relevant columns for searching:
ID_STRUC_FK: (Structure IDs)
TARGET_NAME
IC50
CHEMBL_ID
ACTIVITY_DESC: binding, etc
SOURCE
FREE_FLAG: 1/0
SWISSPROT
CHEMBL
UNIPROT
PH ranges of protomers
All 1/null flags on relevant columns: PH_HI/MID/LO. Original structure is PH_REF. If the compound is a protomer, the pH of max occupancy will be in PH_MAX_OCCUPANCY.
ZINC ID
Vendor
CATALOG_REF: VENDOR_NAME for CAT_REF_ID
CATALOG: (has items) CAT_REF_ID_FK to link vendors, ID_STRUC_FK to link to structures.
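
As a sketch of how the linking columns above fit together, the first query below lists vendors for compounds with a free bioactivity record against one target. The join keys are the ones named in the notes. The second query assumes the PH_* flag columns live on PROTOMER, which the notes do not state explicitly.

```sql
-- Vendors carrying compounds with a free activity record for a target
SELECT DISTINCT b.ID_STRUC_FK, cr.VENDOR_NAME
FROM BIOACTIVITY b
JOIN CATALOG c      ON c.ID_STRUC_FK = b.ID_STRUC_FK
JOIN CATALOG_REF cr ON cr.CAT_REF_ID = c.CAT_REF_ID_FK
WHERE b.TARGET_NAME = 'Anandamide amidohydrolase'
  AND b.FREE_FLAG   = 1;

-- Count protomers flagged for the mid pH range (flags are 1 or null)
SELECT COUNT(*) FROM PROTOMER WHERE PH_MID = 1;
```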

Oracle set operators have equal precedence and are evaluated left to right unless parentheses say otherwise; expressions inside parentheses are evaluated before those outside. INTERSECT may be the ideal call vs. constructing AND sets.
	- put all other conditions in parentheses, then run the structure search against that returned set.
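
A minimal sketch of the parenthesized INTERSECT idea, assuming STRUCTURES_ID on STRUCTURES holds the same values as ID_STRUC_FK on the other tables. The parentheses fix the logical grouping; whether Oracle actually runs the structure search against the smaller set is up to the optimizer, so this is worth benchmarking.

```sql
-- The parenthesized block is evaluated as a unit: structures that have
-- both a free bioactivity record and a catalog entry.
SELECT STRUCTURES_ID FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O', 't:s') = 1
INTERSECT
(
  SELECT ID_STRUC_FK FROM BIOACTIVITY WHERE FREE_FLAG = 1
  INTERSECT
  SELECT ID_STRUC_FK FROM CATALOG
);
```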

Precedence using filterquery!

SELECT count(*) FROM nci_10m WHERE jc_compare(structure, 'c1ccccc1', 'sep=! t:s!filterQuery:select rowid from nci_10m where projid = 502') = 1

	Build into filterquery all other statements, using INTERSECT
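
One way this might look, following the nci_10m example above: the filterQuery returns rowids from the structure table, and the other conditions are folded into it with INTERSECT. Table and column names follow the notes; the assumption that STRUCTURES_ID matches ID_STRUC_FK is mine.

```sql
-- Sketch: restrict the substructure search to structures that have a free
-- bioactivity record and a catalog entry. The filterQuery must return
-- rowids of the table being searched.
SELECT COUNT(*) FROM STRUCTURES
WHERE jc_compare(STRUCTURE, 'c1ccccc1O',
  'sep=! t:s!filterQuery:select s.rowid from STRUCTURES s where s.STRUCTURES_ID in (select ID_STRUC_FK from BIOACTIVITY where FREE_FLAG = 1 intersect select ID_STRUC_FK from CATALOG)'
) = 1;
```

Any string literal inside the filterQuery has to double its single quotes, for example TARGET_NAME = ''Anandamide amidohydrolase'', because the whole options argument is itself a quoted string.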

Evaluating compounds using chemical terms
select jc_evaluate_x('input', 'chemTerms:tautomers() outFormat:smiles') from TABLE
useful for structure check before search?


Overall Call Structure

Substructure only search:
select STRUCTURES_ID from STRUCTURES where jc_compare(STRUCTURE, 'query', 't:s')=1

Addition of filterquery utility (5.8):
select STRUCTURES_ID from STRUCTURES where jc_compare(STRUCTURE, 'query', 'sep=! t:s!filterQuery:select logic')=1


Thoughts
Family clustering on results page, e.g. instead of showing 25 stereoisomers, show a non-stereo compound and give it a ‘stack’ image with text saying “25 Stereoisomers”. Hovering over it pops up the isomers, which then function as normal.
Subset definitions
Purchasability definitions (popup over label?)