How to use Arthor API

From DISI
Revision as of 20:50, 16 February 2023 by Jgutierrez6 (talk | contribs) (→‎Data Tables)

Introduction

This information was taken from the Arthor version 2.1.2 documentation.

RESTful API: The current server implementation uses the Data Tables JS library to display hits. The current API therefore reflects some requirements of this library, and all URLs are organised under the /dt/ (for data tables) path. In future, new URLs may be introduced to match other clients (e.g. /jchem/).

Data Tables

/dt/data

Lists all available databases and the types of search each one supports. Note that “SUB” in idxTypes means both “SUB” and “SMA” searches can be performed.

Example command:

   curl http://arthor.docking.org/dt/data

Produces the following JSON:

   [
    {"displayName":"wait-ok.smi","location":"/usr/local/tomcat/arthor_data/wait-ok.smi","urlFormatStr":null,"idxTypes":["SUB"]},
    {"displayName":"for-sale.smi","location":"/usr/local/tomcat/arthor_data/for-sale.smi","urlFormatStr":null,"idxTypes":["SUB"]},
    {"displayName":"now_bb.smi","location":"/usr/local/tomcat/arthor_data/now_bb.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"wait-ok_bb.smi","location":"/usr/local/tomcat/arthor_data/wait-ok_bb.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"for-sale_bb.smi","location":"/usr/local/tomcat/arthor_data/for-sale_bb.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"in-stock.smi","location":"/usr/local/tomcat/arthor_data/in-stock.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"bb-all.smi","location":"/usr/local/tomcat/arthor_data/bb-all.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"bb-now.smi","location":"/usr/local/tomcat/arthor_data/bb-now.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"interesting.smi","location":"/usr/local/tomcat/arthor_data/interesting.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"on-demand.smi","location":"/usr/local/tomcat/arthor_data/on-demand.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]},
    {"displayName":"wuxi.smi","location":"/usr/local/tomcat/arthor_data/wuxi.smi","urlFormatStr":null,"idxTypes":["SIM","SUB"]}
   ]
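As a rough illustration of the idxTypes rule, the sketch below filters a /dt/data listing by search type. The helper names supportsSearch and databasesFor are made up for this example and are not part of the Arthor API; the sample listing is a shortened copy of the response above.

```javascript
// Hypothetical helpers for illustration; not part of the Arthor API.

// A "SUB" index serves both "SUB" and "SMA" searches, so an "SMA"
// request is checked against the "SUB" entry in idxTypes.
function supportsSearch(db, type) {
  const wanted = type === "SMA" ? "SUB" : type;
  return db.idxTypes.includes(wanted);
}

// Return the display names of all databases that can run the given search type.
function databasesFor(listing, type) {
  return listing
    .filter((db) => supportsSearch(db, type))
    .map((db) => db.displayName);
}

// Shortened copy of the /dt/data response shown above.
const listing = [
  { displayName: "wait-ok.smi", idxTypes: ["SUB"] },
  { displayName: "in-stock.smi", idxTypes: ["SIM", "SUB"] },
];

console.log(databasesFor(listing, "SIM")); // [ 'in-stock.smi' ]
console.log(databasesFor(listing, "SMA")); // [ 'wait-ok.smi', 'in-stock.smi' ]
```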

/dt/${db_name}/data

Returns information on a single database. The virtual memory status is also reported for Similarity and Substructure indexes.

/dt/${db_name}/search/

Search a database with a query SMILES/SMARTS.

${db_name}
Required path variable. The database name is specified in the URL path as ${db_name}, with special characters percent-encoded. For example, a database named ‘ChEMBL 23’ would be searched as /dt/ChEMBL%2023/search.

query=<string>
The query to run: a valid SMILES or SMARTS, depending on the search type.

type=SUB|SIM|SMA
The search type to perform (SUB=Substructure, SMA=SMARTS, SIM=Similarity). These differ primarily in the input expected in the query= string: “SUB” and “SIM” both expect a valid SMILES, while “SMA” expects a SMARTS. The SMILES provided to “SUB” is aromatized to be consistent with the database (if the flag is provided) and converted to a query with any flags specified. This parameter is optional; the default is “SIM”.

start=<num>
Optional start offset for the result set, to allow paging. The default value is 0, meaning the results start at the first hit.

length=<num>
Optional length of the result set, to allow paging. The default is 10.

draw=<num>
Echoed value used to keep asynchronous events in a consistent order. The number passed in this parameter is returned with the result set. If the draw value in a response does not match the client's current draw value, the result can be considered “out of date” and should be ignored. This value is client specific and optional.

qopts=<opts>
The query options to run the Substructure or SMARTS search with. For example, “qopts=RC” would lock the rings and chains of the query. See Arthor::ParseQuery for more information. If your query is an MDL file, you must specify “Mdl” in the qopts.
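The parameters above can be assembled into a request URL along the following lines. This is only a sketch: buildSearchUrl and BASE_URL are names invented for this example, and the host is simply the one used in the curl examples on this page.

```javascript
// Hypothetical helper for illustration; not part of the Arthor API.
// BASE_URL is an assumption: the host used in the curl examples on this page.
const BASE_URL = "http://arthor.docking.org";

function buildSearchUrl(dbName, query, options = {}) {
  // URLSearchParams percent-encodes SMILES/SMARTS characters such as [, ], + and =.
  const params = new URLSearchParams({ query });
  // Only send the optional parameters the caller actually set,
  // so the server-side defaults apply to the rest.
  for (const key of ["type", "start", "length", "draw", "qopts"]) {
    if (options[key] !== undefined) {
      params.set(key, String(options[key]));
    }
  }
  // The database name is a path segment, so it is percent-encoded separately.
  return `${BASE_URL}/dt/${encodeURIComponent(dbName)}/search?${params}`;
}

console.log(buildSearchUrl("ChEMBL 23", "c1ccccc1", { type: "SUB", start: 0, length: 5 }));
// http://arthor.docking.org/dt/ChEMBL%2023/search?query=c1ccccc1&type=SUB&start=0&length=5
```

To fetch the next page of the same query, the same call can be repeated with start increased by length.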

   // Matches server-side constants
   let QueryFlags = {
       AROMATISE:         0x0100,
       LOCK_RINGS:        0x0200,
       LOCK_CHAINS:       0x0400,
       LOCK_CHARGES:      0x0800,
       LOCK_ISOTOPES:     0x1000,
       LOCK_CONNECTIVITY: 0x2000
   };
   // Matches UI
   let DEFAULT_FLAGS = QueryFlags.AROMATISE |
                       QueryFlags.LOCK_CHARGES |
                       QueryFlags.LOCK_ISOTOPES |
                       QueryFlags.LOCK_CONNECTIVITY;

To "lock" a feature of the query (rings, chains, charges, isotopes or connectivity) means that feature is not allowed to change in a match.
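Because each flag occupies its own bit, flags can be combined with bitwise OR and tested with bitwise AND. The sketch below shows this using the values listed above; hasFlag and describeFlags are hypothetical helpers, and how the server maps these bits to a qopts string is not covered here.

```javascript
// Flag values copied from the listing above.
const QueryFlags = {
  AROMATISE: 0x0100,
  LOCK_RINGS: 0x0200,
  LOCK_CHAINS: 0x0400,
  LOCK_CHARGES: 0x0800,
  LOCK_ISOTOPES: 0x1000,
  LOCK_CONNECTIVITY: 0x2000,
};

const DEFAULT_FLAGS =
  QueryFlags.AROMATISE |
  QueryFlags.LOCK_CHARGES |
  QueryFlags.LOCK_ISOTOPES |
  QueryFlags.LOCK_CONNECTIVITY;

// Hypothetical helper: true when every bit of `flag` is set in `flags`.
function hasFlag(flags, flag) {
  return (flags & flag) === flag;
}

// Hypothetical helper: list the names of all flags set in `flags`.
function describeFlags(flags) {
  return Object.keys(QueryFlags).filter((name) => hasFlag(flags, QueryFlags[name]));
}

console.log("0x" + DEFAULT_FLAGS.toString(16)); // 0x3900

// Add ring locking on top of the defaults.
const ringLocked = DEFAULT_FLAGS | QueryFlags.LOCK_RINGS;
console.log(describeFlags(ringLocked));
// [ 'AROMATISE', 'LOCK_RINGS', 'LOCK_CHARGES', 'LOCK_ISOTOPES', 'LOCK_CONNECTIVITY' ]
```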

Response: The search returns a JSON object containing the result set in its data property.

Similarity Search

   curl 'http://arthor.docking.org/dt/In-Stock-19Q4-13.8M/search?query=c1ccccc1&type=SIM&start=0&length=5'
   {"query":"c1ccccc1",
       "data":[
           [1,"c1ccccc1\tZINC000000967532","1.000"],
           [2,"c1ccccccccccccccccc1\tZINC000100074375","1.000"],
           [3,"c1ncncn1\tZINC000001718513","0.498"],
           [4,"c1cncnc1\tZINC000000895216","0.427"],
           [5,"Ic1cc(I)cc(I)c1\tZINC000057266212","0.373"]],
       "draw":0,"recordsFiltered":1649789,"recordsTotal":1649789,"time":79,"hasMore":false,"error":null
   }
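Each row of data is a three-element array: the hit number, a tab-separated "SMILES, identifier" string, and the similarity score as a string. A minimal sketch of unpacking a row is shown below; parseHit is a made-up helper name, and treating an empty score as null is a choice made for this example (substructure hits, shown later on this page, carry an empty string in that position).

```javascript
// Hypothetical helper for illustration; not part of the Arthor API.
function parseHit(row) {
  const [rank, record, score] = row;
  // The SMILES and the identifier are separated by a tab character.
  const [smiles, id] = record.split("\t");
  // An empty score string is mapped to null rather than to 0.
  return { rank, smiles, id, score: score === "" ? null : Number(score) };
}

// Two rows copied from responses on this page.
const similarityRow = [3, "c1ncncn1\tZINC000001718513", "0.498"];
const substructureRow = [5, "C=C(C)CNc1ccc([C@H](C)C(=O)O)cc1\tZINC000000000022", ""];

console.log(parseHit(similarityRow));
console.log(parseHit(substructureRow));
```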


Substructure or SMARTS Search

A substructure or SMARTS search has some additional complexity. The hasMore field in the response indicates whether there are more results. When the first page for a query is requested, a background counter is spun up to count the total number of hits. The idea is that the server can be polled until the count is finished.

   curl 'http://arthor.docking.org/dt/In-Stock-19Q4-13.8M/search?query=c1ccccc1&type=SUB&start=0&length=5'
   {"query":"c1:c:c:c:c:c:1","data":[
         [1,"C[C@@]1(c2ccccc2)OC(C(=O)O)=CC1=O\tZINC000000000010",""],
         [2,"COc1cc(Cc2cnc(N)nc2N)cc(OC)c1N(C)C\tZINC000000000011",""],
         [3,"O=C(C[S@@](=O)C(c1ccccc1)c1ccccc1)NO\tZINC000000000012",""],
         [4,"CCC[S@](=O)c1ccc2[nH]/c(=N\\C(=O)OC)[nH]c2c1\tZINC000000000017",""],
         [5,"C=C(C)CNc1ccc([C@H](C)C(=O)O)cc1\tZINC000000000022",""]],
        "draw":0,"recordsFiltered":1162074,"recordsTotal":1162074,"time":1090,"hasMore":false,"error":null
   }

After some time the background count will have completed and hasMore will be false, meaning the hit count is correct. The time taken to do the count is reported in the time field.
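The polling idea can be sketched as a loop that repeats the same request until hasMore is false. Everything named here is an assumption made for illustration: waitForFinalCount, makeFakeFetch, the one-second default interval and the poll limit are invented, and the sketch assumes recordsFiltered carries the hit count, as in the Data Tables server-side convention. The HTTP call is passed in as a function so the example runs without a server.

```javascript
// Hypothetical helper for illustration; not part of the Arthor API.
// fetchJson is any function that takes a URL and resolves to the parsed JSON response.
async function waitForFinalCount(url, fetchJson, { intervalMs = 1000, maxPolls = 30 } = {}) {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const result = await fetchJson(url);
    if (result.error) {
      throw new Error(String(result.error));
    }
    // hasMore === false means the background count has finished.
    if (!result.hasMore) {
      return result.recordsFiltered; // assumption: this field holds the hit count
    }
    // Wait before asking again so the server is not flooded with requests.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("hit count still pending after " + maxPolls + " polls");
}

// Stand-in for a real HTTP call: replays canned responses in order,
// repeating the last one once the list is exhausted.
function makeFakeFetch(responses) {
  let i = 0;
  return async () => responses[Math.min(i++, responses.length - 1)];
}

const fakeFetch = makeFakeFetch([
  { hasMore: true, recordsFiltered: 5, error: null },
  { hasMore: false, recordsFiltered: 1162074, error: null },
]);

waitForFinalCount("http://arthor.docking.org/dt/In-Stock-19Q4-13.8M/search?query=c1ccccc1&type=SUB", fakeFetch, { intervalMs: 10 })
  .then((count) => console.log("final hit count:", count));
```

In a browser or a recent Node.js, a function such as (url) => fetch(url).then((r) => r.json()) could be passed as fetchJson in place of the fake.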