Performing a Query on 22B Molecules

From DISI

Introduction

Say you are a scientist and you need to find every molecule that matches a specific query. To start, you could go to the Arthor Round Table server (10.20.10.136:8080/arthor-rt-host) and run your query against each of the databases numbered 1 through 46. Combined, these databases contain over 22 billion molecules! The problem is that querying each database and downloading the results by hand is quite time-consuming. There has to be a better way!

Luckily enough, there is. Introducing the round table manager web app!

Logging in to the Round Table Manager

Currently, the Round Table Manager is hosted at 10.20.5.35:8010. You must sign up first; once your account is approved, you will be able to query and upload databases. Please contact me at btingle@mail.sfsu.edu if you need your account to be approved.

Manager-2.PNG

Performing a query on multiple databases

The manager offers two query pages: "QUERY/SIM" and "QUERY/SUB". "QUERY/SIM" performs a similarity query, whereas "QUERY/SUB" handles every other type of query: Substructure, SMARTS, or Molecular Formula. Once you have entered your query string and selected the query type, select the databases you want to query. Press "submit" and you will be redirected to the /jobs page, where you can see the current status of your query request. Once your query has finished, click on your query job, then click the "GET RESULTS" link above its job entry. This gives you a file containing the full results of your query.

Manager-2-query.PNG
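
The choice between the two query pages can be summarised in a few lines of Python. This is only an illustrative sketch of the rule described above, not part of the manager itself; the function name `query_page_for` and the lowercase keys are assumptions made for the example.

```python
# Mapping taken from the description above: similarity queries go to
# QUERY/SIM, every other query type goes to QUERY/SUB.
# The dictionary and function names are hypothetical, for illustration only.
QUERY_PAGES = {
    "similarity": "QUERY/SIM",
    "substructure": "QUERY/SUB",
    "smarts": "QUERY/SUB",
    "molecular formula": "QUERY/SUB",
}


def query_page_for(query_type: str) -> str:
    """Return the manager page that handles the given query type."""
    key = query_type.strip().lower()
    if key not in QUERY_PAGES:
        raise ValueError(f"unknown query type: {query_type!r}")
    return QUERY_PAGES[key]


if __name__ == "__main__":
    for name in ("Similarity", "SMARTS", "Molecular Formula"):
        print(name, "->", query_page_for(name))
```

The lookup is case-insensitive so that "SMARTS" and "smarts" resolve to the same page; an unrecognised query type raises an error rather than silently defaulting to one page.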

Uploading a new Database

From the index, navigate to /upload. On this page you can select a SMILES database to upload to the round table network. You can either upload the file from your browser (files over 1GB are not allowed), or specify a path on the NFS to upload from.

Manager-2-upload.PNG
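
Before uploading, it can help to check which of the two routes applies to your file. The sketch below is a local helper, not something the manager provides. It assumes "1GB" means 10**9 bytes (the manager may count 2**30 instead), and the function names are hypothetical.

```python
import os

# Assumption: the browser limit of "1GB" is read as 10**9 bytes.
# The manager may actually use 2**30; adjust if uploads near the limit fail.
BROWSER_UPLOAD_LIMIT_BYTES = 10**9


def upload_method(size_bytes: int) -> str:
    """Return "browser" if the file fits the browser limit, else "nfs"."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "browser" if size_bytes <= BROWSER_UPLOAD_LIMIT_BYTES else "nfs"


def upload_method_for_file(path: str) -> str:
    """Same decision, using the size of an existing local file."""
    return upload_method(os.path.getsize(path))


if __name__ == "__main__":
    print(upload_method(250_000_000))    # small file
    print(upload_method(5_000_000_000))  # large file
```

A file that comes back as "nfs" has to be placed on the NFS first and then referenced by its path on the /upload page.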

Building a new Index

Different types of queries require different types of indexes to be built for the database. Queries under "QUERY/SIM" require ".atfp" indexes, whereas queries under "QUERY/SUB" require ".atdb" indexes. If a database doesn't have the indexes it needs, it can't be queried. When you upload a file to Arthor, it automatically generates the indexes for that file. Be warned that this may take a long time, so don't expect immediate results.
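
The index requirement can be expressed as a small check. This sketch only encodes the rule stated above (".atfp" for "QUERY/SIM", ".atdb" for "QUERY/SUB"). The file names in the example are hypothetical, and the manager's actual on-disk layout is not described here.

```python
from pathlib import PurePosixPath

# Index extension required by each query page, as stated in the text above.
REQUIRED_INDEX = {
    "QUERY/SIM": ".atfp",
    "QUERY/SUB": ".atdb",
}


def can_query(page: str, index_files: list[str]) -> bool:
    """True if one of the given files has the index extension the page needs."""
    required = REQUIRED_INDEX[page]
    return any(PurePosixPath(name).suffix == required for name in index_files)


def available_pages(index_files: list[str]) -> list[str]:
    """List the query pages a database can serve given its index files."""
    return [page for page in REQUIRED_INDEX if can_query(page, index_files)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(available_pages(["mydb.smi.atfp"]))
    print(available_pages(["mydb.smi.atfp", "mydb.smi.atdb"]))
```

A database whose indexes are still being generated would show up here with an empty list, which matches the behaviour described above: until the indexes exist, it cannot be queried.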