How to use Smallworld Java Command Line

Revision as of 02:21, 22 February 2023

Introduction

This page gives a brief example of how to use the Smallworld Java command line.

How to use Smallworld Java Command Line

  1. Use bash as your shell
    • bash   # type bash and press Enter
  2. Export these variables
    • export SWDIR=/mnt/nfs/db3/public_smallworld_5th_gen
      export sw='java -jar /nfs/db3/smallworld-5.5/sw.jar'
  3. Run the similarity command without arguments to display what it can do
    • $sw sim
  4. The databases (map files) you can search are in these directories
    • /mnt/nfs/db3/public_smallworld_5th_gen/maps/
    • /mnt/nfs/db3/private_smallworld_5th_gen/maps/
    • /mnt/nfs/db3/super_private_smallworld_5th_gen/maps/
  5. Example of basic usage
    • $sw sim 'c1ccccc1' -db /mnt/nfs/db3/public_smallworld_5th_gen/maps/all-zinc.smi.anon.map
  6. Example of advanced usage
    • $sw sim 'c1ccccc1' --tdn 0 --rdn 0 --ldn 0 -db /mnt/nfs/db3/public_smallworld_5th_gen/maps/all-zinc.smi.anon.map -n 0 -d 2
  7. Options that may be useful
     Option   Type     Description
     -b       integer  Benchmark mode, optionally specify the number of times to repeat: -b=5 (default: 0)
     -d       integer  Max topological distance (default: 99)
     --db     string   Database map file
     -g       integer  Max number of generations during query extension (default: 3)
     -k       integer  Number of top hits to display (default: 0)
     -n       int      Topological ring down distance upper bound (default: 10)
     rup      int      Topological ring up distance upper bound (default: 10)
     ldn      int      Topological linker down distance upper bound (default: 10)
     lup      int      Topological linker up distance upper bound (default: 10)
     scores   text     List of scoring functions (default: none) - see /search/config
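
When combining several options it can help to print the full command and review it before starting a search. The following is a minimal sketch, not part of Smallworld: `sim_cmd` is a hypothetical helper that only prints a command line, and `SW_JAR` and `MAP_DIR` are illustrative variable names holding the paths from the steps above. It uses only the `-db` and `-d` flags that appear in the usage examples.

```shell
#!/bin/sh
# Illustrative variables (assumptions); the values are the paths from the steps above.
SW_JAR=/nfs/db3/smallworld-5.5/sw.jar
MAP_DIR=/mnt/nfs/db3/public_smallworld_5th_gen/maps

# Hypothetical helper: print the sim command instead of running it.
# $1 = query SMILES, $2 = map file name, $3 = max topological distance (-d)
sim_cmd() {
    printf "java -jar %s sim '%s' -db %s/%s -d %s\n" \
        "$SW_JAR" "$1" "$MAP_DIR" "$2" "$3"
}

sim_cmd 'c1ccccc1' all-zinc.smi.anon.map 2
```

Once the printed line looks right, it can be copied into the shell and run as is.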

Example Java Command Line Script

This script performs a similarity search on all databases in the public or private smallworld.

Usage: bash cmd_sim.sh <smiles text file> <library> <max_hits> <distance>

-- <smiles text file> | a text file of SMILES, one molecule per line, in this format:
<smiles> <name of molecule>
-- <library> | public or private
-- <max_hits> | an integer; 0 shows all results
-- <distance> | an integer (maximum topological distance, passed to -d)

Example: bash cmd_sim.sh smiles.txt public 0 2
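
Because the script expects one `<smiles> <name of molecule>` pair per line, it can be worth checking the input file before launching a long search. The sketch below is an assumption-laden illustration, not part of Smallworld: `check_smiles_file` is a hypothetical helper that only counts whitespace-separated fields and does not validate the SMILES themselves.

```shell
#!/bin/sh
# Hypothetical helper: report non-empty lines that have fewer than two fields.
# Exit status is 0 when every non-empty line has at least <smiles> and <name>.
check_smiles_file() {
    awk 'BEGIN { bad = 0 }
         NF > 0 && NF < 2 { print "line " NR ": expected <smiles> <name>"; bad = 1 }
         END { exit bad }' "$1"
}

# Build a small example input in a temporary directory and check it.
tmp_dir=$(mktemp -d)
printf 'c1ccccc1 benzene\nCCO ethanol\n' > "$tmp_dir/smiles.txt"
check_smiles_file "$tmp_dir/smiles.txt" && echo "smiles.txt looks well-formed"
rm -r "$tmp_dir"
```

Blank lines are ignored by this check; whether they should count as errors is a design choice.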

#!/bin/bash
# Run a Smallworld similarity search for every query in a SMILES file
# against every map in the public or private library.

smi_file=$1
library=$2
max_hits=$3
distance=$4

version="5.5"

sw_dir=/mnt/nfs/db3
# Left as a plain string and expanded unquoted below, so it splits into "java -jar <jar>".
sw="java -jar /nfs/db3/smallworld-${version}/sw.jar"
public_dir=${sw_dir}/public_smallworld_5th_gen/
private_dir=${sw_dir}/private_smallworld_5th_gen/

# Print a horizontal rule as wide as the terminal.
rule() {
        printf '%*s\n' "${COLUMNS:-$(tput cols)}" '' | tr ' ' -
}

if [ "$smi_file" = "-h" ] || [ "$smi_file" = "--help" ]
then
        rule
        echo "This script performs a similarity search on all databases in the public or private smallworld."
        rule
        echo "Usage: bash cmd_sim.sh <smiles text file> <library> <max_hits> <distance>
        -- <smiles text file> | a text file of smiles in this format:
                <smiles> <name of molecule>
        -- <library> | public or private
        -- <max_hits> | needs an integer, putting 0 means show all results
        -- <distance> | needs an integer"
        rule
        echo "Example:
        bash cmd_sim.sh smiles.txt public 0 2"
        rule
        exit 0
fi

# Pick the library once and point SWDIR at it, as in the setup steps.
case "$library" in
        public)  export SWDIR=$public_dir ;;
        private) export SWDIR=$private_dir ;;
        *)       echo "Invalid Options" >&2; exit 1 ;;
esac

# Each input line is "<smiles> <name of molecule>"; extra fields are ignored.
while read -r smiles name _
do
        [ -z "$smiles" ] && continue    # skip blank lines
        echo "$name $smiles"
        for map in "${SWDIR}"maps/*.anon.map
        do
                [ -e "$map" ] || continue    # the glob matched nothing
                echo "$map"
                $sw sim -db "$map" -v -n"$max_hits" -d"$distance" -score AtomAlignment:SMILES "$smiles"
        done
done < "$smi_file"
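
The per-line split into SMILES and name can be tried in isolation, without Java or the map files. This is a minimal sketch under stated assumptions: `split_queries` is a hypothetical function name, and it mirrors the script's behaviour of taking the first field as the SMILES and the second as the name while ignoring anything after them.

```shell
#!/bin/sh
# Hypothetical helper: read "<smiles> <name>" lines from standard input
# and print "<name><TAB><smiles>" for each non-empty line.
split_queries() {
    while read -r smiles name _
    do
        [ -z "$smiles" ] && continue    # skip blank lines
        printf '%s\t%s\n' "$name" "$smiles"
    done
}

printf 'c1ccccc1 benzene\nCCO ethanol\n' | split_queries
```

Letting `read` do the split avoids starting two `awk` processes per input line, which adds up on large query files.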