LoadingZINC21

Revision as of 17:08, 2 December 2022

Load 2d

Scripts mentioned below are saved in /nfs/home/khtang/exa/ZINC21_load2d/zinc21_load_scripts

export SCRIPT='/nfs/home/khtang/exa/ZINC21_load2d/zinc21_load_scripts'
source $SCRIPT/loadenv_zinc21.sh
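
Before starting, it can help to confirm that the script directory actually holds the scripts used in the steps below. A minimal sketch, assuming a POSIX shell; the function name missing_scripts is made up for illustration, and the file list is simply the script names mentioned on this page:

```shell
# Hypothetical pre-flight check (not part of the ZINC21 scripts).
# Prints each expected script that is missing from the given directory
# and returns non-zero if any are missing.
missing_scripts() {
    dir=$1
    rc=0
    for name in loadenv_zinc21.sh 1a_smiles_extraction.csh 1b_filter.csh \
                3_process_outputs.csh 4_depletion.csh; do
        if [ ! -f "$dir/$name" ]; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}
```

Usage: missing_scripts "$SCRIPT" prints nothing and returns 0 when every script is present.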

1a. Extract SMILES from SDF

csh $SCRIPT/1a_smiles_extraction.csh <sdf> <id_field_name>

(Optional) 1b. Price filter

csh $SCRIPT/1b_filter.csh <csv> <smallest pack size price field name> <short_name> # <short_name> should not contain any price suffix, i.e. chbr, chbr-v

2. Submit job

mkdir <short_name>
mv <smiles file> <short_name>/<short_name>.ism
cd <short_name>
sh /nfs/exa/work/khtang/ZINC21_load2d <short_name>.ism
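
The mkdir/mv part of the step above can be wrapped in one function so the .ism file always ends up at <short_name>/<short_name>.ism. A sketch only, assuming a POSIX shell; stage_ism is a hypothetical name, and the non-empty check is an added safeguard, not something the load scripts are known to require:

```shell
# Hypothetical helper (not part of the ZINC21 scripts).
# Moves a SMILES file to <short_name>/<short_name>.ism and prints the new path.
# Refuses to run if the short name is empty or the file is missing or empty.
stage_ism() {
    smiles_file=$1
    short_name=$2
    if [ -z "$short_name" ] || [ ! -s "$smiles_file" ]; then
        echo "usage: stage_ism <smiles file> <short_name> (file must be non-empty)" >&2
        return 1
    fi
    mkdir -p "$short_name" || return 1
    mv "$smiles_file" "$short_name/$short_name.ism" || return 1
    echo "$short_name/$short_name.ism"
}
```

Usage: stage_ism <smiles file> <short_name> && cd <short_name>, then submit the job as above.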

3. Process outputs

csh $SCRIPT/3_process_outputs.csh

4. Depletion

source $SCRIPT/loadenv_zinc21.sh
csh $SCRIPT/4_depletion.csh <full/increment>
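
A small guard can catch a mistyped mode before the depletion script runs. This sketch assumes the argument is the literal word full or increment, which is how <full/increment> is read here; valid_depletion_mode is a hypothetical name:

```shell
# Hypothetical guard (not part of the ZINC21 scripts).
# Accepts only the two modes named in step 4; anything else is rejected.
valid_depletion_mode() {
    case "$1" in
        full|increment) return 0 ;;
        *) echo "mode must be 'full' or 'increment', got '$1'" >&2; return 1 ;;
    esac
}
```

Usage: valid_depletion_mode "$mode" && csh $SCRIPT/4_depletion.csh "$mode"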

Load 2d (OLD)

1. Extract SMILES from SDF (run in csh)

source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh
source ~jji/cshrc.save
pc2unix <sdf>
source ~teague/virtualenvs/zinc/env.csh
zincload-sdf --id-field <id_field> --name <catalog_name> <sdf>

2. Submit job

source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh
mkdir <short_name>
mv <smiles file> <short_name>/<short_name>.ism
cd <short_name>
sh /nfs/exa/work/khtang/ZINC21_load2d <short_name>.ism

3. Depletion

find outputs -name '51-*-ids.tsv' | xargs sort -n -u > list
find outputs -name '*.filtered' | xargs cat > filtered
find outputs -name '14-neutralize.log' | xargs cat | grep -v processed > errors
sort -n list > list2
wc -l list filtered errors list2 # check that most molecules were loaded into ZINC successfully
# source ZINC21 envs (must be run in a bash shell)
source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh
zinc-manage -e admin admin catalogs deplete -C 10000 <short_name> list2
csh /nfs/ex9/work/khtang/move.csh
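
The wc -l check above can be reduced to a single percentage. A rough sketch, assuming a POSIX shell and that every line in list2, filtered and errors stands for one molecule; that may not hold exactly for errors, which is built from log lines. The names count_lines and load_summary are made up for illustration:

```shell
# Hypothetical helpers (not part of the ZINC21 scripts).
# count_lines prints the number of lines in a file, or 0 if it does not exist.
count_lines() {
    if [ -f "$1" ]; then
        wc -l < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# load_summary reads list2, filtered and errors from the current directory
# and prints the counts plus the integer percentage of loaded molecules.
load_summary() {
    loaded=$(count_lines list2)
    filtered=$(count_lines filtered)
    errors=$(count_lines errors)
    total=$((loaded + filtered + errors))
    if [ "$total" -eq 0 ]; then
        echo "loaded=0 filtered=0 errors=0 loaded_pct=NA"
        return 1
    fi
    pct=$((100 * loaded / total))
    echo "loaded=$loaded filtered=$filtered errors=$errors loaded_pct=$pct"
}
```

Run load_summary in the <short_name> directory after the find and sort commands; a low loaded_pct suggests checking filtered and errors before depleting.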