LoadingZINC21

From DISI
Revision as of 17:08, 2 December 2022 by Khtang (talk | contribs) (→‎Load 2d)

Load 2d

Scripts mentioned below are saved in /nfs/home/khtang/exa/ZINC21_load2d/zinc21_load_scripts

export SCRIPT='/nfs/home/khtang/exa/ZINC21_load2d/zinc21_load_scripts'
source $SCRIPT/loadenv_zinc21.sh

1a. Extract smiles from sdf

csh $SCRIPT/1a_smiles_extraction.csh <sdf> <id_field_name>
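
Before running the extraction, it can help to confirm that the chosen ID field is present in every record of the sdf. The following is a minimal sketch, assuming only standard SDF conventions (each record ends with a `$$$$` line, and data items are introduced by a header line such as `>  <FIELD>`). `check_sdf_field` is a hypothetical helper and is not part of the load scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of zinc21_load_scripts): compare the number of
# SDF records with the number of data headers naming the given field.
check_sdf_field() {
    local sdf="$1" field="$2"
    local records tagged
    # Each SDF record is terminated by a line starting with "$$$$".
    records=$(grep -c '^\$\$\$\$' "$sdf")
    # Data headers start with ">" and contain the field name in angle brackets.
    tagged=$(awk -v f="<$field>" '/^>/ && index($0, f) { n++ } END { print n + 0 }' "$sdf")
    echo "records=$records tagged=$tagged"
    # Succeed only when every record carries the field.
    [ "$records" -eq "$tagged" ]
}
```

For example, `check_sdf_field catalog.sdf ID` prints both counts and returns a non-zero status when they differ, which suggests a wrong `<id_field_name>` or records without an ID.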

(Optional) 1b. Price filter

csh $SCRIPT/1b_filter.csh <csv> <smallest pack size price field name> <shortname> # shortname should not contain any price suffix, i.e. chbr, chbr-v

2. Submit job

mkdir <short_name>
mv <smiles file> <short_name>/<short_name>.ism
cd <short_name>
sh /nfs/exa/work/khtang/ZINC21_load2d <short_name>.ism
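
The staging commands above (mkdir, mv) can be wrapped with a couple of sanity checks so that an empty smiles file or an existing directory is caught early. This is only a sketch of the directory layout shown in this step; `stage_ism` is a hypothetical helper, and it does not submit the job.

```bash
#!/usr/bin/env bash
# Hypothetical helper: stage a smiles file as <short_name>/<short_name>.ism,
# mirroring the mkdir/mv commands of this step. It does not submit anything.
stage_ism() {
    local smiles="$1" short_name="$2"
    # Refuse to continue if the smiles file is missing or empty.
    if [ ! -s "$smiles" ]; then
        echo "error: $smiles is missing or empty" >&2
        return 1
    fi
    # Refuse to reuse an existing directory, to avoid mixing two loads.
    if [ -e "$short_name" ]; then
        echo "error: $short_name already exists" >&2
        return 1
    fi
    mkdir "$short_name" && mv "$smiles" "$short_name/$short_name.ism"
}
```

A possible use is `stage_ism catalog.smi chbr && cd chbr`, followed by the submit command of this step.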

3. Process outputs

csh $SCRIPT/3_process_outputs.csh

4. Depletion

source $SCRIPT/loadenv_zinc21.sh
csh $SCRIPT/4_depletion.csh <full/increment>

Load 2d (OLD)

1. Extract smiles from sdf (run in csh)

source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh
source ~jji/cshrc.save
pc2unix <sdf>
source ~teague/virtualenvs/zinc/env.csh
zincload-sdf --id-field <id_field> --name <catalog_name> <sdf>

2. Submit job

source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh
mkdir <short_name>
mv <smiles file> <short_name>/<short_name>.ism
cd <short_name>
sh /nfs/exa/work/khtang/ZINC21_load2d <short_name>.ism

3. Depletion

find outputs -name '51-*-ids.tsv' | xargs sort -n -u > list
find outputs -name '*.filtered' | xargs cat > filtered
find outputs -name '14-neutralize.log' | xargs cat | grep -v processed > errors
sort -n list > list2
wc -l list filtered errors list2 # check that most molecules were successfully loaded into ZINC
# source the ZINC21 environment (must be run in a bash shell)
source /nfs/exa/work/khtang/ZINC21_load2d/loadenv_zinc21.sh
zinc-manage -e admin admin catalogs deplete -C 10000 <short_name> list2
csh /nfs/ex9/work/khtang/move.csh
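
The `wc -l` check in this step is read by eye. As a rough sketch, the share of loaded molecules could also be computed directly by comparing the ID list with the input smiles file. This assumes one molecule per line in both files, which is not stated on this page; `loaded_fraction` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report what share of the input smiles ended up in the
# ID list. Assumes one molecule per line in both files.
loaded_fraction() {
    local ism="$1" list="$2"
    local total loaded
    # Strip padding that some wc implementations add around the count.
    total=$(wc -l < "$ism" | tr -d ' ')
    loaded=$(wc -l < "$list" | tr -d ' ')
    awk -v t="$total" -v l="$loaded" 'BEGIN {
        if (t == 0) { print "no input molecules"; exit 1 }
        printf "%d/%d loaded (%.1f%%)\n", l, t, 100 * l / t
    }'
}
```

For example, `loaded_fraction <short_name>.ism list2` run inside the `<short_name>` directory prints the loaded count, the input count, and the percentage.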