Loading ZINC12
Revision as of 15:50, 6 November 2013
This is the internal page for loading ZINC. If you are not a ZINC curator, this page will probably not be interesting.
YYZ protocol
1. Acquire catalog, often as SDF
2. Parse SDF into ISM, harvesting data from SD tags into synonyms table
python parse_catalog.py ibsbb ibs2013oct_bb.sdf ibsbb.ism ibsbb.csv
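The tag-harvesting half of this step can be sketched as follows. This is not the code in parse_catalog.py, whose internals are not documented here; it is a minimal illustration that assumes a standard SDF layout (records terminated by `$$$$`, data items introduced by `> <TAG>` and ended by a blank line). The function names and the (title, tag, value) row layout are hypothetical, and conversion of the structure itself to ISM is left out because it needs a cheminformatics toolkit.

```python
import re

# Matches an SD data header such as "> <SUPPLIER_ID>" or ">  <NAME> (1)".
_TAG_HEADER = re.compile(r">.*<([^<>]+)>")


def _parse_record(lines):
    """Return (title, tags) for the lines of one SDF record."""
    title = lines[0].strip()
    # Only look for data items after the molfile block, if its end is marked.
    start = next((k + 1 for k, l in enumerate(lines) if l.strip() == "M  END"), 0)
    tags = {}
    i = start
    while i < len(lines):
        match = _TAG_HEADER.match(lines[i])
        i += 1
        if not match:
            continue
        value = []
        while i < len(lines) and lines[i].strip():  # value ends at a blank line
            value.append(lines[i])
            i += 1
        tags[match.group(1)] = "\n".join(value)
    return title, tags


def iter_sd_records(handle):
    """Yield (title, tags) for each record in an open SDF text stream."""
    lines = []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":  # record terminator
            if lines:
                yield _parse_record(lines)
            lines = []
        else:
            lines.append(line)
    if any(l.strip() for l in lines):  # tolerate a missing final $$$$
        yield _parse_record(lines)


def synonym_rows(records, wanted_tags):
    """Yield (title, tag, value) rows for the tags of interest (hypothetical layout)."""
    for title, tags in records:
        for tag in wanted_tags:
            if tag in tags:
                yield title, tag, tags[tag]
```

Rows produced this way could be written out with the standard `csv` module; which SD tags matter differs from vendor to vendor, so `wanted_tags` would have to be chosen per catalog.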
3. Desalt, default representation, filter out neverwanteds
- formerly: filter.py
- now: ?
python find_new_substances.py ibsbb ibsbb.ism catalog-item.csv
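As a rough illustration of what desalting and finding new substances involve, the sketch below keeps the largest dot-separated SMILES component and reports entries not already known. It is not the logic of find_new_substances.py or the old filter.py, which are not described on this page. It assumes an ISM file holds one `SMILES identifier` pair per line, and all function names are hypothetical. Comparing SMILES as strings is only meaningful if both sides were canonicalized the same way, and picking the longest fragment is a crude stand-in for a toolkit-based desalter.

```python
def largest_fragment(smiles):
    """Crude desalting: keep the dot-separated component with the most characters."""
    return max(smiles.split("."), key=len)


def read_ism(lines):
    """Yield (smiles, identifier) pairs from whitespace-separated ISM lines."""
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            yield parts[0], parts[1]


def find_new(ism_lines, known_smiles):
    """Return (desalted_smiles, identifier) pairs whose SMILES is not yet known."""
    seen = set(known_smiles)
    new = []
    for smiles, ident in read_ism(ism_lines):
        parent = largest_fragment(smiles)
        if parent not in seen:
            seen.add(parent)  # also de-duplicates within the catalog itself
            new.append((parent, ident))
    return new
```

With this shape, the set of known SMILES would come from the existing substance table, and the returned pairs are the candidates for loading in step 4.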
4. Load substance, catalog_item
Nothing here yet
5. Generate and load protomer
Nothing here yet.
old UCSF protocol
Acquire
1. Get the catalog from the vendor, usually in SDF. NB: we need to automate this step as much as possible
2. On nfshead2, in ~xyz/raid3/stage3/ or /raid6/tmp/xyz/, gunzip the SDF into a directory
3. pc2unix '*.sdf'
4. make ism
foreach i (*.sdf)
  namesdf.pl '<TAG>' < $i > $i.sdf
  convert.py --i=$i.sdf --o=$i.ism
end
5. combine and move
cat vendoris*.ism > all
mv all ~xyz/raid8/catalog/vendorid.in
ln -s !$ vendorid.ism
6. mark depleted
deplete.pl vendorid < vendorid.ism
7. process on sgehead2
mas.csh vendorid vendorid.ism ; # nb may run for a long time!
nb periodically delete output
8. update filter info, error info
9. export database on nfshead5
cd ~xyz/raid8/byvendor/.temp
mkdir vendorid
cd vendorid
callgr17.csh
(may take a long time)
10. export database on nfshead5
extractthis.csh vendorid nodb mol2
NB: use db instead of nodb if annotated; mol2 in all cases
11. if big, cluster on korn
kornit.csh vendorid `pwd`
12. finish off on wilco
./all.csh
updateit.pl < log
./all.csh
cd vendorid
dosubset4.pl vendorid vendor
13. email vendor telling them their catalog has been updated in ZINC
14. write tweet, etc. announcing, if appropriate.
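Steps 3 and 5 of the list above (fixing PC line endings and combining the per-file ISM output) could be approximated in Python as below. This is a sketch under assumptions: it presumes pc2unix only converts line endings, which this page does not confirm, and the function names are hypothetical.

```python
from pathlib import Path


def to_unix(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def combine(paths, out_path):
    """Concatenate files in sorted order into out_path with Unix line endings."""
    with open(out_path, "wb") as out:
        for path in sorted(paths):
            data = to_unix(Path(path).read_bytes())
            if data and not data.endswith(b"\n"):
                data += b"\n"  # keep the last record of one file off the next file's first line
            out.write(data)
```

The output path should not itself match the input pattern, for the same reason the shell version writes to `all` before renaming.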