New 3D Building On Wynton
Revision as of 03:27, 16 July 2020
I have revamped the script for the second time, this time configuring it to use SGE on Wynton. The script lives in ~/zinc-3d-build-3 on jji@wynton.

I had to reinstall or reconfigure some of the software because it was not working properly on the Wynton cluster. This software is installed in various places under $HOME.

The script results and the log files are organized in a similar fashion, explained below.

There is one script of interest for running jobs: submit-all-jobs.bash. It takes a source SMILES file and an output destination, then submits a number of jobs that build 3D ligand data and save the results, in an organized fashion, to the output destination.

Each job submitted by the script works on a group of 100 substances. A group of 10,000 substances, or 100 jobs, is called a "batch". Each group of 100 SMILES read in by the script is assigned a batch no. based on its position in the source file, ex:

smiles | ZINC ID | line no. | batch no.
=======================================
CCAA   | ZINC000 | 0        | 0
...
CCZZ   | ZINCaaa | 10,000   | 1
...
CCXX   | ZINCbbb | 20,000   | 2
...
CCYY   | ZINCccc | 30,000   | 3

Basically, BATCH_NO = LINE_NO / 10000 (integer division, with line numbers starting at 0).

Each job saves its results tarball to:
/wynton/scratch/jji/$SRC_FILENAME/$BATCH_ID/$END_ID.tar.gz

Each job saves its log files (stdout and stderr) to:
/wynton/home/shoichetlab/jji/zinc-3d-build-3/logs/$SRC_FILENAME/$BATCH_ID/$END_ID.*

These directories can be reconfigured by setting the environment variables OUTPUT_DEST and LOG_BASE_DIR, respectively, before running submit-all-jobs.bash.

$SRC_FILENAME is the filename of the source file this group of jobs was run from
$BATCH_ID is the batch no. of the SMILES
$END_ID is the line no. of the last substance in the job
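
The batch arithmetic and path layout described above can be sketched in bash roughly as follows. This is only an illustration, not code from submit-all-jobs.bash: the function names (batch_no, result_path, log_prefix), the sample filename example.smi and the sample END_ID are hypothetical. It also assumes that OUTPUT_DEST and LOG_BASE_DIR default to the directories shown in the example paths and do not include $SRC_FILENAME.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; these helpers are hypothetical and are
# not taken from submit-all-jobs.bash.

BATCH_SIZE=10000   # substances per batch, as described above

# Assumed defaults, inferred from the example paths in the text.
OUTPUT_DEST="${OUTPUT_DEST:-/wynton/scratch/jji}"
LOG_BASE_DIR="${LOG_BASE_DIR:-/wynton/home/shoichetlab/jji/zinc-3d-build-3/logs}"

# Batch number for a 0-based line number.
# Bash arithmetic is integer-only, so the division truncates.
batch_no() {
    echo $(( $1 / BATCH_SIZE ))
}

# Results tarball path. Args: source filename, batch id, end id.
result_path() {
    echo "${OUTPUT_DEST}/$1/$2/$3.tar.gz"
}

# Common prefix of a job's log files. Args as for result_path.
# The actual suffixes after the dot are not specified in the text.
log_prefix() {
    echo "${LOG_BASE_DIR}/$1/$2/$3"
}

batch_no 0        # prints 0
batch_no 9999     # prints 0
batch_no 10000    # prints 1
batch_no 30000    # prints 3

# Hypothetical job: source file example.smi, batch 1, END_ID 10099.
result_path example.smi 1 10099
log_prefix example.smi 1 10099
```

To find the output of a given substance under these assumptions, compute its batch no. from its line number, then look in the matching $BATCH_ID directory for the tarball whose $END_ID is the first one at or after that line.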