Looping solution
Here is our approach to running jobs at scale on coreHPC.
- 1. On cluster 7 there is a directory of job control language (JCL) files.
- 2. A cron job runs on coreHPC.
- 3. Once an hour, it iterates over all jobs in the system and moves each one forward.
- 4. The possible job statuses are listed below.
- 5. The supported job control language is described below.
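The hourly pass described above can be sketched roughly as follows. This is only an illustration, not the actual cron script: the in-memory `jobs` mapping, the `handlers` table, and the function name `advance_all` are all hypothetical, and a real pass would read and write job state on disk and run the copy, submit, and scp steps.

```python
from typing import Callable, Dict

# Hypothetical handler type: takes a job id and its current status code
# and returns the status code the job should have after this pass.
Handler = Callable[[str, int], int]


def advance_all(jobs: Dict[str, int], handlers: Dict[int, Handler]) -> Dict[str, int]:
    """Run one pass of the loop: try to move every job forward by one step."""
    updated = {}
    for job_id, status in jobs.items():
        handler = handlers.get(status)
        # Jobs whose status has no handler (for example finished or failed
        # jobs) are left untouched until someone intervenes.
        updated[job_id] = handler(job_id, status) if handler else status
    return updated
```

A cron entry with the schedule `0 * * * *` would trigger such a pass once an hour. Keeping each pass to a single step per job means an interrupted run can simply be picked up by the next one.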
job status
- 0 new job request
- 1 job data has started copying to fac
- 2 job data copy to fac has completed successfully
- 12 job data copy to fac failed (and requires attention)
- 3 job has been submitted on coreHPC
- 4 job is running on coreHPC
- 5 job has completed on coreHPC
- 15 job has failed on coreHPC (and requires attention)
- 6 scp of job results to C7 has started
- 7 scp of job results to C7 has completed normally
- 17 scp of job results to C7 has failed (and requires attention)
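As a sketch, the status codes above could be captured in an enumeration. The numeric values come from the list, but the member names and the `needs_attention` helper are made up for illustration. Note that each failure code in the list equals the corresponding success code plus 10 (12, 15, 17).

```python
from enum import IntEnum


class JobStatus(IntEnum):
    """Status codes from the list above; member names are illustrative."""
    NEW = 0
    COPY_TO_FAC_STARTED = 1
    COPY_TO_FAC_DONE = 2
    COPY_TO_FAC_FAILED = 12
    SUBMITTED = 3
    RUNNING = 4
    COMPLETED = 5
    FAILED = 15
    RESULTS_SCP_STARTED = 6
    RESULTS_SCP_DONE = 7
    RESULTS_SCP_FAILED = 17


# The three statuses marked "requires attention" in the list.
ATTENTION = {
    JobStatus.COPY_TO_FAC_FAILED,
    JobStatus.FAILED,
    JobStatus.RESULTS_SCP_FAILED,
}


def needs_attention(status: JobStatus) -> bool:
    """Return True for the failure statuses that need manual intervention."""
    return status in ATTENTION
```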
supported job control language (JCL)
- Each job has a file. The command syntax is as follows:
- type: docking, chemstep, other
- dockfiles: /nfs/exe/work/jji/projects/target1/dockfiles (includes INDOCK)
- tranches: /nfs/exe/work/jji/projects/target1/tranches (one file per job, circa 1 TB each)
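A job file in this syntax might be read with something like the sketch below. It assumes each field sits on its own line as `key: value`, and that the parenthetical notes above are commentary rather than part of the file; neither is confirmed here. The function name `parse_jcl` and the validation rule are hypothetical.

```python
ALLOWED_TYPES = {"docking", "chemstep", "other"}


def parse_jcl(text: str) -> dict:
    """Parse 'key: value' lines into a dict and check the job type."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got: {line!r}")
        fields[key.strip()] = value.strip()
    if fields.get("type") not in ALLOWED_TYPES:
        raise ValueError(f"unsupported job type: {fields.get('type')!r}")
    return fields
```

For example, a file containing the three lines `type: docking`, `dockfiles: /nfs/exe/work/jji/projects/target1/dockfiles`, and `tranches: /nfs/exe/work/jji/projects/target1/tranches` would yield a dict with those three keys.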