Docking in AWS With DOCK 3.8
Environment Variable Input
description: The base directory where intermediate input/output will be stored. /dev/shm is recommended; the default is /tmp.
S3_INPUT_LOCATION
description: Should be either a directory or a file hosted on S3.
If S3_INPUT_LOCATION is a directory:
Input will be taken from the AWS_BATCH_JOB_ARRAY_INDEX'th file listed under the directory, with the listing sorted by file name.
AWS_BATCH_JOB_ARRAY_INDEX=3
S3_INPUT_LOCATION=s3://mytestbucket/input
>>> aws s3 ls $S3_INPUT_LOCATION/ | sort -k4
>>> 2021-03-12 17:14:26 249815112 H25.db2.tgz
>>> 2021-03-12 17:14:26 255685176 H26.db2.tgz
>>> 2021-03-12 11:40:43 307920266 H27.db2.tgz
In this example, s3://mytestbucket/input/H27.db2.tgz would be evaluated.
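The directory case can be sketched in shell as follows. This is a minimal illustration, not the image's actual entrypoint: the helper name nth_listed_object is hypothetical, the listing is canned text standing in for real `aws s3 ls` output, and the 1-based counting is an assumption read off the example above. It also assumes object names contain no spaces.

```shell
# Hypothetical helper: read `aws s3 ls` style lines on stdin, sort them by
# object name (4th column), and print the name on the Nth line (1-based).
nth_listed_object() {
    index="$1"
    sort -k4 | awk -v n="$index" 'NR == n { print $4 }'
}

# Canned listing standing in for: aws s3 ls "$S3_INPUT_LOCATION/"
listing='2021-03-12 11:40:43 307920266 H27.db2.tgz
2021-03-12 17:14:26 249815112 H25.db2.tgz
2021-03-12 17:14:26 255685176 H26.db2.tgz'

S3_INPUT_LOCATION=s3://mytestbucket/input
AWS_BATCH_JOB_ARRAY_INDEX=3

name=$(printf '%s\n' "$listing" | nth_listed_object "$AWS_BATCH_JOB_ARRAY_INDEX")
echo "$S3_INPUT_LOCATION/$name"
```

With the values above, the script prints s3://mytestbucket/input/H27.db2.tgz, matching the example.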
If S3_INPUT_LOCATION is a file:
Input will be downloaded from the AWS_BATCH_JOB_ARRAY_INDEX'th line in the file
AWS_BATCH_JOB_ARRAY_INDEX=2
S3_INPUT_LOCATION=s3://mytestbucket/input/inputlist_batch1.txt
### inputlist_batch1.txt
s3://mytestbucket/input/H26.db2.tgz
s3://mytestbucket/input/H25.db2.tgz
s3://mytestbucket/input/H27.db2.tgz
### EOF
In this example, the job would evaluate s3://mytestbucket/input/H25.db2.tgz
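The file case amounts to picking one line out of the list without sorting it. The sketch below assumes 1-based line numbering, as the example implies. The helper name nth_line is hypothetical, and the list is canned text rather than a file downloaded from S3.

```shell
# Hypothetical helper: print the Nth line (1-based) of the list on stdin.
nth_line() {
    sed -n "${1}p"
}

# Canned contents standing in for inputlist_batch1.txt
input_list='s3://mytestbucket/input/H26.db2.tgz
s3://mytestbucket/input/H25.db2.tgz
s3://mytestbucket/input/H27.db2.tgz'

AWS_BATCH_JOB_ARRAY_INDEX=2
selected=$(printf '%s\n' "$input_list" | nth_line "$AWS_BATCH_JOB_ARRAY_INDEX")
echo "$selected"
```

Because the list is used in file order, the second line (H25) is selected even though H26 appears first.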
S3_OUTPUT_LOCATION
description: A directory hosted on S3 where output will be stored.
Each job will create a subdirectory within this directory according to its array index for OUTDOCK & test.mol2.gz files.
For example, a job with S3_OUTPUT_LOCATION=s3://mytestbucket/output and AWS_BATCH_JOB_ARRAY_INDEX=5 may produce the following files after the first run:
s3://mytestbucket/output/5/OUTDOCK.0
s3://mytestbucket/output/5/test.mol2.gz.0
s3://mytestbucket/output/5/restart ### potentially, if job is interrupted
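The output layout can be reproduced with a few lines of shell. This is only a sketch: the helper name job_output_keys is hypothetical, and treating the numeric suffix (.0) as a run counter that starts at 0 is an assumption based on the phrase "after the first run".

```shell
# Hypothetical helper: print the S3 keys one run of one array job might write.
job_output_keys() {
    output_location="${1%/}"   # drop a trailing slash if present
    array_index="$2"
    run_number="$3"            # assumed meaning of the numeric suffix
    prefix="$output_location/$array_index"
    echo "$prefix/OUTDOCK.$run_number"
    echo "$prefix/test.mol2.gz.$run_number"
}

S3_OUTPUT_LOCATION=s3://mytestbucket/output
AWS_BATCH_JOB_ARRAY_INDEX=5
job_output_keys "$S3_OUTPUT_LOCATION" "$AWS_BATCH_JOB_ARRAY_INDEX" 0
```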
description: An S3 directory containing dockfiles, e.g. receptor grids, parameters, spheres, etc. INDOCK must be among these files. The directory is recursively copied to local storage for each job, so make sure it contains only the necessities.
description: The name of the current job. Automatically supplied when submitting via AWS Batch.
AWS_BATCH_JOB_ARRAY_INDEX
description: The array index of the current job. Automatically supplied when submitting via AWS Batch.
Optional: AWS credentials
If you are testing outside of AWS Batch, you may want to provide AWS account credentials for accessing any necessary S3 buckets. This also works within AWS Batch, but passing credentials around like that can be dangerous. You can do this by supplying the credentials as environment variables, for example the standard AWS CLI variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
If you are submitting within AWS Batch, consider adding S3 permissions to your ecsInstanceRole (or whichever role you use for AWS Batch instances) in lieu of an explicit access key.
Restartability & Spot Instances
This image has been configured to take advantage of AWS Batch with Spot Instances. When a job is interrupted by a spot termination notice, it exits gracefully and reports any partial results, along with a restart marker, in the output. You can also interrupt a job manually by sending signal 10 (SIGUSR1) to the dock process running within the container.
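The interruption behaviour described above can be imitated with a small shell sketch. It is not the image's real entrypoint: worker.sh is a hypothetical stand-in for the dock process, the loop over five "ligands" is invented for illustration, and the script signals itself to simulate a termination notice.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for the docking process.
cat > "$workdir/worker.sh" <<'EOF'
interrupted=0
trap 'interrupted=1' USR1            # SIGUSR1 is signal 10 on Linux
for ligand in 1 2 3 4 5; do
    if [ "$interrupted" -eq 1 ]; then
        touch restart                # marker telling the next attempt to resume
        exit 1                       # non-zero exit so the scheduler can retry
    fi
    echo "processed ligand $ligand" >> OUTDOCK.0
    if [ "$ligand" -eq 2 ]; then kill -USR1 $$; fi   # simulate an interruption
done
exit 0
EOF

(cd "$workdir" && sh worker.sh)
status=$?
echo "exit code: $status"
ls "$workdir"
```

The worker finishes the item it is on, records partial results, leaves a restart marker, and exits with a non-zero code. These are the same three outcomes the text attributes to an interrupted job.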
When a job completes successfully, the exit code is 0.
When a failure is detected, or if an interruption notice has been caught, the exit code is 1.
In our AWS Batch configuration, we tell jobs to re-attempt when they exit with code 1, up to a limit of 3 to 5 attempts (3 attempts is generally the highest number we observe).
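One way to express such a retry rule is an AWS Batch retry strategy with an evaluateOnExit condition. The sketch below writes an example strategy to a temporary file. The attempt limit of 5 and the catch-all EXIT rule are assumptions, not the authors' actual configuration, and the job name, queue, and job definition in the commented command are placeholders.

```shell
retry_file=$(mktemp)

# Retry only when the container exits with code 1; stop on anything else.
cat > "$retry_file" <<'EOF'
{
  "attempts": 5,
  "evaluateOnExit": [
    { "onExitCode": "1", "action": "RETRY" },
    { "onReason": "*", "action": "EXIT" }
  ]
}
EOF
cat "$retry_file"

# Illustrative submission (all names and the array size are placeholders):
# aws batch submit-job --job-name dock-test --job-queue my-queue \
#     --job-definition my-dock-job --array-properties size=100 \
#     --retry-strategy "file://$retry_file"
```

Since exit code 1 covers both detected failures and caught interruptions, a job that fails for a non-transient reason will use up all of its attempts before being marked as failed.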