DOCK memory and CPU benchmark

From DISI
Revision as of 19:38, 25 July 2022 by Btingle (talk | contribs)

Excerpt from email between Ben Tingle and Jiankun Lyu:

Ben

JK- I have some more info on how threads/cpus work. It seems that utilizing threads does give more power, but there are diminishing returns. Here are my findings:

I experimented on a machine with 40 cores and 80 threads. Running 80 jobs in parallel gives about 50% (+/-15) more throughput than running 40 jobs in parallel. You can think of it as 1 physical core with both of its threads busy = 1.5 "actual" cores, or 1 thread = 0.75 of an "actual" core.

So a good rule of thumb would be to multiply the number of cores by 1.5 to get the actual parallel potential.
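As a rough illustration of this rule of thumb, here is a minimal Python sketch. The function names and the `smt_gain` and `threads_per_core` parameters are made up for this example; the 1.5 factor is the measured value quoted above (with its +/-15 uncertainty) and may differ on other hardware.

```python
def effective_cores(physical_cores, smt_gain=1.5):
    """Estimate parallel capacity in "actual core" equivalents.

    smt_gain is the measured speedup from running one job per hardware
    thread instead of one job per physical core (about 1.5 here).
    """
    return physical_cores * smt_gain


def per_thread_share(threads_per_core=2, smt_gain=1.5):
    """Fraction of an "actual" core that each hardware thread delivers."""
    return smt_gain / threads_per_core
```

For the 40-core machine above, `effective_cores(40)` gives 60.0 and `per_thread_share()` gives 0.75, matching the numbers in the email.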

Now for memory:
Across all docking runs, DOCK never exceeded 311 MB, with an average memory consumption of 219.5 MB. I would say allowing 512 MB per DOCK thread would be a safe value, with some additional memory reserved for the operating system etc.
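The memory guideline can be turned into a small sizing calculation. This is only a sketch: the 512 MB per-job allowance comes from the text above, while the 4096 MB operating-system reserve and the function names are assumptions chosen for illustration.

```python
def memory_budget_mb(n_jobs, per_job_mb=512, os_reserve_mb=4096):
    """Total memory (MB) to provision for n_jobs concurrent DOCK jobs.

    per_job_mb is the allowance suggested above; os_reserve_mb is an
    assumed placeholder for the operating system and other processes.
    """
    return n_jobs * per_job_mb + os_reserve_mb


def max_jobs(total_mb, per_job_mb=512, os_reserve_mb=4096):
    """Largest number of concurrent DOCK jobs that fits in total_mb."""
    return max(0, (total_mb - os_reserve_mb) // per_job_mb)
```

For example, running 80 jobs at once would need 80 x 512 = 40960 MB for DOCK itself, plus whatever reserve is chosen.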

JK

What should I change in my subdock.csh to use this feature?

Ben

This isn't a new feature, I'm just testing how DOCK does in different parallel contexts; in this example I am running 40 dock jobs at once and comparing to running 80 dock jobs at once. This is to get a more accurate estimate of how much compute power is actually available for docking given N cores.

I did add some new features to rundock/subdock to test this though, including the following:
1. You can now target regular db2.gz files instead of db2.tgz packages
2. subdock can use the GNU parallel command instead of slurm/sge
3. rundock now feeds files to DOCK through standard input, which is more efficient I/O-wise
4. Performance statistics are collected and written to the output directory
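To illustrate the general idea behind item 3 (streaming input files to a program over standard input rather than unpacking them to disk first), here is a hedged Python sketch. It is not the rundock implementation: `stream_to_stdin` is a hypothetical helper, the decompress-then-pipe behavior is an assumption, and `command` stands in for whatever program should read the data.

```python
import gzip
import shutil
import subprocess
import tempfile


def stream_to_stdin(paths, command):
    """Decompress each .gz file in turn and pipe its bytes to command's stdin.

    Returns (return code, captured stdout). Output goes to a temporary
    file so the child cannot block on a full stdout pipe while we write.
    """
    with tempfile.TemporaryFile() as out:
        proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=out)
        for path in paths:
            with gzip.open(path, "rb") as src:
                # Copy in chunks; no decompressed file is written to disk.
                shutil.copyfileobj(src, proc.stdin)
        proc.stdin.close()  # signal end of input to the child
        returncode = proc.wait()
        out.seek(0)
        return returncode, out.read()
```

The point of the design is that the decompressed data only ever exists in the pipe, so no intermediate files are created or cleaned up.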

I will upload these changes to github soon.

To be clear: when you submit to slurm/sge currently, you are already getting the maximum throughput possible.

JK

I see, so we are already using this parallel power.
It's just that we didn't know this before;
we didn't know the ratio before.