DUMM1: Difference between revisions

--[http://shoichetlab.compbio.ucsf.edu/~kolb Kolb] 16:08, 9 June 2008 (PDT)
 

Latest revision as of 20:11, 8 October 2012

Attendees & Time

Sarah, Alessandro, Francesco, Hao, Michael, Pascal, Peter

Date: June 04, 2008

Next DUM on Wednesday, July 2, 2008

Hardware updates

Pascal: 8×44 "Fight for Mike" machines will arrive shortly. Problem: how to coordinate this many machines writing to disk without affecting other users ⇒ what gets written to disk during a DOCK run?
Michael: three big files: test.eel1, test.1 and OUTDOCK. test.1 should be optional. test.eel1 and OUTDOCK are gzipped in the most recent version of DOCK (part of dock67). Problem: DOCK writes test.* at the end of the run ⇒ maybe streamline the process?
Pascal: solution is probably to write to scratch and then cp files over at the end.

John: Michael's gzipped output helps. Also set restart_interval to a large value (100000) to prevent heavy disk writes. As of now, the new machines seem to have been absorbed with no performance problems.
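
Pascal's "write to scratch, then copy over" idea could look roughly like the sketch below. It is only an illustration in Python: the function name run_in_scratch, the job callable and the scratch_root parameter are hypothetical and not part of DOCK or of any existing lab script.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_scratch(job, dest_dir, scratch_root=None):
    """Run job(workdir) in a private scratch directory, then copy its files to dest_dir.

    job is any callable that writes its output files into the directory it is given
    (hypothetical stand-in for a DOCK run). scratch_root is the local scratch area;
    None falls back to the system default temporary directory.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    with tempfile.TemporaryDirectory(dir=scratch_root) as tmp:
        workdir = Path(tmp)
        job(workdir)  # all heavy writes happen on scratch, not on the shared disk
        # One copy per output file at the very end of the run.
        for path in sorted(workdir.iterdir()):
            if path.is_file():
                shutil.copy2(path, dest / path.name)
                copied.append(path.name)
    return copied
```

The shared disk then sees a single copy per output file instead of many small writes during the run.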

Software updates

dock67 | solvmap

Michael: dock67 has been released. It contains a new version of solvmap, which has a bugfix for the "blank lines bug" that caused the solvation penalties to be wrong. The output of the new version can be recognized by the fact that the last three numbers on the first line are floats.
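
A quick way to apply the "last three numbers are floats" check might be the following sketch. It assumes the first line is whitespace-separated and that the old, buggy version wrote those three values without a decimal point or exponent; neither assumption is confirmed by the minutes, and the function name is hypothetical.

```python
def looks_like_fixed_solvmap(first_line):
    """Return True if the last three fields of the line are written as floats.

    Assumption: fields are whitespace-separated, and "written as a float" means
    the field parses as a number and contains a decimal point or an exponent.
    """
    tokens = first_line.split()
    if len(tokens) < 3:
        return False
    for tok in tokens[-3:]:
        try:
            float(tok)
        except ValueError:
            return False  # not a number at all
        if "." not in tok and "e" not in tok.lower():
            return False  # a plain integer, so not written as a float
    return True
```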

runAMSOL3.csh

Michael: use runAMSOL3.csh WAIT to use AMSOL when preparing database files, so you get the correct cavitation term.

file2file.py

Michael: file2file.py has been improved: uses SYBYL atom types; charged molecules are handled correctly.

dbgen.csh

Michael: more tools for database generation are available: dbgen.csh uses the most recent version of the procedure without using ZINC, ensuring the molecules are all generated from scratch. Command line: dbgen.csh my.smi [optional protonation type]

decoys.py

Michael: decoys.py builds a DUD-style set of decoys. Help is accessible with decoys.py -h. The executable can be found at ~mysinger/code/dud/trunk/decoys.py.

database generation

Michael: to generate large numbers of database files, put the jobs on the cluster. Use dbstart.csh ./dbname.smi on sgehead to start the jobs and dbend.csh to collect the results when they are done. If dbname.smi contains more than 1000 molecules, split it into smaller parts first.
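
Splitting a large .smi file into parts of at most 1000 molecules could be done along these lines. This is a minimal sketch that assumes one molecule per non-empty line; the helper names and the dbname_001.smi naming scheme are hypothetical and not something dbstart.csh is known to require.

```python
from pathlib import Path


def split_smi(lines, max_per_chunk=1000):
    """Group the non-empty lines into consecutive chunks of at most max_per_chunk."""
    records = [ln.rstrip("\n") for ln in lines if ln.strip()]
    return [records[i:i + max_per_chunk]
            for i in range(0, len(records), max_per_chunk)]


def write_chunks(smi_path, out_dir, max_per_chunk=1000):
    """Write each chunk of smi_path to out_dir as <stem>_001.smi, <stem>_002.smi, ..."""
    smi_path = Path(smi_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(smi_path) as handle:
        chunks = split_smi(handle, max_per_chunk)
    written = []
    for index, chunk in enumerate(chunks, start=1):
        target = out_dir / f"{smi_path.stem}_{index:03d}.smi"
        target.write_text("\n".join(chunk) + "\n")
        written.append(target)
    return written
```

Each resulting file could then be passed to dbstart.csh separately.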

DOCK updates

dock67

Michael: the most recent version of DOCK (part of dock67) gzips test.eel1 and OUTDOCK.
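
Scripts that read OUTDOCK will need to cope with both the old plain-text and the new gzipped files. A small sketch of a helper that handles either case, assuming the gzipped file carries a .gz suffix (the minutes do not state the exact output file name, and the helper name is hypothetical):

```python
import gzip
from pathlib import Path


def open_dock_output(path):
    """Open a DOCK output file for reading as text, gzipped or not.

    Assumption: gzipped files are recognizable by a ".gz" suffix.
    """
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")  # "rt" decompresses and decodes to text
    return open(path, "r")
```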

Requests/Improvements

Hao: it might be good to include an option in DOCK to be able to write out more than one pose per molecule.
Michael: this is part of my orals proposal.

Varia

ZINC history

Sarah: how to keep old ZINC versions and ensure that molecules are the same, especially in order to be able to reproduce calculations?
John answers: ZINC 5, 6 and 7 are all still online and available.
Pascal: is there a way to find out which tools were used to build a certain molecule?
Peter: how to rebuild molecules?
Francesco: there is a script called rebuildit.pl which will schedule the molecules in question for rebuilding in ZINC. Command line: rebuildit.pl < list_of_zinc_ids.
Michael: the molecule files are located in /raid2/db/XX/YY/PROTXXYY.*. Note that this is the protomer id, not the ZINC id.

Knowledge management

Peter: how do we make software changes visible to the internal users?
Michael/Peter/Sarah: keep minutes and put them up on the wiki. Ideally include subpages and changelogs for the software items. Make sure that everyone has access to the latest versions of software/scripts/testcases for evaluation purposes.
Pascal/Michael: use CVS more consistently, so that scripts no longer need consecutive version numbers in their names.
