Mol2db: Difference between revisions

Latest revision as of 23:21, 4 January 2019

mol2db is a program developed to combine multimol2 files into the Flexibase Format read by DOCK 3.6. A standalone tool to decode the .db files produced is db2multipdb.py.

This page documents how mol2db works and any changes made to mol2db; it is not necessarily a guide to preparing input files or running them through mol2db.

NOTE: input mol2 files are no longer required to have 6-line headers, so asl.pl/etc. are no longer necessary. Any number of header lines is fine.

ui.c reads the inhier parameter file and sets variables.

The proximal function in other.c determines which coordinates are essentially the same. It uses a distance tolerance expressed in thousandths of an angstrom. The current value is 7, so coordinate differences of 0.007 Å are treated as the same position. This is needed because the coordinates coming out of OMEGA sometimes vary by this much even for atoms in the same position.
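A minimal sketch of such a tolerance test is shown here. It is not the code from other.c: the Coord type, the function name coords_proximal, the integer storage in thousandths of an angstrom, and the use of Euclidean distance are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type: coordinates held as integers in thousandths of an angstrom. */
typedef struct { long x, y, z; } Coord;

/* Tolerance in thousandths of an angstrom; 7 is the value quoted in the text. */
#define DIST_TOL 7

/* Returns true when a and b lie within tol of each other.
   Assumption: Euclidean distance, compared in squared form to avoid sqrt. */
static bool coords_proximal(Coord a, Coord b, long tol)
{
    assert(tol >= 0);
    long dx = a.x - b.x;
    long dy = a.y - b.y;
    long dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz <= tol * tol;
}
```

With a tolerance of 7, two positions that differ by 0.004 Å, 0.004 Å and 0.003 Å along the three axes count as one position, while positions that differ by 0.005 Å along two axes do not.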

hiergen.c uses the multiple conformations to construct the hierarchy data structure by the following algorithm:

  1. Find the largest connected set of atoms that have only one unique position, and call one of those atoms the beginning of the rigid component.
  2. Expand through bonds to any atoms that have the same or fewer unique positions. Bonded atoms with more unique positions are recursively added as branches.
  3. For each branch, repeat the examination of bonded atoms and their unique positions.
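Step 1 can be sketched as a search for the largest connected group of single-position atoms. This is an illustration only, not the hiergen.c implementation: the names find_rigid_start and rigid_component_size, the MAX_ATOMS limit, the unique[] array (number of distinct positions per atom, already computed with the distance tolerance) and the bonded adjacency matrix are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ATOMS 64   /* hypothetical upper bound for this sketch */

/* Depth-first walk over bonded atoms that have exactly one unique position.
   Marks visited atoms in seen[] and returns the size of the connected set. */
static int rigid_component_size(int atom, int n_atoms, const int unique[],
                                bool bonded[][MAX_ATOMS], bool seen[])
{
    int size = 1;
    seen[atom] = true;
    for (int j = 0; j < n_atoms; j++) {
        if (bonded[atom][j] && !seen[j] && unique[j] == 1)
            size += rigid_component_size(j, n_atoms, unique, bonded, seen);
    }
    return size;
}

/* Returns the index of one atom in the largest connected set of
   single-position atoms, or -1 if no atom has a single position. */
static int find_rigid_start(int n_atoms, const int unique[],
                            bool bonded[][MAX_ATOMS])
{
    bool seen[MAX_ATOMS] = { false };
    int best_atom = -1;
    int best_size = 0;

    assert(n_atoms <= MAX_ATOMS);
    for (int i = 0; i < n_atoms; i++) {
        if (unique[i] != 1 || seen[i])
            continue;
        int size = rigid_component_size(i, n_atoms, unique, bonded, seen);
        if (size > best_size) {
            best_size = size;
            best_atom = i;
        }
    }
    return best_atom;
}
```

Steps 2 and 3 would then walk outward from the returned atom, opening a new branch whenever a bonded atom has more unique positions than the current one.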

This algorithm would appear to have flaws if atom positions downstream overlap, especially since a distance tolerance is used to decide whether positions overlap.

confhier.c puts the conformations into the hierarchy. The potential problem here was that only one atom in each conformation was used to determine whether the conformation differed from a previous one. When some atoms in a conformation were the same while others were different, the result depended on the atom ordering in the mol2 files and could be incorrect. This has been fixed: each conformation, when added, is now checked against all atoms in the level, not just one.
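The difference between the two checks can be sketched as follows. This is not the confhier.c code: the Coord type, the function names, and the Euclidean tolerance test are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type: coordinates in thousandths of an angstrom. */
typedef struct { long x, y, z; } Coord;

/* Assumed tolerance test: squared Euclidean distance against tol squared. */
static bool same_position(Coord a, Coord b, long tol)
{
    long dx = a.x - b.x;
    long dy = a.y - b.y;
    long dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz <= tol * tol;
}

/* Two conformations of a level are the same only if every atom in the
   level matches within tol. Passing n_atoms = 1 reproduces the older
   single-atom behaviour described in the text. */
static bool same_level_conformation(const Coord *a, const Coord *b,
                                    int n_atoms, long tol)
{
    assert(n_atoms > 0);
    for (int i = 0; i < n_atoms; i++) {
        if (!same_position(a[i], b[i], tol))
            return false;
    }
    return true;
}
```

If two conformations share the position of their first atom but differ in the second, a single-atom check on the first atom reports them as identical, while the all-atom check correctly reports them as different.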