DISI - User contributions [en] - 2024-03-29T10:17:37Z - MediaWiki 1.39.1

SGE notes - 2016-05-16T22:02:55Z<p>Teague Sterling: </p>
<hr />
<div>ALL ABOUT SGE (SUN GRID ENGINE)<br />
<br />
Note: these are working notes and still need editing.<br />
'''domain''' is a placeholder; replace it with your actual domain throughout.<br />
<br />
<pre><br />
To add an exec node:<br />
yum -y install gridengine gridengine-execd<br />
export SGE_ROOT=/usr/share/gridengine<br />
export SGE_CELL=bkslab<br />
cp -v /nfs/init/gridengine/install.conf /tmp/gridengine-install.conf<br />
vim /tmp/gridengine-install.conf -> CHANGE EXEC_HOST_LIST=" " TO EXEC_HOST_LIST="$HOSTNAME"<br />
cd /usr/share/gridengine/<br />
./inst_sge -x -s -auto /tmp/gridengine-install.conf > /tmp/gridengine.log<br />
cat /tmp/gridengine.log | tee -a /root/gridengine-install.log<br />
if [ -e ${SGE_CELL} ]; then mv -v ${SGE_CELL} ${SGE_CELL}.local; fi<br />
ln -vs /nfs/gridengine/${SGE_CELL} /usr/share/gridengine/${SGE_CELL}<br />
rm -vf /etc/sysconfig/gridengine<br />
echo "SGE_ROOT=${SGE_ROOT}" >> /etc/sysconfig/gridengine<br />
echo "SGE_CELL=${SGE_CELL}" >> /etc/sysconfig/gridengine<br />
mkdir -pv /var/spool/gridengine/`hostname -s`<br />
chown -Rv sgeadmin:sgeadmin /var/spool/gridengine<br />
chkconfig --levels=345 sge_execd on<br />
<br />
Go to sgemaster and do this:<br />
qconf -ae --> CHANGE THE HOSTNAME FROM "template" to hostname_of_new_exec<br />
qconf -as hostname<br />
<br />
HOW TO EDIT THE NUMBER OF SLOTS FOR AN EXEC_HOST:<br />
qconf -mattr exechost complex_values slots=32 raiders.c.domain<br />
"complex_values" of "exechost" is empty - Adding new element(s).<br />
<br />
root@pan.slot-27.rack-1.pharmacy.cluster.domain modified "raiders.c.domain" in exechost list<br />
<br />
HOW TO ADD A HOSTGROUP:<br />
qconf -ahgrp @custom <br />
<br />
ADD THE EXECHOST TO A HOSTGROUP:<br />
qconf -mhgrp @custom<br />
<br />
service sgemaster restart<br />
<br />
Then back on the exec_host:<br />
<br />
service sge_execd start<br />
<br />
<br />
To suspend jobs you do:<br />
<br />
qmod -sj job_number<br />
<br />
To delete nodes I did the following:<br />
<br />
qconf -shgrpl -> To see a list of host groups<br />
qconf -shgrp @HOST_GROUP_NAME -> For each host group, to see if the node you want to delete is listed<br />
If it is listed then:<br />
qconf -mhgrp @HOST_GROUP_NAME -> Modify this file (delete the line with the node you want to delete).<br />
Once you've deleted the node from all the hostgroups:<br />
qconf -de node_you_want_to_delete >/dev/null<br />
qmod -de node_you_want_to_delete<br />
<br />
<br />
A more formal node-removal pipeline (as bash):<br />
<br />
for HG in $( qconf -shgrpl ) ; do<br />
qconf -dattr hostgroup hostlist NODE_NAME_HERE $HG<br />
done<br />
qconf -purge queue slots "*.q@NODE_NAME_HERE" (or all.q)<br />
qconf -ds NODE_NAME_HERE<br />
qconf -dconf NODE_NAME_HERE<br />
qconf -de NODE_NAME_HERE<br />
<br />
To alter the priority on all the jobs for a user:<br />
qstat -u user | cut -d ' ' -f2 >> some_file<br />
Edit some_file and delete the first couple of lines (the header lines)<br />
for OUTPUT in $(cat some_file); do qalter -p 1022 $OUTPUT; done<br />
Priorities are -1024 to 1023<br />
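The header-stripping and loop above can be done in one pass with no temp file. A sketch (the qstat output here is an invented sample standing in for a real `qstat -u user`, and the qalter commands are only echoed, not run):

```shell
# Extract job IDs from qstat-style output, skipping the two header lines,
# then emit one qalter command per job. Drop the leading "echo" to run for real.
qstat_output='job-ID  prior   name       user   state submit/start at     queue  slots
-----------------------------------------------------------------------------------
 101234 0.50000 dock_job1  alice  r     01/20/2014 10:00:00 all.q      1
 101235 0.50000 dock_job2  alice  qw    01/20/2014 10:01:00            1'

jobs=$(printf '%s\n' "$qstat_output" | awk 'NR>2 {print $1}')
for JOB in $jobs; do
    echo qalter -p 1022 "$JOB"
done
```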
<br />
DEBUGGING SGE:<br />
<br />
qstat -explain a<br />
<br />
for HOSTGROUP in `qconf -shgrpl`; do for HOSTLIST in `qconf -shgrp $HOSTGROUP`; do echo $HOSTLIST; done; done | grep node-1.slot-27.rack-2.pharmacy.cluster.domain<br />
<br />
Look at the logs for both master and exec <br />
(raiders:/var/spool/gridengine/raiders/messages and pan:/var/spool/gridengine/bkslab/qmaster/messages)<br />
<br />
Make sure resolv.conf looks like this:<br />
nameserver 142.150.250.10<br />
nameserver 10.10.16.64<br />
search cluster.domain domain bkslab.org <br />
<br />
[root@pan ~]# for X in $`qconf -shgrpl`; do qconf -shgrp $X; done;<br />
Host group "$@24-core" does not exist<br />
(Note the stray $ before the backtick in the loop above; it prefixes the first hostgroup name with $, which is what produced the "$@24-core" error.)<br />
group_name @64-core<br />
hostlist node-26.rack-2.pharmacy.cluster.domain<br />
group_name @8-core<br />
hostlist node-2.slot-27.rack-1.pharmacy.cluster.domain \<br />
node-1.slot-27.rack-1.pharmacy.cluster.domain<br />
group_name @allhosts<br />
hostlist @physical @virtual<br />
group_name @physical<br />
hostlist node-26.rack-2.pharmacy.cluster.domain<br />
group_name @virtual<br />
hostlist node-2.slot-27.rack-1.pharmacy.cluster.domain \<br />
node-1.slot-27.rack-1.pharmacy.cluster.domain<br />
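The hostgroup dump above can be searched mechanically instead of by eye. A self-contained sketch (the here-doc below is sample data standing in for looping over `qconf -shgrp` on a live master):

```shell
# Print each hostgroup whose hostlist mentions a given node.
# The here-doc stands in for concatenated `qconf -shgrp $HG` output.
NODE=node-1.slot-27.rack-1.pharmacy.cluster.domain
groups=$(awk -v node="$NODE" '
    /^group_name/ { grp = $2 }   # remember the current hostgroup
    $0 ~ node     { print grp }  # node appears in this group
' <<'EOF'
group_name @8-core
hostlist node-2.slot-27.rack-1.pharmacy.cluster.domain \
 node-1.slot-27.rack-1.pharmacy.cluster.domain
group_name @virtual
hostlist node-2.slot-27.rack-1.pharmacy.cluster.domain \
 node-1.slot-27.rack-1.pharmacy.cluster.domain
group_name @physical
hostlist node-26.rack-2.pharmacy.cluster.domain
EOF
)
echo "$groups"
```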
<br />
1) In one screen I would type strace qstat -f and then in the other screen I would type ps -ax | grep qstat to get the pid, then ls -l /proc/pid/fd/<br />
I did this because every time I typed strace qstat -f it would get stuck saying this:<br />
poll([{fd=3, events=POLLIN|POLLPRI}], 1, 1000) = 0 (Timeout)<br />
gettimeofday({1390262563, 742705}, NULL) = 0<br />
gettimeofday({1390262563, 742741}, NULL) = 0<br />
gettimeofday({1390262563, 742771}, NULL) = 0<br />
gettimeofday({1390262563, 742801}, NULL) = 0<br />
gettimeofday({1390262563, 742828}, NULL) = 0<br />
gettimeofday({1390262563, 742855}, NULL) = 0<br />
gettimeofday({1390262563, 742881}, NULL) = 0<br />
gettimeofday({1390262563, 742909}, NULL) = 0<br />
<br />
and then eventually it would say this:<br />
poll([{fd=3, events=POLLIN|POLLPRI}], 1, 1000) = 1 ([{fd=3, revents=POLLIN}])<br />
gettimeofday({1390262563, 960292}, NULL) = 0<br />
gettimeofday({1390262563, 960321}, NULL) = 0<br />
gettimeofday({1390262563, 960349}, NULL) = 0<br />
read(3, "<gmsh><dl>99</dl></gms", 22) = 22<br />
read(3, "h", 1) = 1<br />
read(3, ">", 1) = 1<br />
read(3, "<mih version=\"0.1\"><mid>2</mid><"..., 99) = 99<br />
read(3, "<ccrm version=\"0.1\"></ccrm>", 27) = 27<br />
gettimeofday({1390262563, 960547}, NULL) = 0<br />
gettimeofday({1390262563, 960681}, NULL) = 0<br />
gettimeofday({1390262563, 960709}, NULL) = 0<br />
gettimeofday({1390262563, 960741}, NULL) = 0<br />
gettimeofday({1390262563, 960769}, NULL) = 0<br />
gettimeofday({1390262563, 960797}, NULL) = 0<br />
gettimeofday({1390262563, 960823}, NULL) = 0<br />
shutdown(3, 2 /* send and receive */) = 0<br />
close(3) = 0<br />
gettimeofday({1390262563, 961009}, NULL) = 0<br />
gettimeofday({1390262563, 961036}, NULL) = 0<br />
gettimeofday({1390262563, 961064}, NULL) = 0<br />
gettimeofday({1390262563, 961093}, NULL) = 0<br />
gettimeofday({1390262563, 961120}, NULL) = 0<br />
gettimeofday({1390262563, 961148}, NULL) = 0<br />
<br />
The weird thing about this is that when I typed ls -l /proc/pid/fd/ there was never a file descriptor "3"<br />
<br />
2) I tried to delete the nodes that we moved to SF by doing the following:<br />
qconf -dattr @physical "node-1.rack-3.pharmacy.cluster.domain node-10.rack-3.pharmacy.cluster.domain node-11.rack-3.pharmacy.cluster.domain node-12.rack-3.pharmacy.cluster.domain node-13.rack-3.pharmacy.cluster.domain node-14.rack-3.pharmacy.cluster.domain node-15.rack-3.pharmacy.cluster.domain node-2.rack-3.pharmacy.cluster.domain node-26.rack-3.pharmacy.cluster.domain node-27.rack-3.pharmacy.cluster.domain node-29.rack-3.pharmacy.cluster.domain node-3.rack-3.pharmacy.cluster.domain node-4.rack-3.pharmacy.cluster.domain node-5.rack-3.pharmacy.cluster.domain node-6.rack-3.pharmacy.cluster.domain node-7.rack-3.pharmacy.cluster.domain node-8.rack-3.pharmacy.cluster.domain node-9.rack-3.pharmacy.cluster.domain" node-1.rack-3.pharmacy.cluster.domain @physical > /dev/null<br />
<br />
I would get the error: Modification of object "@physical" not supported<br />
<br />
3) I tried to see the queues complex attributes by typing qconf -sc and saw this:<br />
<br />
#name shortcut type relop requestable consumable default urgency <br />
<br />
slots s INT <= YES YES 1 1000<br />
<br />
I am not quite sure what urgency = 1000 means.<br />
All other names had "0" under urgency.<br />
<br />
4) I tried qmod -cq '*' to clear the error state of all the queues. <br />
It would tell me this:<br />
<br />
Queue instance "all.q@node-1.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-1.slot-27.rack-1.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-1.slot-27.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-10.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-11.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-12.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-13.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-14.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-15.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-2.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-2.slot-27.rack-1.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-2.slot-27.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-26.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-26.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-27.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-29.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-3.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-3.slot-27.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-4.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-4.slot-27.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-5.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-5.slot-27.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-6.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-6.slot-27.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-7.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-7.slot-27.rack-2.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-8.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
Queue instance "all.q@node-9.rack-3.pharmacy.cluster.domain" is already in the specified state: no error<br />
<br />
<br />
5) I tried deleting a node like this instead:<br />
qconf -ds node-1.rack-3.pharmacy.cluster.domain<br />
But when I typed qconf -sel it was still there.<br />
<br />
6) I tried to see what the hostlist for @physical was by typing qconf -ahgrp @physical. It said: group_name @physical, hostlist NONE<br />
Then I typed qconf -shgrpl to see a list of all hostgroups and tried typing qconf -ahgrp on each. All of them said the hostlist was NONE<br />
(likely because -ahgrp opens an "add hostgroup" template rather than showing the existing group; qconf -shgrp is the show command), <br />
but when I tried to type qconf -ahgrp @allhosts I got this message:<br />
denied: "root" must be manager for this operation<br />
error: commlib error: got select error (Connection reset by peer)<br />
<br />
7) I looked at the messages in the file: /var/spool/gridengine/bkslab/qmaster/messages and it said this (over and over again):<br />
<br />
01/20/2014 19:41:35|listen|pan|E|commlib error: got read error (closing "pan.slot-27.rack-1.pharmacy.cluster.domain/qconf/2")<br />
01/20/2014 19:43:24| main|pan|W|local configuration pan.slot-27.rack-1.pharmacy.cluster.domain not defined - using global configuration<br />
01/20/2014 19:43:24| main|pan|W|can't resolve host name "node-3-3.rack-3.pharmacy.cluster.domain": undefined commlib error code<br />
01/20/2014 19:43:24| main|pan|W|can't resolve host name "node-3-4.rack-3.pharmacy.cluster.domain": undefined commlib error code<br />
01/20/2014 19:43:53| main|pan|I|read job database with 468604 entries in 29 seconds<br />
01/20/2014 19:43:55| main|pan|I|qmaster hard descriptor limit is set to 8192<br />
01/20/2014 19:43:55| main|pan|I|qmaster soft descriptor limit is set to 8192<br />
01/20/2014 19:43:55| main|pan|I|qmaster will use max. 8172 file descriptors for communication<br />
01/20/2014 19:43:55| main|pan|I|qmaster will accept max. 99 dynamic event clients<br />
01/20/2014 19:43:55| main|pan|I|starting up GE 6.2u5p3 (lx26-amd64)<br />
<br />
8) Periodically I would get this error: ERROR: failed receiving gdi request response for mid=3 (got no message).<br />
<br />
9) I also tried deleting the pid in the file: /var/spool/gridengine/bkslab/qmaster/qmaster.pid<br />
That didn't do anything. It eventually just replaced it with a different number. <br />
<br />
It's weird because it's not even the right pid. For example, the real pid was 8286 and the pid in the file was 8203:<br />
<br />
[root@pan qmaster]# service sgemaster start<br />
Starting sge_qmaster: [ OK ]<br />
[root@pan qmaster]# ps -ax |grep sge<br />
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ<br />
8286 ? Rl 0:03 /usr/bin/sge_qmaster<br />
8301 pts/0 S+ 0:00 grep sge<br />
[root@pan qmaster]# cat qmaster.pid <br />
8203<br />
<br />
10) When I typed tail /var/log/messages I saw this:<br />
<br />
Jan 20 14:25:05 pan puppet-agent[2021]: Could not request certificate: Connection refused - connect(2)<br />
Jan 20 14:27:05 pan puppet-agent[2021]: Could not request certificate: Connection refused - connect(2)<br />
Jan 20 14:29:05 pan puppet-agent[2021]: Could not request certificate: Connection refused - connect(2)<br />
Jan 20 14:31:05 pan puppet-agent[2021]: Could not request certificate: Connection refused - connect(2)<br />
Jan 20 14:33:06 pan puppet-agent[2021]: Could not request certificate: Connection refused - connect(2)<br />
Jan 20 14:35:06 pan puppet-agent[2021]: Could not request certificate: Connection refused - connect(2)<br />
Jan 20 14:36:29 pan kernel: Registering the id_resolver key type<br />
Jan 20 14:36:29 pan kernel: FS-Cache: Netfs 'nfs' registered for caching<br />
Jan 20 14:36:29 pan nfsidmap[2536]: nss_getpwnam: name 'root@rack-1.pharmacy.cluster.domain' does not map into domain 'domain'<br />
Jan 20 14:37:06 pan puppet-agent[2021]: Could not request certificate: Connection refused - connect(2)<br />
This was what happened when I restarted the machine.<br />
<br />
</pre><br />
<br />
[[Category:Sysadmin]]</div>

DB2 File Format - 2016-04-19T21:38:33Z<p>Teague Sterling: Created page with "This page explains the DB2 file format used in DOCK37. = Nomenclature Definitions = * Conf - one set of atoms that moves together with a single position per atom. * Set - a ..."</p>
<hr />
<div>This page explains the DB2 file format used in DOCK37.<br />
<br />
= Nomenclature Definitions =<br />
<br />
* Conf - one set of atoms that moves together with a single position per atom.<br />
* Set - a group of conformations that completely defines one position for each atom in a ligand.<br />
* Cluster - Not yet implemented in DOCK3.7<br />
* Cloud - Not yet implemented in DOCK3.7<br />
<br />
= Record Types =<br />
*T type information (implicitly assumed)<br />
*M molecule (4 lines req'd, after that they are optional, 24 lines max)<br />
*A atoms<br />
*B bond<br />
*X xyz <br />
*R rigid xyz for matching (can actually be any xyzs) <br />
*C conformation<br />
*S sets<br />
*D clusters<br />
*E end of molecule<br />
<br />
T ## namexxxx (implicitly assumed to be the standard 7)<br />
M zincname protname #atoms #bonds #xyz #confs #sets #rigid #Mlines #clusters<br />
M charge polar_solv apolar_solv total_solv surface_area<br />
M smiles<br />
M longname<br />
[M arbitrary information preserved for writing out]<br />
A stuff about each atom, 1 per line <br />
B stuff about each bond, 1 per line<br />
X coordnum atomnum confnum x y z <br />
R rigidnum color x y z<br />
C confnum coordstart coordend<br />
S setnum #lines #confs_total broken hydrogens omega_energy<br />
S setnum linenum #confs confs [until full column]<br />
D clusternum setstart setend matchstart matchend #additionalmatching<br />
D matchnum color x y z<br />
E <br />
<br />
With the above descriptions, here is a description of the columns that are used. Format statements for Python and Fortran appear below. If speed/size becomes an issue this might get replaced with a binary file format.<br />
<br />
notes: 17 children groups/group per line in current scheme.<br />
9 children confs/group per line.<br />
9 children confs/conf per line.<br />
8 confs/set per line.<br />
groups/confs with no children are written out.<br />
<br />
on the atom line, dt is dock type and co is color.<br />
<br />
1 2 3 4 5 6 7<br />
01234567890123456789012345678901234567890123456789012345678901234567890123456789<br />
T ## typename<br />
M ZINCCODEXXXXXXXX PROTCODEX ATO BON XYZXXX CONFSX SETSXX RIGIDX MLINES NUMCLU<br />
M +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
M SMILESXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
M LONGNAMEXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX<br />
[M ARBITRARY_INFORMATION_PRESERVEDXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]<br />
A NUM NAME TYPEX DT CO +CHA.RGEX +POLAR.SOL +APOLA.SOL +TOTAL.SOL SURFA.REA<br />
B NUM ATO ATO TY<br />
X COORDNUMX ATO CONFNU +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
R NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
C CONFNO COORDSTAR COORDENDX<br />
S SETIDX #LINES #CO C H +ENERGY.XXX<br />
S SETIDX LINENO # CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS CCONFS<br />
D CLUSID STASET ENDSET MST MEN ADD<br />
D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
E<br />
<br />
the type lines following are assumed by dock unless overriden:<br />
T 1 positive<br />
T 2 negative<br />
T 3 acceptor<br />
T 4 donor<br />
T 5 ester_o<br />
T 6 amide_o<br />
T 7 neutral<br />
<br />
the following are the format statements for python for each line<br />
T %2d %8s\n<br />
M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d\n<br />
M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
M %77s\n<br />
M %77s\n<br />
M %77s\n<br />
A %3d %-4s %-5s %2d %2d %+9.4f %+10.3f %+10.3f %+10.3f %9.3f\n<br />
B %3d %3d %3d %-2s\n<br />
X %9d %3d %6d %+9.4f %+9.4f %+9.4f\n<br />
R %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
C %6d %9d %9d\n<br />
S %6d %6d %3d %1d %1d %+11.3f\n<br />
S %6d %6d %1d %6d %6d %6d %6d %6d %6d %6d %6d\n <br />
D %6d %6d %6d %3d %3d %3d\n<br />
D %3d %2d %+9.4f %+9.4f %+9.4f\n<br />
E\n<br />
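As a quick sanity check of the Python formats above, the two fixed M lines can be rendered like this (all field values are invented for illustration, not taken from a real DB2 file):

```python
# Render the first two M lines using the Python format strings listed above.
# The ZINC code, protomer name, counts, and energies are made-up sample values.
m1_fmt = "M %16s %9s %3d %3d %6d %6d %6d %6d %6d %6d"
m2_fmt = "M %+9.4f %+10.3f %+10.3f %+10.3f %9.3f"

m1 = m1_fmt % ("ZINC000238540793", "PROT00001", 24, 25, 1200, 50, 40, 5, 4, 0)
m2 = m2_fmt % (0.0, -12.345, -1.234, -13.579, 345.678)
print(m1)
print(m2)
```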
<br />
The following are the fortran77 format statements<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
1000 format(2x,i2,1x,a8)<br />
!M zincname protname #atoms #bonds #xyz #groups #confs #sets #rigid #mlines #clusters<br />
2000 format(2x,a16,1x,a9,1x,i3,1x,i3,1x,i6,1x,i6,1x,i6,x,i6,x,i6,x,i6,x,i6)<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
2100 format(2x,f9.4,1x,f10.3,1x,f10.3,1x,f10.3,1x,f9.3)<br />
!M smiles or longname<br />
2200 format(2x,a77)<br />
!A stuff about each atom, 1 per line<br />
3000 format(2x,i3,1x,a4,1x,a5,1x,i2,1x,i2,1x,f9.4,1x,f10.3,1x,<br />
& f10.3,1x,f10.3,1x,f9.3)<br />
!B stuff about each bond, 1 per line<br />
4000 format(2x,i3,1x,i3,1x,i3,1x,a2)<br />
!X atomnum confnum x y z<br />
5000 format(2x,i9,1x,i3,1x,i6,x,f9.4,1x,f9.4,1x,f9.4)<br />
!R rigidnum color x y z<br />
6000 format(2x,i3,x,i2,x,f9.4,1x,f9.4,1x,f9.4)<br />
!C confnum #startcoord #endcoord<br />
7000 format(2x,i6,1x,i9,1x,i9)<br />
!S setnum #lines #confs_total broken hydrogens omega_energy<br />
8000 format(2x,i6,1x,i6,1x,i3,1x,i1,1x,i1,1x,f11.3)<br />
!S setnum linenum #confs confs [until full column]<br />
8100 format(2x,i6,1x,i6,1x,i1,1x,i6,1x,i6,1x,i6,1x,i6,<br />
& 1x,i6,1x,i6,1x,i6,1x,i6)<br />
!D CLUSID STARTSETX ENDSETXXX ADD MST MEN<br />
9000 format(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)<br />
!D NUM CO +XCO.ORDX +YCO.ORDX +ZCO.ORDX<br />
!re-use 6000<br />
!E<br />
!E does not get a format line<br />
<br />
The following are Fortran95 format statements:<br />
<br />
!T ## namexxxx (implicitly assumed to be the standard 7)<br />
character (len=*), parameter :: DB2NAME = '(2x,i2,x,a8)' !1000<br />
!M zincname protname #atoms #bonds #xyz #confs #sets #rigid #maxmlines #clusters<br />
character (len=*), parameter :: DB2M1 =<br />
& '(2x,a16,x,a9,x,i3,x,i3,x,i6,x,i6,x,i6,x,i6,x,i6,x,i6)' !2000<br />
!M charge polar_solv apolar_solv total_solv surface_area<br />
character (len=*), parameter :: DB2M2 =<br />
& '(2x,f9.4,x,f10.3,x,f10.3,x,f10.3,x,f9.3)' !2100<br />
!M smiles/longname/arbitrary<br />
character (len=*), parameter :: DB2M3 = '(2x,a78)' !2200<br />
!A stuff about each atom, 1 per line<br />
character (len=*), parameter :: DB2ATOM =<br />
& '(2x,i3,x,a4,x,a5,x,i2,x,i2,x,f9.4,x,f10.3,x,<br />
& f10.3,x,f10.3,x,f9.3)' !3000<br />
!B stuff about each bond, 1 per line<br />
character (len=*), parameter :: DB2BOND =<br />
& '(2x,i3,x,i3,x,i3,x,a2)' !4000<br />
!X coordnumx atomnum confnum x y z<br />
character (len=*), parameter :: DB2COORD =<br />
& '(2x,i9,x,i3,x,i6,x,f9.4,x,f9.4,x,f9.4)' !5000<br />
!R rigidnum color x y z<br />
character (len=*), parameter :: DB2RIGID =<br />
& '(2x,i6,x,i2,x,f9.4,x,f9.4,x,f9.4)' !6000<br />
!C confnum coordstart coordend<br />
character (len=*), parameter :: DB2CONF = '(2x,i6,x,i9,x,i9)' !7000<br />
!S setnum #lines #confs_total broken hydrogens omega_energy <br />
character (len=*), parameter :: DB2SET1 =<br />
& '(2x,i6,x,i6,x,i3,x,i1,x,i1,x,f11.3)' !8000<br />
!S setnum linenum #confs confs [until full column]<br />
character (len=*), parameter :: DB2SET2 =<br />
& '(2x,i6,x,i6,x,i1,x,i6,x,i6,x,i6,x,i6,<br />
& 1x,i6,x,i6,x,i6,x,i6)' !8100<br />
!D CLUSID STASET ENDSET ADD(ittional matching spheres count) MST(art) MEN(d)<br />
character (len=*), parameter :: DB2CLUSTER =<br />
& '(2x,i6,x,i6,x,i6,x,i3,x,i3,x,i3)' !9000<br />
!D NUM CO x y z<br />
!reuse DB2RIGID<br />
!E<br />
!E does not get a format line <br />
<br />
[[Category:Formats]]</div>

Cluster 2 - 2016-03-30T18:24:55Z<p>Teague Sterling: /* Hardware and physical location */</p>
<hr />
<div>This is the default lab cluster.<br />
<br />
{{TOCright}}<br />
<br />
= Priorities and Policies = <br />
* [[Lab Security Policy]]<br />
* [[Disk space policy]]<br />
* [[Backups]] policy.<br />
* [[Portal system]] for off-site ssh cluster access.<br />
* Get a [[Cluster 2 account]] and get started<br />
<br />
= Special machines = <br />
Normally, you will just ssh to sgehead aka gimel from portal.ucsf.bkslab.org where you can do almost anything, including job management. A few things require licensing and must be done on special machines. <br />
<br />
* psi for using the PG fortran compiler<br />
* ppilot is at http://zeta:9944/ - you must be on the Cluster 2 private network to use it<br />
* no other special machines<br />
<br />
= Notes = <br />
* to check out from SVN, use the svn+ssh protocol<br />
<br />
= Hardware and physical location =<br />
* 1232 cpu-cores for queued jobs<br />
* 128 cpu-cores for infrastructure, databases, management and ad hoc jobs.<br />
* 128 TB of high quality NFS-available disk<br />
* 32 TB of other disk<br />
* We expect this to grow to over 1500 cpu-cores and 200 TB in late 2016 once Cluster 0 is merged with Cluster 2 <br />
* Our policy is to have 4 GB RAM per cpu-core unless otherwise specified.<br />
* Machines older than 3 years may have 2GB/core and 6 years old have 1GB/core.<br />
* Cluster 2 is currently stored entirely in Rack 0 which is in Row 0, Position 4 of BH101 at 1700 4th St (Byers Hall).<br />
* '''More racks will be added (from cluster 0) in summer 2016.'''<br />
* Central services are on aleph, an HP DL160G5 and bet, an HP xxxx. <br />
* CPU<br />
** 3 Silicon Mechanics Rackform nServ A4412.v4 units, each comprising 4 computers of 32 cpu-cores, for a total of 384 cpu-cores.<br />
** 1 Dell C6145 with 128 cores.<br />
** An HP DL165G7 (24-way) is sgehead<br />
** more computers to come from Cluster 0, when Cluster 2 is fully ready.<br />
* DISK<br />
** HP disks - 40 TB RAID6 SAS (new in 2014)<br />
** Silicon Mechanics NAS - new in 2014 - 77 TB RAID6 SAS (new in 2014)<br />
** A HP DL160G5 and an MSA60 with 12 TB SAS (disks new in 2014)<br />
<br />
= Naming convention =<br />
* The Hebrew alphabet is used for physical machines<br />
* Greek letters for VMs.<br />
* Functions (e.g. sgehead) are aliases (CNAMEs).<br />
* compbio.ucsf.edu and ucsf.bkslab.org domains both supported.<br />
<br />
= Disk organization = <br />
* shin aka nas1 mounted as /nfs/db/ = 72 TB SAS RAID6<br />
* bet aka happy, internal: /nfs/store and psql (temp) as 10 TB SATA RAID10<br />
* elated on happy: /nfs/work only as 36 TB SAS RAID6<br />
* het (43) aka former vmware2 MSA 60 exports /nfs/home and /nfs/soft<br />
<br />
= Special purpose machines - all .ucsf.bkslab.org = <br />
<br />
* sgehead aka gimel.cluster - nearly the only machine you'll need. <br />
* psi.cluster - PG fortran compiler (if it only has a .cluster address means it has no public address)<br />
* portal aka epsilon - secure access<br />
* zeta.cluster - Pipeline Pilot<br />
* shin, bet, and dalet are the three NFS servers. You should not need to log in to them.<br />
* mysql1.cluster - general purpose mysql server (like former scratch)<br />
* pg1.cluster - general purpose postgres server <br />
* fprint.cluster - fingerprinting server<br />
<br />
[[About our cluster]]<br />
<br />
[[Category:Cluster]]<br />
[[Category:Internal]]<br />
[[Category:UCSF]]</div>

Set up a database server - 2016-03-09T22:03:41Z<p>Teague Sterling: Adding replication instructions</p>
<hr />
<div>This page describes how to set up a database server. <br />
<br />
We will set up a psql server for ZINC and Drupal and a MySQL server for DOCK Blaster / SEA.<br />
<br />
<br />
= Replication =<br />
Note: trigger files and restore logs should ideally be on a shared file system<br />
<br />
VERSION=9.2<br />
MASTER=10.1.2.3<br />
SLAVE=10.1.2.4<br />
USER=replication<br />
PASS=<br />
BASE=/var/lib/pgsql/$VERSION<br />
ARCHIVE=/srv/archive/pgsql/$VERSION<br />
REPLOG=$ARCHIVE/wal/<br />
<br />
== Command Order ==<br />
<br />
=== 1) On Master ===<br />
<br />
psql -c "SELECT pg_start_backup('copy-to-$SLAVE', true);"<br />
rsync -av --exclude postmaster.pid --exclude postgresql.conf $BASE root@$SLAVE:$BASE<br />
rsync -av $REPLOG $SLAVE:$REPLOG<br />
<br />
=== 2) On Slave ===<br />
<br />
Make sure $BASE/recovery.conf exists (see below)<br />
<br />
service postgresql-$VERSION start<br />
<br />
=== 3) On Master ===<br />
<br />
psql -c "SELECT pg_stop_backup();"<br />
psql -c "SELECT pg_current_xlog_location();"<br />
<br />
=== 4) On Slave ===<br />
<br />
psql -c "SELECT pg_last_xlog_receive_location();"<br />
psql -c "SELECT pg_last_xlog_replay_location();"<br />
<br />
== Configuration File Requirements ==<br />
<br />
=== $MASTER:$BASE/postgresql.conf ===<br />
<br />
max_wal_senders = 5<br />
wal_keep_segments = 32<br />
archive_mode = on<br />
archive_command = 'cp %p $REPLOG/%f'<br />
wal_level = hot_standby<br />
<br />
=== $SLAVE:$BASE/postgresql.conf ===<br />
<br />
wal_level = hot_standby<br />
max_wal_senders = 5<br />
wal_keep_segments = 32<br />
archive_mode = off<br />
hot_standby = on<br />
<br />
=== $SLAVE:$BASE/recovery.conf ===<br />
<br />
standby_mode = 'on'<br />
primary_conninfo = 'host=$MASTER port=5432 user=$USER password=$PASS'<br />
trigger_file = '$BASE.trigger'<br />
restore_command = 'cp $REPLOG/%f %p'<br />
<br />
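With the example values defined at the top of this page substituted in, the slave's recovery.conf would look roughly like this (a sketch; the password is elided, and the odd-looking $BASE.trigger path comes straight from the variable definitions above):

```
standby_mode = 'on'
primary_conninfo = 'host=10.1.2.3 port=5432 user=replication password=...'
trigger_file = '/var/lib/pgsql/9.2.trigger'
restore_command = 'cp /srv/archive/pgsql/9.2/wal/%f %p'
```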
<br />
[[Category:Sysadmin]]<br />
[[Category:Databases]]</div>

Retrosynthetic analysis - 2016-01-07T18:22:54Z<p>Teague Sterling: /* run program */</p>
<hr />
<div>== set up environment == <br />
<br />
source /nfs/soft/dock/DOCK37/env.csh<br />
source /nfs/soft/www/apps/zinc15/envs/internal/env.csh<br />
<br />
== run program == <br />
zinc-manage -e shell retrosynth --smiles="Cc3ccn2nc(CCCC(=O)Nc1ccc(O)cc1C)cc2c3" --recursions=0 --show=successful<br />
<br />
=== sample output and explanation ===<br />
<pre><br />
Line-1 (1 options)<br />
Successful Synthesis via negishi coupling in 1 steps<br />
Cc1ccn2nc(CCl)cc2c1 . Cc1cc(O)ccc1NC(=O)CCCl >> Cc1ccn2nc(CCCC(=O)Nc3ccc(O)cc3C)cc2c1<br />
Reactant 1: Cc1ccn2nc(CCl)cc2c1 - Purchasable as ZINC000238540793<br />
Reactant 2: Cc1cc(O)ccc1NC(=O)CCCl - Purchasable as ZINC000041725839<br />
</pre><br />
<br />
If you want a bit more diversity you could change --show to partial to get reactions where some (but not all) of the reactants were available. This also may be useful if you want to use a different reaction. You can then either run similarity searches around the unresolved reactants or try increasing --recursions to try a few more steps... That can get out of hand above 2 or 3 though (not that you'd want to do that).<br />
<br />
== step 2 - Gets ZINC IDS == <br />
* 1) ZINC000238540793<br />
* 2) ZINC000041725839<br />
<br />
I then ran a similarity search for building block compounds containing the reactive group for the left and right side of a Negishi coupling (no easy interface available for this yet)<br />
<br />
These are left.smi and right.smi in /nfs/work/teague/MK/<br />
<br />
These current results are at 70% Tanimoto because I misplaced the 50% Tanimoto downloads (doh!)<br />
<br />
== step 3 ==<br />
I then ran the reactor in DOCK37 ($DOCKBASE/ligand/reactor/react.py) with the RXN SMARTS for the Negishi coupling ( "[#6:1]-,:[#17,#35,#53].[#6:2]-[#17,#35,#53]>>[#6:1]-[#6:2]" )<br />
and the left.smi and right.smi and filtered for duplicates. This creates synthetic.smi. Modify that to be better suited for db2 generation.<br />
<br />
(all in the run.sh script in that directory).<br />
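The "filtered for duplicates" step above can be a few lines of Python (a sketch; the SMILES and names here are invented sample lines, not actual reactor output):

```python
# Drop duplicate product SMILES, keeping the first occurrence of each.
# Each line is "SMILES name", as in a typical .smi file.
products = [
    "Cc1ccn2nc(CCCC(=O)Nc3ccc(O)cc3C)cc2c1 prod-1",
    "Cc1ccn2nc(CCCC(=O)Nc3ccc(O)cc3C)cc2c1 prod-2",  # duplicate structure
    "Cc1cc(O)ccc1NC(=O)CCCl prod-3",
]
seen = set()
unique = []
for line in products:
    smiles = line.split()[0]
    if smiles not in seen:
        seen.add(smiles)
        unique.append(line)
print(len(unique))
```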
<br />
== step 4 ==<br />
<br />
Run database building using the DOCK37 pipeline.<br />
<br />
$DOCKBASE/ligand/generate/build_smiles_ligand.sh database.smi<br />
<br />
<br />
For the new compound you showed me, I think we need to investigate the reactions and see which one is failing to validate.<br />
<br />
I am building the db2 files now in that directory.<br />
<br />
<br />
[[Category:Tutorials]]<br />
[[Category:Synthesis]]<br />
[[Category:Organic chemistry]]<br />
[[Category:Reactions]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Retrosynthetic_analysis&diff=9267Retrosynthetic analysis2016-01-07T18:22:16Z<p>Teague Sterling: /* run program */</p>
<hr />
<div>== set up environment == <br />
<br />
source /nfs/soft/dock/DOCK37/env.csh<br />
source /nfs/soft/www/apps/zinc15/envs/internal/env.csh<br />
<br />
== run program == <br />
zinc-manage -e shell retrosynth --smiles="Cc3ccn2nc(CCCC(=O)Nc1ccc(O)cc1C)cc2c3" --recursions=0 --show=successful<br />
<br />
== sample output and explanation<br />
<pre><br />
Line-1 (1 options)<br />
Successful Synthesis via negishi coupling in 1 steps<br />
Cc1ccn2nc(CCl)cc2c1 . Cc1cc(O)ccc1NC(=O)CCCl >> Cc1ccn2nc(CCCC(=O)Nc3ccc(O)cc3C)cc2c1<br />
Reactant 1: Cc1ccn2nc(CCl)cc2c1 - Purchasable as ZINC000238540793<br />
Reactant 2: Cc1cc(O)ccc1NC(=O)CCCl - Purchasable as ZINC000041725839<br />
</pre><br />
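The text output above has a regular line layout, so it can be scraped if needed. A minimal sketch (my own helper, assuming the exact "Reactant N: ... - Purchasable as ZINC..." layout shown in the sample; real output may vary):<br />

```python
import re

# Sketch: pull (index, SMILES, ZINC ID) triples out of retrosynth text output.
# Assumes the exact "Reactant N: <smiles> - Purchasable as ZINCxxxx" layout
# shown in the sample above; a real run may differ.
REACTANT_RE = re.compile(r"Reactant (\d+): (\S+) - Purchasable as (ZINC\d+)")

def parse_reactants(text):
    return [(int(n), smi, zid) for n, smi, zid in REACTANT_RE.findall(text)]

sample = """Successful Synthesis via negishi coupling in 1 steps
Reactant 1: Cc1ccn2nc(CCl)cc2c1 - Purchasable as ZINC000238540793
Reactant 2: Cc1cc(O)ccc1NC(=O)CCCl - Purchasable as ZINC000041725839"""

# parse_reactants(sample) -> [(1, 'Cc1ccn2nc(CCl)cc2c1', 'ZINC000238540793'),
#                             (2, 'Cc1cc(O)ccc1NC(=O)CCCl', 'ZINC000041725839')]
```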
<br />
If you want a bit more diversity, you can change --show to partial to get reactions where some (but not all) of the reactants were available. This may also be useful if you want to use a different reaction. You can then either run similarity searches around the unresolved reactants or increase --recursions to try a few more steps, though that can get out of hand above 2 or 3.<br />
<br />
== step 2 - Get ZINC IDs == <br />
* 1) ZINC000238540793<br />
* 2) ZINC000041725839<br />
<br />
I then ran a similarity search for building block compounds containing the reactive group for the left and right side of a Negishi coupling (no easy interface available for this yet)<br />
<br />
These are left.smi and right.smi in /nfs/work/teague/MK/<br />
<br />
The current results are at 70% Tanimoto because I misplaced the 50% Tanimoto downloads.<br />
<br />
== step 3 ==<br />
I then ran the reactor in DOCK37 ($DOCKBASE/ligand/reactor/react.py) with the RXN SMARTS for the Negishi coupling ( "[#6:1]-,:[#17,#35,#53].[#6:2]-[#17,#35,#53]>>[#6:1]-[#6:2]" )<br />
on left.smi and right.smi, then filtered for duplicates. This creates synthetic.smi, which is then modified to be better suited for db2 generation.<br />
<br />
(all in the run.sh script in that directory).<br />
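The duplicate-filtering step can be sketched as a simple first-seen pass over the product SMILES. This is an illustration, not the actual run.sh contents; it keys on the exact SMILES string, so it will not catch duplicates written as different-but-equivalent SMILES:<br />

```python
# Sketch of the duplicate-filtering step: keep the first occurrence of each
# product SMILES (the first whitespace-separated token per line).
# Illustrative only -- not the actual run.sh logic, and exact-string
# matching misses equivalent SMILES written differently.
def dedupe_smiles(lines):
    seen, kept = set(), []
    for line in lines:
        fields = line.split()
        if not fields:          # skip blank lines
            continue
        smi = fields[0]
        if smi not in seen:
            seen.add(smi)
            kept.append(line)
    return kept
```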
<br />
== step 4 ==<br />
<br />
Run database building using the DOCK37 pipeline.<br />
<br />
$DOCKBASE/ligand/generate/build_smiles_ligand.sh database.smi<br />
<br />
<br />
For the new compound you showed me, I think we need to investigate the reactions and see which one is failing to validate.<br />
<br />
I am building the db2 files now in that directory.<br />
<br />
<br />
[[Category:Tutorials]]<br />
[[Category:Synthesis]]<br />
[[Category:Organic chemistry]]<br />
[[Category:Reactions]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=ZINC15:Syntax&diff=9208ZINC15:Syntax2015-10-14T18:11:54Z<p>Teague Sterling: </p>
<hr />
<div>ZINC 15 uses a uniform set of rules to interpret URLs, allowing both web pages and a machine-readable application programming interface (API). This page describes both.<br />
<br />
{{TOCright}}<br />
<br />
= Overview =<br />
ZINC15 queries for all resource types can be formulated as HTTP requests using a consistent URL syntax. The syntax is consistent for both interactive (browser-based) and API (script-based) queries. The API downloads are triggered by specifying a download format and, optionally, a list of output fields to include.<br />
<br />
All URLs will be provided relative to [//zinc15.docking.org zinc15.docking.org], thus /substances/ implies [http://zinc15.docking.org/substances/ zinc15.docking.org/substances/].<br />
<br />
= Syntax =<br />
<br />
The full syntax of a query is defined as:<br />
<pre style="overflow-x: scroll"><br />
/<RESOURCE>[/<IDENTIFIER>[/<RELATION>]][/subsets/<SUBSET>[+<SUBSET>...]][/having/[no-]<EXT_RELATION>[+[no-]<EXT_RELATION>...]][/subsets/[<EXT_RELATION>.]<RELATION_SUBSET>+[[<EXT_RELATION>.]<RELATION_SUBSET>...]](/[<VIEW>.html]|.<FORMAT>[:<OUTPUT_FIELD>[+<OUTPUT_FIELD>...]])[?<OPTION_NAME>=<OPTION_VALUE>|<QUERY_FIELD>[-<OPERATOR>[-<OPERATOR_ARG>...]]=<FIELD_CONSTRAINT>[&<OPTION_NAME>=<OPTION_VALUE>|<QUERY_FIELD>[-<OPERATOR>[-<OPERATOR_ARG>...]]=<FIELD_CONSTRAINT>...]]<br />
</pre><br />
<br />
More concisely, this can be written as:<br />
<pre style="overflow-x: scroll"><br />
/<Root>[/<Subsets>][/<Existential>[/<Ext. Subsets>]]<Formatting>[?[<Constraints>][<Options>]]<br />
</pre><br />
<br />
Broken down into individual components:<br />
<br />
{| class="wikitable"<br />
|- <br />
! Syntax<br />
! Name<br />
! Explanation<br />
! Example<br />
! Notes<br />
|-<br />
| <RESOURCE>[/<IDENTIFIER>[/<RELATION>]]<br />
| ('''Root''') <br>Query Root<br />
| Defines where the query should be performed: either on the whole resource or only those related to another, single resource.<br />
| [//zinc15.docking.org/trials/NCT00001251/substances/ /trials/NCT00001251/substances]<br />
| Valid values for the <IDENTIFIER> and <RELATION> parts of the query root are defined separately for each resource.<br />
|-<br />
| subsets/<SUBSET>[+<SUBSET>...]<br />
| ('''Subsets''') <br>Query Subsets<br />
| Applies one or more named, predefined constraints to the query.<br />
| [//zinc15.docking.org/substances/subsets/fda+for-sale/ .../subsets/fda+for-sale/]<br />
| Subsets are defined separately for each resource. Some subsets are disjoint, so it is possible to accidentally construct null queries.<br />
|-<br />
| having/[no-]<EXT_RELATION>[+[no-]<EXT_RELATION>...]<br />
| ('''Existential''') <br>Existential Requirements <br />
| Applies existential (or, with the "no-" prefix, non-existential) constraints based on the results of the query.<br />
| [//zinc15.docking.org/genes/having/no-protomers/ .../having/no-protomers/]<br />
| The available relations depend on the resultant resource, which is not necessarily the same as the root resource<br />
|-<br />
| subsets/[<EXT_RELATION>.]<RELATION_SUBSET>+[[<EXT_RELATION>.]<RELATION_SUBSET>...]<br />
| ('''Ext. Subsets''') <br>Existential Relation Subsets <br />
| Applies subset requirements to the '''existential relation requirement''' (see above) of the query.<br />
| [//zinc15.docking.org/substances/having/trials/subsets/cancer/ .../subsets/cancer/]<br />
| The relation subsets are completely independent of the query subsets, and valid values depend on the type of the existential resource. When multiple existential relations are specified, the subsets must be prefixed with "<EXT_RELATION>." to identify which they apply to.<br />
|- <br />
| /[<VIEW>.html] OR<br> .<FORMAT>[:<OUTPUT_FIELD>[+<OUTPUT_FIELD>...]]<br />
| ('''Formatting''') <br>Formatting Options <br />
| Dictates how the result of a query will be returned. If <FORMAT> is specified, the result will be sent as a download. If a <VIEW> is specified, the result will be rendered as HTML. If neither is provided (just "/"), the '''Accept''' header will be interrogated to determine the requested format, which defaults to HTML.<br />
| [//zinc15.docking.org/majorclasses.json%3Apublic_identifier+name+num_genes+num_substances .json:public_identifier+name +num_genes+num_substances]<br />
| Valid views (currently) include table or tile. The default is always tile. Valid [[ZINC15:Formats]] are explained in the wiki. Valid <OUTPUT_FIELD>s depend on the specific resource being requested, and can be found on the resource help pages.<br />
|}<br />
<br />
<br />
== Query Root ==<br />
<br />
There are three basic types of queries supported by ZINC15.<br />
<br />
{| class="wikitable"<br />
|- <br />
! Kind <br />
! Syntax <br />
! Description <br />
! Example <br />
|- <br />
| Full Resource Listing <br />
| /<RESOURCE> <br />
| Performs the query on all <RESOURCE>s<br />
| [//zinc15.docking.org/genes/ /genes]<br />
|- <br />
| Single Resource Lookup <br />
| /<RESOURCE>/<IDENTIFIER><br />
| Retrieves information about a specific <RESOURCE> instance identified by <IDENTIFIER><br />
| [//zinc15.docking.org/genes/ADRB1 /genes/ADRB1]<br />
|- <br />
| Single Resource Relation <br />
| /<RESOURCE>/<IDENTIFIER>/<RELATION><br />
| Performs the query on only those resources that are linked to a specific instance of a <RESOURCE> identified by <IDENTIFIER> via the <RELATION> relationship. The type of the result will depend on the target of <RELATION> (see below).<br />
| [//zinc15.docking.org/genes/ADRB1/substances/ /genes/ADRB1/substances]<br />
|}<br />
<br />
=== Parameters ===<br />
<br />
{| class="wikitable"<br />
|- <br />
! Name <br />
! Meaning<br />
! Example <br />
! Example URL<br />
! Options<br />
|- <br />
| <RESOURCE><br />
| The name of a ZINC15 resource.<br />
| catalogs<br />
| [//zinc15.docking.org/catalogs/ /<b>catalogs</b>/]<br />
| [[ZINC15:Resources|Wiki Explanation of Resources]] [//zinc15.docking.org/help/resources/ ZINC15 Meta-resource]<br />
|- <br />
| <IDENTIFIER><br />
| The unique identifier (key) for a specific resource instance.<br />
| ZINC000000000053<br />
| [//zinc15.docking.org/substances/ZINC000000000053 /substances/<b>ZINC000000000053</b>/]<br />
| The identifier column is listed on each resource's "help" page (e.g. [//zinc15.docking.org/substances/help/]) and can also always be accessed via the <code>public_identifier</code> attribute of a resource instance.<br />
|-<br />
| <RELATION><br />
| The name of a relation defined on <RESOURCE> objects.<br />
| activities<br />
| [//zinc15.docking.org/substances/ZINC000000000053/activities /substances/ZINC000000000053/<b>activities</b>]<br />
| The name and type of each resource's relations are defined on the "help" page (e.g. [//zinc15.docking.org/substances/help/])<br />
|}<br />
<br />
Note that a query rooted on /catalogs/sial/items will not return '''catalogs''', but instead '''catitems''', as the catalog relation named '''items''' returns '''catitem''' resources. This can be seen on the [//zinc15.docking.org/catalogs/help Catalog Help Page].<br />
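To illustrate how these components compose, here is a small sketch that builds query URLs from the parts described above. This is my own illustration, not an official ZINC client, and it models only the path components covered so far (root, subsets, format, fields, constraints):<br />

```python
# Sketch: compose a ZINC15 query URL from the components described above.
# Illustrative only -- not an official client. Endpoint names and fields
# are taken from the examples on this page.
def zinc_url(resource, identifier=None, relation=None, subsets=(),
             fmt=None, fields=(), **constraints):
    parts = ["", resource]
    if identifier:
        parts.append(identifier)
        if relation:
            parts.append(relation)
    if subsets:
        parts += ["subsets", "+".join(subsets)]
    path = "/".join(parts)
    if fmt:
        # A format suffix triggers a download; fields follow after ":".
        path += "." + fmt
        if fields:
            path += ":" + "+".join(fields)
    else:
        path += "/"
    if constraints:
        path += "?" + "&".join("%s=%s" % kv for kv in sorted(constraints.items()))
    return "//zinc15.docking.org" + path
```

For example, zinc_url("substances", subsets=("fda", "for-sale")) reproduces the /substances/subsets/fda+for-sale/ example above, and zinc_url("genes", "ADRB1", "substances") reproduces /genes/ADRB1/substances/.<br />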
<br />
<br />
The items in [ square brackets ] are optional. The items in < angle brackets > are each described below. <br />
<br />
* <RESOURCE> is an object type, such as molecules, catalogs, or genes; resources are fully described here: [[ZINC15:Resources]]<br />
<br />
* <FORMAT> is one of the supported formats, such as smi, sdf, csv, fully described here: [[ZINC15:Formats]]. If a format is omitted, a webpage is implicitly requested.<br />
<br />
* <FIELDS> are individual properties, including calculated ones, and are described together with the [[ZINC15:Resources]] to which they belong. See also [[ZINC15:Properties]]. Not all properties make sense in all resource contexts; [[ZINC15:examples]] are provided. Each resource has default fields if none are specified. <br />
<br />
* <ENDPOINT> is one of tile, table, detail, etc. (only available if the underlying template exists). See [[ZINC15:Variants]].<br />
<br />
* <PREDICATE_LIST> is a query string: one or more <PREDICATE>s delimited by &. Each <PREDICATE> is <ATTRIBUTE>[:<OPERATOR>[;<OPERATOR_ARGS>]]=<THRESHOLD><br />
<br />
* [[ZINC15:Page Controls]] are optional, and are used to qualify how a search is to be performed and formatted for the page.<br />
<br />
* [[ZINC15:Query operators]]<br />
<br />
* https is currently not supported, but it will be.<br />
<br />
= Reserved words = <br />
* list, tile, subsets, help, overview - these have special meaning in the URL, and can never be the names of resources or their columns. <br />
<br />
= Examples =<br />
We illustrate the use of the website and the API with [[ZINC15:examples]]. <br />
<br />
Back to [[ZINC15]]<br />
[[Category:API]]<br />
[[Category:ZINC]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Decoys&diff=8955Decoys2015-09-15T18:38:53Z<p>Teague Sterling: </p>
<hr />
<div>[[Decoy Theory | Decoys are...]]<br />
<br />
Decoys are important for judging the performance of [[molecular docking]] algorithms. <br />
<br />
If you want decoys for a molecule in ZINC, say 556, use <br />
http://zinc15.docking.org/substances/ZINC000000000556/decoys/3D.sdf<br />
<br />
If you want decoys in 2D:<br />
http://zinc15.docking.org/substances/ZINC000019632927/decoys/2D/<br />
<br />
http://zinc15.docking.org/apps/mol/decoys?for=CN1CCN(CC(=O)N2c3ccccc3C(=O)Nc3cccnc32)CC1<br />
<br />
== Examples ==<br />
<br />
=== Getting decoys for Aspirin (ZINC000000000053) ===<br />
2D Decoys (you need to generate DB2 files and compute charges yourself)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D?><br />
Download: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D.smi?count=100><br />
<br />
3D Decoys (Can DOCK directly)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D><br />
Download (SMILES & Explicit Charge): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.smi?net_charge=-1&count=100><br />
Download (DB2): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.db2.gz?count=100><br />
<br />
Allowed args: <br />
* count: How many<br />
* resolve: Look up zinc_ids if possible<br />
* unique: Only return unique decoys (if you POST with "for" as a file instead)<br />
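The download URLs above all follow one pattern, so they can be built mechanically. A sketch (my own helper, not an official client; the endpoint layout and argument names count, resolve, and unique are taken from the examples on this page):<br />

```python
from urllib.parse import urlencode

# Sketch: build decoy-download URLs matching the examples above.
# Endpoint layout and argument names (count, resolve, unique) are taken
# from this page; this is an illustration, not an official client.
def decoy_url(zinc_id, dim="3D", fmt=None, **args):
    base = "http://zinc15.docking.org/substances/%s/decoys/%s" % (zinc_id, dim)
    if fmt:
        base += "." + fmt          # e.g. smi, db2.gz
    query = urlencode(sorted(args.items()))
    return base + ("?" + query if query else "")
```

For example, decoy_url("ZINC000000000053", dim="2D", fmt="smi", count=100) reproduces the 2D download URL shown above.<br />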
<br />
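These endpoints and arguments can be combined programmatically. As a sketch (the helper name is illustrative, not part of the API; the URL layout and parameter names are taken from the examples above), a small Python function that assembles the download URLs before fetching them with any HTTP client:

```python
from urllib.parse import urlencode

BASE = "http://zinc15.docking.org/substances"

def decoy_url(zinc_id, dim="3D", fmt="smi", **args):
    """Build a decoy-download URL for a ZINC substance.

    `args` are the query parameters described above
    (count, resolve, unique, net_charge, ...).
    """
    url = f"{BASE}/{zinc_id}/decoys/{dim}.{fmt}"
    if args:
        url += "?" + urlencode(args)
    return url

# 100 3D decoys for aspirin as SMILES with explicit charge
print(decoy_url("ZINC000000000053", "3D", "smi", net_charge=-1, count=100))
```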
You could also do decoys "by hand" to have more control:<br />
http://zinc15.docking.org/substances/?~ecfp4_fp-unsorted_tanimoto,.2=zinc55&mwt-between=270,330&logp-between=1.8,2.4&purchasability=for-sale<br />
<br />
=== Getting Decoys for a SMILES (e.g. not in ZINC) ===<br />
Getting Decoys for a SMILES string can be written as a general ZINC query on either the '''substances''' (2D) resource or the '''protomers''' (3D) resource. Constraints can be added for further refinement.<br />
Currently you need a SMILES string (e.g. `CN(C)C(=O)c1ccc(O)cc1`) as well as the physical properties of that compound (e.g. molecular weight, logP). This requirement will soon be removed and replaced with a simple form.<br />
<br />
Molecular Properties (Criteria) → URL Parameters:<br />
<br />
SMILES: CN(C)C(=O)c1ccc(O)cc1 (ECFP4-tanimoto < 0.25) → ~substance.ecfp4_fp-unsorted_tanimoto-25=CN(C)C(=O)c1ccc(O)cc1<br />
MWT: 165.2 (+/- 10%) → mwt-between=148.67,181.72<br />
LogP: 1.1 (+/- 0.30) → logp-between=0.8,1.4<br />
Donors: 1 (==) → hbd=1<br />
Acceptors: 2 (==) → hba=2<br />
Charge: 0 (==) → net_charge=0<br />
Rotatable Bonds: 1 (==) → rb=1<br />
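The ±10% and ±0.30 windows above are simple arithmetic. A sketch of how the `-between` parameter values might be derived (the helper name and rounding convention are assumptions, not part of ZINC):

```python
def between(value, frac=None, delta=None, ndigits=2):
    """Return a 'lo,hi' window string: either fractional
    (+/- frac * value) or absolute (+/- delta)."""
    d = value * frac if frac is not None else delta
    return f"{round(value - d, ndigits)},{round(value + d, ndigits)}"

params = {
    "mwt-between": between(165.2, frac=0.10),             # +/- 10% of MWT
    "logp-between": between(1.1, delta=0.30, ndigits=1),  # +/- 0.30 logP
}
print(params)
```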
<br />
As a ZINC query these constraints would be:<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-unsorted_tanimoto-25=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&net_charge=0&rb=1><br />
<br />
This query can be made faster by requesting only 50 and adding a few additional execution rules: count=50&parallelize=no&distinct=no<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-unsorted_tanimoto-25=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&net_charge=0&rb=1&count=50&parallelize=no&distinct=no><br />
<br />
[[DUDE]] is a free directory of useful decoys for [[virtual screening]].<br />
<br />
For more information on preparation see<br />
<br />
[[Automated_Database_Preparation#Automatic_Decoy_Generation]]<br />
<br />
<br />
[[Category:Jargon]]<br />
[[Category:DOCK:Scoring problem]]<br />
[[Category:DOCK:Sampling problem]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Decoys&diff=8954Decoys2015-09-15T18:37:34Z<p>Teague Sterling: /* Getting Decoys for a SMILES (e.g. not in ZINC) */</p>
<hr />
<div>[[Decoy Theory | Decoys are...]]<br />
<br />
Decoys are important for judging the performance of [[molecular docking]] algorithms. <br />
<br />
If you want decoys for a molecule in ZINC, say 556, use <br />
http://zinc15.docking.org/substances/ZINC000000000556/decoys/3D.sdf<br />
<br />
If you want decoys in 2D:<br />
http://zinc15.docking.org/substances/ZINC000019632927/decoys/2D/<br />
<br />
http://zinc15.docking.org/apps/mol/decoys?for=CN1CCN(CC(=O)N2c3ccccc3C(=O)Nc3cccnc32)CC1<br />
<br />
== Examples ==<br />
<br />
=== Getting decoys for Aspirin (ZINC000000000053) ===<br />
2D Decoys (need to generate DB2 files and compute charge yourself)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D?><br />
Download: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D.smi?count=100><br />
<br />
3D Decoys (Can DOCK directly)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D><br />
Download (SMILES & Explicit Charge): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.smi?net_charge=-1&count=100><br />
Download (DB2): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.db2.gz?count=100><br />
<br />
Allowed args: <br />
* count: How many<br />
* resolve: Look up zinc_ids if possible<br />
* unique: Only return unique decoys (if you POST with "for" as a file instead)<br />
<br />
You could also do decoys "by hand" to have more control:<br />
http://zinc15.docking.org/substances/?~ecfp4_fp-unsorted_tanimoto,.2=zinc55&mwt-between=270,330&logp-between=1.8,2.4&purchasability=for-sale<br />
<br />
=== Getting Decoys for a SMILES (e.g. not in ZINC) ===<br />
Getting Decoys for a SMILES string can be written as a general ZINC query on either the '''substances''' (2D) resource or the '''protomers''' (3D) resource. Constraints can be added for further refinement.<br />
Currently you need a SMILES string (e.g. `CN(C)C(=O)c1ccc(O)cc1`) as well as the physical properties of that compound (e.g. molecular weight, logP). This requirement will soon be removed and replaced with a simple form.<br />
<br />
Molecular Properties (Criteria) → URL Parameters:<br />
<br />
SMILES: CN(C)C(=O)c1ccc(O)cc1 (ECFP4-tanimoto < 0.25) → ~substance.ecfp4_fp-unsorted_tanimoto-25=CN(C)C(=O)c1ccc(O)cc1<br />
MWT: 165.2 (+/- 10%) → mwt-between=148.67,181.72<br />
LogP: 1.1 (+/- 0.30) → logp-between=0.8,1.4<br />
Donors: 1 (==) → hbd=1<br />
Acceptors: 2 (==) → hba=2<br />
Charge: 0 (==) → charge=0<br />
Rotatable Bonds: 1 (==) → rb=1<br />
<br />
As a ZINC query these constraints would be:<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-unsorted_tanimoto-25=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&charge=0&rb=1><br />
<br />
This query can be made faster by requesting only 50 and adding a few additional execution rules: count=50&parallelize=no&distinct=no<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-unsorted_tanimoto-25=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&charge=0&rb=1&count=50&parallelize=no&distinct=no><br />
<br />
[[DUDE]] is a free directory of useful decoys for [[virtual screening]].<br />
<br />
For more information on preparation see<br />
<br />
[[Automated_Database_Preparation#Automatic_Decoy_Generation]]<br />
<br />
<br />
[[Category:Jargon]]<br />
[[Category:DOCK:Scoring problem]]<br />
[[Category:DOCK:Sampling problem]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Decoys&diff=8953Decoys2015-09-15T18:32:55Z<p>Teague Sterling: </p>
<hr />
<div>[[Decoy Theory | Decoys are...]]<br />
<br />
Decoys are important for judging the performance of [[molecular docking]] algorithms. <br />
<br />
If you want decoys for a molecule in ZINC, say 556, use <br />
http://zinc15.docking.org/substances/ZINC000000000556/decoys/3D.sdf<br />
<br />
If you want decoys in 2D:<br />
http://zinc15.docking.org/substances/ZINC000019632927/decoys/2D/<br />
<br />
http://zinc15.docking.org/apps/mol/decoys?for=CN1CCN(CC(=O)N2c3ccccc3C(=O)Nc3cccnc32)CC1<br />
<br />
== Examples ==<br />
<br />
=== Getting decoys for Aspirin (ZINC000000000053) ===<br />
2D Decoys (need to generate DB2 files and compute charge yourself)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D?><br />
Download: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D.smi?count=100><br />
<br />
3D Decoys (Can DOCK directly)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D><br />
Download (SMILES & Explicit Charge): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.smi?net_charge=-1&count=100><br />
Download (DB2): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.db2.gz?count=100><br />
<br />
Allowed args: <br />
* count: How many<br />
* resolve: Look up zinc_ids if possible<br />
* unique: Only return unique decoys (if you POST with "for" as a file instead)<br />
<br />
You could also do decoys "by hand" to have more control:<br />
http://zinc15.docking.org/substances/?~ecfp4_fp-unsorted_tanimoto,.2=zinc55&mwt-between=270,330&logp-between=1.8,2.4&purchasability=for-sale<br />
<br />
=== Getting Decoys for a SMILES (e.g. not in ZINC) ===<br />
Getting Decoys for a SMILES string can be written as a general ZINC query on either the '''substances''' (2D) resource or the '''protomers''' (3D) resource. Constraints can be added for further refinement.<br />
Currently you need a SMILES string (e.g. `CN(C)C(=O)c1ccc(O)cc1`) as well as the physical properties of that compound (e.g. molecular weight, logP). This requirement will soon be removed and replaced with a simple form.<br />
<br />
Molecular Properties (Criteria) → URL Parameters:<br />
<br />
SMILES: CN(C)C(=O)c1ccc(O)cc1 (ECFP4-tanimoto < 0.25) → ~substance.ecfp4_fp-unsorted_tanimoto-25=CN(C)C(=O)c1ccc(O)cc1<br />
MWT: 165.2 (+/- 10%) → mwt-between=148.67,181.72<br />
LogP: 1.1 (+/- 0.30) → logp-between=0.8,1.4<br />
Donors: 1 (==) → hbd=1<br />
Acceptors: 2 (==) → hba=2<br />
Charge: 0 (==) → charge=0<br />
Rotatable Bonds: 1 (==) → rb=1<br />
<br />
As a ZINC query these constraints would be:<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-unsorted-tanimoto-25=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&charge=0&rb=1><br />
<br />
This query can be made faster by requesting only 50 and adding a few additional execution rules: count=50&parallelize=no&distinct=no<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-unsorted-tanimoto-25=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&charge=0&rb=1&count=50&parallelize=no&distinct=no><br />
<br />
[[DUDE]] is a free directory of useful decoys for [[virtual screening]].<br />
<br />
For more information on preparation see<br />
<br />
[[Automated_Database_Preparation#Automatic_Decoy_Generation]]<br />
<br />
<br />
[[Category:Jargon]]<br />
[[Category:DOCK:Scoring problem]]<br />
[[Category:DOCK:Sampling problem]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Decoys&diff=8952Decoys2015-09-15T18:23:11Z<p>Teague Sterling: /* = More Examples */</p>
<hr />
<div>[[Decoy Theory | Decoys are...]]<br />
<br />
Decoys are important for judging the performance of [[molecular docking]] algorithms. <br />
<br />
If you want decoys for a molecule in ZINC, say 556, use <br />
http://zinc15.docking.org/substances/ZINC000000000556/decoys/3D.sdf<br />
<br />
If you want decoys in 2D:<br />
http://zinc15.docking.org/substances/ZINC000019632927/decoys/2D/<br />
<br />
http://zinc15.docking.org/apps/mol/decoys?for=CN1CCN(CC(=O)N2c3ccccc3C(=O)Nc3cccnc32)CC1<br />
<br />
== Examples ==<br />
<br />
=== Getting decoys for Aspirin (ZINC000000000053) ===<br />
2D Decoys (need to generate DB2 files and compute charge yourself)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D?><br />
Download: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D.smi?count=100><br />
<br />
3D Decoys (Can DOCK directly)<br />
Visualize: <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D><br />
Download (SMILES & Explicit Charge): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.smi?net_charge=-1&count=100><br />
Download (DB2): <http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.db2.gz?count=100><br />
<br />
Allowed args: <br />
* count: How many<br />
* resolve: Look up zinc_ids if possible<br />
* unique: Only return unique decoys (if you POST with "for" as a file instead)<br />
<br />
You could also do decoys "by hand" to have more control:<br />
http://zinc15.docking.org/substances/?~ecfp4_fp-unsorted_tanimoto,.2=zinc55&mwt-between=270,330&logp-between=1.8,2.4&purchasability=for-sale<br />
<br />
=== Getting Decoys for a SMILES (e.g. not in ZINC) ===<br />
Getting Decoys for a SMILES string can be written as a general ZINC query on either the '''substances''' (2D) resource or the '''protomers''' (3D) resource. Constraints can be added for further refinement.<br />
Currently you need a SMILES string (e.g. `CN(C)C(=O)c1ccc(O)cc1`) as well as the physical properties of that compound (e.g. molecular weight, logP). This requirement will soon be removed and replaced with a simple form.<br />
<br />
Molecular Properties (Criteria) → URL Parameters:<br />
<br />
SMILES: CN(C)C(=O)c1ccc(O)cc1 (ECFP4-tanimoto < 0.25) → ~substance.ecfp4_fp-tanimoto=CN(C)C(=O)c1ccc(O)cc1<br />
MWT: 165.2 (+/- 10%) → mwt-between=148.67,181.72<br />
LogP: 1.1 (+/- 0.30) → logp-between=0.8,1.4<br />
Donors: 1 (==) → hbd=1<br />
Acceptors: 2 (==) → hba=2<br />
Charge: 0 (==) → charge=0<br />
Rotatable Bonds: 1 (==) → rb=1<br />
<br />
As a ZINC query these constraints would be:<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-tanimoto=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&charge=0&rb=1><br />
<br />
This query can be made faster by requesting only 50 and adding a few additional execution rules: count=50&parallelize=no&distinct=no<br />
<br />
<http://zinc15.docking.org/protomers/?~substance.ecfp4_fp-tanimoto=CN(C)C(=O)c1ccc(O)cc1&mwt-between=148.67,181.72&logp-between=0.8,1.4&hbd=1&hba=2&charge=0&rb=1&count=50&parallelize=no&distinct=no><br />
<br />
[[DUDE]] is a free directory of useful decoys for [[virtual screening]].<br />
<br />
For more information on preparation see<br />
<br />
[[Automated_Database_Preparation#Automatic_Decoy_Generation]]<br />
<br />
<br />
[[Category:Jargon]]<br />
[[Category:DOCK:Scoring problem]]<br />
[[Category:DOCK:Sampling problem]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Decoys&diff=8950Decoys2015-09-15T17:13:44Z<p>Teague Sterling: </p>
<hr />
<div>[[Decoy Theory | Decoys are...]]<br />
<br />
Decoys are important for judging the performance of [[molecular docking]] algorithms. <br />
<br />
If you want decoys for a molecule in ZINC, say 556, use <br />
http://zinc15.docking.org/substances/ZINC000000000556/decoys/3D.sdf<br />
<br />
If you want decoys in 2D:<br />
http://zinc15.docking.org/substances/ZINC000019632927/decoys/2D/<br />
<br />
http://zinc15.docking.org/apps/mol/decoys?for=CN1CCN(CC(=O)N2c3ccccc3C(=O)Nc3cccnc32)CC1<br />
<br />
== More Examples ==<br />
<br />
Getting decoys for Aspirin<br />
<br />
2D Decoys (need to generate DB2 files and compute charge yourself)<br />
Visualize:<br />
<http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D?><br />
<br />
Download:<br />
<http://zinc15.docking.org/substances/ZINC000000000053/decoys/2D.smi?count=100><br />
<br />
3D Decoys (Can DOCK directly)<br />
Visualize:<br />
<http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D><br />
<br />
Download (SMILES & Explicit Charge):<br />
<http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.smi?net_charge=-1&count=100><br />
<br />
Download (DB2):<br />
<http://zinc15.docking.org/substances/ZINC000000000053/decoys/3D.db2.gz?count=100><br />
<br />
<br />
Allowed args: <br />
* count: How many<br />
* resolve: Look up zinc_ids if possible<br />
* unique: Only return unique decoys (if you POST with "for" as a file instead)<br />
<br />
<br />
You could also do decoys "by hand" to have more control:<br />
http://zinc15.docking.org/substances/?~ecfp4_fp-unsorted_tanimoto,.2=zinc55&mwt-between=270,330&logp-between=1.8,2.4&purchasability=for-sale<br />
<br />
[[DUDE]] is a free directory of useful decoys for [[virtual screening]].<br />
<br />
For more information on preparation see<br />
<br />
[[Automated_Database_Preparation#Automatic_Decoy_Generation]]<br />
<br />
<br />
[[Category:Jargon]]<br />
[[Category:DOCK:Scoring problem]]<br />
[[Category:DOCK:Sampling problem]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Decoys&diff=8949Decoys2015-09-15T17:11:34Z<p>Teague Sterling: </p>
<hr />
<div>[[Decoy Theory | Decoys are...]]<br />
<br />
Decoys are important for judging the performance of [[molecular docking]] algorithms. <br />
<br />
If you want decoys for a molecule in ZINC, say 556, use <br />
http://zinc15.docking.org/substances/ZINC000000000556/decoys/3D.sdf<br />
<br />
If you want decoys in 2D:<br />
http://zinc15.docking.org/substances/ZINC000019632927/decoys/2D/<br />
<br />
http://zinc15.docking.org/apps/mol/decoys?for=CN1CCN(CC(=O)N2c3ccccc3C(=O)Nc3cccnc32)CC1<br />
<br />
== Examples ==<br />
<br />
Getting decoys for Aspirin<br />
<br />
2D Decoys (need to generate DB2 files and compute charge yourself)<br />
Visualize:<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/2D<br />
<br />
Download:<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/2D.smi?count=100<br />
<br />
3D Decoys (Can DOCK directly)<br />
Visualize:<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/3D<br />
<br />
Download (SMILES & Explicit Charge):<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/3D.smi?net_charge=-1&count=100<br />
<br />
Download (DB2):<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/3D.db2.gz?count=100<br />
<br />
<br />
Allowed args: <br />
* count: How many<br />
* resolve: Look up zinc_ids if possible<br />
* unique: Only return unique decoys (if you POST with "for" as a file instead)<br />
<br />
<br />
You could also do decoys "by hand" to have more control:<br />
http://zinc15.docking.org/substances/?~ecfp4_fp-unsorted_tanimoto,.2=zinc55&mwt-between=270,330&logp-between=1.8,2.4&purchasability=for-sale<br />
<br />
[[DUDE]] is a free directory of useful decoys for [[virtual screening]].<br />
<br />
For more information on preparation see<br />
<br />
[[Automated_Database_Preparation#Automatic_Decoy_Generation]]<br />
<br />
<br />
[[Category:Jargon]]<br />
[[Category:DOCK:Scoring problem]]<br />
[[Category:DOCK:Sampling problem]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Decoys&diff=8948Decoys2015-09-15T17:11:10Z<p>Teague Sterling: Adding some example links</p>
<hr />
<div>[[Decoy Theory | Decoys are...]]<br />
<br />
Decoys are important for judging the performance of [[molecular docking]] algorithms. <br />
<br />
If you want decoys for a molecule in ZINC, say 556, use <br />
http://zinc15.docking.org/substances/ZINC000000000556/decoys/3D.sdf<br />
<br />
If you want decoys in 2D:<br />
http://zinc15.docking.org/substances/ZINC000019632927/decoys/2D/<br />
<br />
http://zinc15.docking.org/apps/mol/decoys?for=CN1CCN(CC(=O)N2c3ccccc3C(=O)Nc3cccnc32)CC1<br />
<br />
== Examples ==<br />
<br />
Getting decoys for Aspirin<br />
<br />
2D Decoys (need to generate DB2 files and compute charge yourself)<br />
Visualize:<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/2D<br />
<br />
Download:<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/2D.smi?count=100<br />
<br />
3D Decoys (Can DOCK directly)<br />
Visualize:<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/3D<br />
<br />
Download (SMILES & Explicit Charge):<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/3D.smi?net_charge=-1&count=100<br />
<br />
Download (DB2):<br />
zinc15.docking.org/substances/ZINC000000000053/decoys/3D.db2.gz?count=100<br />
<br />
<br />
Allowed args: <br />
* count: How many<br />
* resolve: Look up zinc_ids if possible<br />
* unique: Only return unique decoys (if you POST with "for" as a file instead)<br />
<br />
<br />
You could also do decoys "by hand" to have more control:<br />
http://zinc15.docking.org/substances/?~ecfp4_fp-unsorted_tanimoto,.2=zinc55&mwt-between=270,330&logp-between=1.8,2.4&purchasability=for-sale<br />
<br />
[[DUDE]] is a free directory of useful decoys for [[virtual screening]].<br />
<br />
For more information on preparation see<br />
<br />
[[Automated_Database_Preparation#Automatic_Decoy_Generation]]<br />
<br />
<br />
[[Category:Jargon]]<br />
[[Category:DOCK:Scoring problem]]<br />
[[Category:DOCK:Sampling problem]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Creating_clinical_name_mappings&diff=8919Creating clinical name mappings2015-08-13T22:03:42Z<p>Teague Sterling: Updating</p>
<hr />
<div>This is part of the curation of clinical trial data. <br />
<br />
= Day One =<br />
<pre><br />
create temp table subname as select sub_id_fk as sub_id_fk, who_name as name from catalog_item join catalog on (cat_id_fk=cat_id and short_name='chembl20') join chembl20.molecule_dictionary as md on supplier_code=md.chembl_id join chembl20.molecule_atc_classification as mac on md.molregno=mac.molregno join chembl20.atc_classification as ac on mac.level5=ac.level5;<br />
<br />
insert into subname select cs.sub_id_fk, s.synonym from catalog_substance as cs join synonym as s on cs.cat_content_fk = s.cat_content_fk where not exists (select 1 from subname as sn where sn.sub_id_fk = cs.sub_id_fk and sn.name = s.synonym);<br />
<br />
alter table subname add column q tsquery;<br />
<br />
update subname as s set q=plainto_tsquery(t.name) from subname as t where s.sub_id_fk=t.sub_id_fk and s.name=t.name;<br />
<br />
\copy ctinter from druginfo.txt<br />
<br />
alter table ctinter add column terms tsvector;<br />
<br />
update ctinter as x set terms = to_tsvector('english', y.name) from ctinter as y where x.id=y.id;<br />
<br />
create index subname_idx on subname using gist(q);<br />
create index ctinter_idx on ctinter using gin(terms);<br />
<br />
vacuum analyze verbose ctinter;<br />
vacuum analyze verbose subname;<br />
<br />
select * from ctinter join subname on terms@@q where sub_id_fk=53;<br />
select * from subname join ctinter on q @@ terms where to_tsvector('english', 'biotin') @@ q ;<br />
</pre><br />
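Conceptually, the `terms @@ q` joins above match a trial's intervention text against drug names by token containment. A rough pure-Python analogue of that matching logic (a sketch only: it ignores the stemming and stop-word removal that `to_tsvector`/`plainto_tsquery` also apply):

```python
import re

def tokens(text):
    """Lowercased word tokens; a crude stand-in for to_tsvector."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(query_name, document_text):
    """plainto_tsquery ANDs its tokens, so every query token
    must appear in the document's token set."""
    q = tokens(query_name)
    return bool(q) and q <= tokens(document_text)

print(matches("biotin", "Effect of Biotin supplementation"))  # True
print(matches("folic acid", "Folic acid plus iron"))          # True
print(matches("folic acid", "Acidified water"))               # False
```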
<br />
= Day Two =<br />
<br />
<pre><br />
<br />
Keeping track of a bit more here:<br />
<br />
(Assuming ctstatus is already loaded)<br />
<br />
create temporary table cttemp (nct varchar, title varchar, start_date varchar, phase varchar, status varchar);<br />
<br />
\copy cttemp from /nfs/work/teague/Projects/trials/extracted/trials.txt<br />
<br />
update cttemp set start_date = null where start_date='';<br />
<br />
insert into ct2 (ct_code, description, start_date, ctphase_fk, ctstatus_fk) select nct, title, start_date::date, ctphase_id, ctstatus_id from cttemp join ctphase on phase=ctphase.name join ctstatus on status=ctstatus.name;<br />
<br />
create temporary table ctinttemp (code varchar, kind varchar, name varchar, description varchar);<br />
<br />
\copy ctinttemp from /nfs/work/teague/Projects/trials/extracted/trial_drugs.txt<br />
<br />
insert into clinical1.ct2int (ct2_fk, name) select ct2_id, name from ctinttemp join clinical1.ct2 on code=ct_code;<br />
<br />
create index ix_ct2int_name on clinical1.ct2int (name);<br />
create index ix_ct2int_ct2_fk on clinical1.ct2int (ct2_fk);<br />
<br />
alter table ctinttemp add column terms tsvector;<br />
update ctinttemp set terms=to_tsvector('english', name); <br />
<br />
create index ix_ctinttemp_terms on ctinttemp using gin (terms);<br />
<br />
vacuum analyze verbose ctinttemp;<br />
<br />
select sub_id_fk, ct.ct2int_id, ctemp.name as intname, subname.name as subname into temp table ctsubinttemp from subname join ctinttemp as ctemp on ctemp.terms @@ subname.q join (select ct_code as code, ct2int.name as name, ct2int_id from ct2 join ct2int on ct2_fk=ct2_id) as ct on (ct.name=ctemp.name and ct.code=ctemp.code) where ctemp.kind in ('Drug', 'Nutritional Suppliement');<br />
<br />
-- insert into clinical1.ct2subint (sub_id_fk, ct2int_fk) select distinct sub_id_fk, ct2int_id from ctinttemp as ct join clinical1.ct2 on ct.code=ct_code join clinical1.ct2int on ct2_id=ct2_fk and clinical1.ct2int.name=ct.name join subname on ct.terms@@subname.query;<br />
<br />
-- insert into clinical1.ct2subint (sub_id_fk, ct2int_fk) select distinct sub_id_fk, ct2int_id from ctinttemp as ct join clinical1.ct2 on ct.code=ct_code join clinical1.ct2int as cti on ct2_id=cti.ct2_fk and cti.name=ct.name join subname on ct.terms@@subname.query where not exists(select 1 from clinical1.ct2subint as e where e.sub_id_fk=subname.sub_id_fk and e.ct2int_fk=cti.ct2int_id);<br />
<br />
</pre><br />
<br />
= Day Three =<br />
<pre><br />
create temporary table ctcondtemp (code varchar, name varchar, slug varchar);<br />
<br />
\copy ctcondtemp from /nfs/work/teague/Projects/trials/extracted/trial_conditions.txt<br />
<br />
insert into clinical1.ct2condition (short_name, name) select distinct on (slug) slug, name from ctcondtemp order by slug, name;<br />
<br />
insert into clinical1.ct2tocond (ct2_fk, ct2condition_fk) select ct2_id, ct2condition_id from ctcondtemp join clinical1.ct2 on code=ct_code join clinical1.ct2condition on slug=short_name;<br />
<br />
alter table ctcondtemp add column terms tsvector;<br />
update ctcondtemp set terms=to_tsvector('english', name);<br />
create index ix_ctcondtemp_terms on ctcondtemp using gist(terms);<br />
vacuum analyze verbose ctcondtemp;<br />
<br />
insert into clinical1.ct2condclass (name, description) values ('cancer', 'All classes of cancer') returning ct2condclass_id;<br />
<br />
# returns 1<br />
<br />
update clinical1.ct2condition set condclass_fk=1 where ct2condition_id in (select ct2condition_id from ctcondtemp join clinical1.ct2condition on slug=short_name where terms @@ to_tsquery('cancer | neoplasm | oncology | leukemia | melanoma | carcinoma '));<br />
<br />
</pre><br />
<br />
[[Category:Curator]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=DISI:About&diff=8796DISI:About2015-06-27T18:32:09Z<p>Teague Sterling: </p>
<hr />
<div>[http://wiki.compbio.ucsf.edu/wiki/ DISI], an acronym for DISI Is Still Incomplete or perhaps Documentation Is Still Incomplete.<br />
<br />
This wiki documents software and databases in both the [[Shoichet Lab]] and [[Irwin Lab]] at [[UCSF]]. These are:<br />
* target-based virtual screening (molecular docking)<br />
* ligand-based ligand discovery<br />
* cheminformatics and other methods of computational ligand discovery<br />
* associated tools, approaches and ideas.<br />
<br />
If this interests you, we invite you to [[contribute]]. We assert [[DISI:Copyrights | copyright]].<br />
<br />
[[Category:Info]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Irwin_Lab&diff=8795Irwin Lab2015-06-27T18:31:00Z<p>Teague Sterling: Created page with "The Irwin Lab at UCSF and University of Toronto is [http://irwinlab.compbio.ucsf.edu here]. Lab members can read more here. Category:Org..."</p>
<hr />
<div>The Irwin Lab at [[UCSF]] and [[University of Toronto]] is<br />
[http://irwinlab.compbio.ucsf.edu here].<br />
<br />
Lab members can read more [[Welcome group members |here]].<br />
<br />
[[Category:Organizations]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Main_Page&diff=8794Main Page2015-06-27T18:30:12Z<p>Teague Sterling: </p>
<hr />
<div><div id="mainpage"></div><!--<br />
-->__NOTOC__<br />
Welcome to [[DISI:About | DISI]], a wiki for computational pharmacology and ligand discovery in the [[Shoichet Lab]], [[Irwin Lab]] and docking.org. If you find this site confusing at first, DON'T PANIC! It has several different purposes and serves several different [[:Category:Roles | constituencies]]. Please [[Contribute |work with us and help us improve]] it. If you are new and not a member of the lab, you might try [[Welcome web user]].<br />
<br />
== So, Why are you here? ==<br />
You may be interested in one of the main research areas of the lab, among which are:<br />
* [[:Category:Docking|Molecular Docking and ligand discovery]]<br />
* [[:Category:Systems pharmacology | Systems pharmacology]]<br />
* [[:Category:Aggregation| Colloidal aggregation of small molecules]]<br />
Or try one of the other [[:Category:Topic | topics]] of this wiki.<br />
<br />
== OK, so who are you? ==<br />
You may browse based on who you are:<br />
* [[Welcome group members | Lab members, and lab guests with ssh access]]<br />
* [[Welcome web user | Everyone else ]]<br />
<br />
== What sort of question do you have? ==<br />
This site is also organized by the types of questions the article aims to answer:<br />
<br />
=== [[:Category:Manual | WHAT ]] ===<br />
Manuals for using lab software and tools, descriptions of programs, packages, websites, etc.<br />
<br />
=== [[:Category:Tutorials | HOW ]] ===<br />
Tutorials and step-by-step instructions of specific use cases.<br />
<br />
=== [[:Category:Theory | WHY ]] ===<br />
Theory and conceptual explanations of techniques. These are often not tied to a single program, package, or website and may differ from reality.<br />
<br />
=== [[:Category:Article_type | Other ]] ===<br />
Everything else<br />
<br />
== Still haven't found what you're looking for? ==<br />
Try the search bar top right to see if that works...<br />
<br />
[[Category:Info]]<br />
[[Category:Organization]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Main_Page&diff=8687Main Page2015-06-03T16:33:52Z<p>Teague Sterling: /* Type of Article */</p>
<hr />
<div><div id="mainpage"></div><!--<br />
-->__NOTOC__<br />
Welcome to [[DISI:About | DISI]], a wiki for computational pharmacology and ligand discovery. If you find this site confusing at first, DON'T PANIC! It has several different purposes and aims to serve several different constituencies. Please [[Contribute |work with us and help us improve]] it. Should you prefer the OLD MAIN PAGE, it is [[Old_Main | still available]]. Also, the [[Welcome web user]] page looks a lot like the old main page.<br />
<br />
== So, Why are you here? ==<br />
You may be interested in one of the main research areas of the lab. These are:<br />
* [[:Category:Docking|Molecular Docking and ligand discovery]]<br />
* [[:Category:Systems pharmacology | Systems pharmacology]]<br />
* [[:Category:Aggregation| Colloidal aggregation of small molecules]]<br />
or you may be interested in one of the other [[:Category:Topic | topics]] covered on this wiki.<br />
<br />
== OK, so who are you? ==<br />
You may prefer to browse this site based on who you are:<br />
* [[Welcome group members | Lab members, and lab guests with ssh access]]<br />
* [[Welcome web user | Everyone else ]]<br />
<br />
== Type of Article ==<br />
This site is also organized by the types of questions the article aims to answer:<br />
<br />
=== [[:Category:Manual | WHAT ]] ===<br />
Manuals for using lab software and tools, descriptions of programs, packages, websites, etc.<br />
<br />
=== [[:Category:Tutorials | HOW ]] ===<br />
Tutorials and step-by-step instructions of specific use cases.<br />
<br />
=== [[:Category:Theory | WHY ]] ===<br />
<br />
Theory and conceptual explanations of techniques. These are often not tied to a single program, package, or website and may differ from reality.<br />
<br />
=== [[:Category:Article_type | Other ]] ===<br />
Everything else<br />
<br />
You may also just use the search bar at the top right to try to find what you are looking for...<br />
<br />
[[Category:Info]]<br />
[[Category:Organization]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Main_Page&diff=8686Main Page2015-06-03T16:33:26Z<p>Teague Sterling: /* Type of Article */</p>
<hr />
<div><div id="mainpage"></div><!--<br />
-->__NOTOC__<br />
Welcome to [[DISI:About | DISI]], a wiki for computational pharmacology and ligand discovery. If you find this site confusing at first, DON'T PANIC! It has several different purposes and aims to serve several different constituencies. Please [[Contribute |work with us and help us improve]] it. Should you prefer the OLD MAIN PAGE, it is [[Old_Main | still available]]. Also, the [[Welcome web user]] page looks a lot like the old main page.<br />
<br />
== So, Why are you here? ==<br />
You may be interested in one of the main research areas of the lab. These are:<br />
* [[:Category:Docking|Molecular Docking and ligand discovery]]<br />
* [[:Category:Systems pharmacology | Systems pharmacology]]<br />
* [[:Category:Aggregation| Colloidal aggregation of small molecules]]<br />
or you may be interested in one of the other [[:Category:Topic | topics]] covered on this wiki.<br />
<br />
== OK, so who are you? ==<br />
You may prefer to browse this site based on who you are:<br />
* [[Welcome group members | Lab members, and lab guests with ssh access]]<br />
* [[Welcome web user | Everyone else ]]<br />
<br />
== Type of Article ==<br />
This site is also organized by the types of questions the article aims to answer:<br />
<br />
=== [[:Category:Manual | WHAT ]] ===<br />
Manuals for using lab software and tools, descriptions of programs, packages, websites, etc.<br />
<br />
<br />
=== [[:Category:Tutorials | HOW ]] ===<br />
Tutorials and step-by-step instructions for specific use cases.<br />
<br />
<br />
=== [[:Category:Theory | WHY ]] ===<br />
<br />
Theory and conceptual explanations of techniques. These are often not tied to a single program, package, or website and may differ from reality.<br />
<br />
<br />
=== [[:Category:Article_type | Other ]] ===<br />
Everything else<br />
<br />
You may also use the search bar at the top right to find what you are looking for.<br />
<br />
[[Category:Info]]<br />
[[Category:Organization]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Best:Desktop_Usage&diff=8681Best:Desktop Usage2015-06-03T16:29:18Z<p>Teague Sterling: Creating</p>
<hr />
<div>== Disk Encryption ==<br />
<br />
<br />
== Virtual Machine ==<br />
<br />
<br />
== Backups ==<br />
<br />
<br />
== Home Directories ==</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Welcome_group_members&diff=8678Welcome group members2015-06-03T16:27:54Z<p>Teague Sterling: </p>
<hr />
<div>Welcome to the lab! This page organizes topics of interest to group members, our collaborators and anyone else who wishes to access our cluster via ssh. If you are not a member of the lab, please see [[Welcome web user]]. If you have ssh access, [[:Category:Internal]] articles may be of interest to you.<br />
<br />
=== [[New Lab Members]] ===<br />
Please read this first to get started using the cluster!<br />
<br />
{{TOCright}}<br />
<br />
= Info =<br />
* [[Group Meeting]]<br />
* [[Reimbursement Instructions]]<br />
* [[Lab Security Policy]]<br />
* [[Disk space policy]]<br />
* [[Backups]]<br />
* [[Tutorials]]<br />
<br />
= Best Practices = <br />
* [[Best:SGE Usage]]<br />
* [[Best:Disk Usage]]<br />
* [[Best:Desktop Usage]]<br />
* [[Best:More]]<br />
<br />
= Computers =<br />
Please see our [[:Category:Cluster | cluster]] page.<br />
You might be interested to know about the [[Disk types | types of disk]] we currently support.<br />
<br />
In Cluster 2, you log in to sgehead.ucsf.bkslab.org aka gimel.compbio.ucsf.edu. If you need fortran, ssh to fortran. If you need ppilot, ssh to ppilot. You should not need to log in to any other machine.<br />
<br />
In Cluster 0, you log in to sgehead.bkslab.org. You should be able to do everything from there.<br />
<br />
= Access from home = <br />
* [[How to generate ssh keys securely]]<br />
* [[How_to_create_a_vpn/ssh_tunnel]]<br />
<br />
= Good habits = <br />
Set aside a quiet hour once a month to review your disk usage, and compress, delete, or move anything excessive. There are disk quotas, and we do monitor them. We will make additional space available for needed lab work, but we need your cooperation to keep the signal-to-noise ratio high. Ask us for a [[Personal backup disk]].<br />
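A minimal sketch of such a monthly review, assuming only standard POSIX tools (the function name and the five-entry cutoff are arbitrary choices, not a lab-provided script):

```shell
# List the five largest entries directly under a directory, in KB,
# as a starting point for deciding what to compress, delete, or move.
review() {
    dir="${1:-$HOME}"
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -5
}
```

Run it against your home directory (or a scratch directory) and work down the list.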
<br />
= Synchrotron trips = <br />
We can help arrange to store your data in a safe place. Ask before you leave.<br />
<br />
= Third party software and databases = <br />
See the main article on [[:Category:Third party software | third party software]] that we maintain on our cluster.<br />
<br />
= Updates = <br />
Some software must be updated, or at least attended to, annually due to license expiry. Otherwise, we generally update software on an as-needed basis. If you want something updated, ask us and give us some time. Even better, if you can take the lead, it will almost certainly get done faster.<br />
<br />
= When you leave the lab = <br />
Please meet with John one month before you leave the lab to agree on which of your files will remain, and where they will live.<br />
We can provide you with portable disks so you can take software and data with you, if you like. If you want to take the lab software with you, we can help with that too.<br />
<br />
= FOTL: Friends of the lab = <br />
* [[Travel Depth]], [[qnifft DOCK 3.6 conversion| QNIFFT]] and [http://crystal.med.upenn.edu/software.html related software] for structural analysis from the [http://crystal.med.upenn.edu/ Sharp lab]<br />
* [[PLOP]] - protein modeling program from the [http://francisco.compbio.ucsf.edu/~jacobson/ Jacobson group].<br />
* [[Modeller]] - comparative modeling program from the [http://salilab.org Sali Group]. <br />
* Software vendors: [[OpenEye]], [[ChemAxon]], [[Molinspiration]]<br />
* Software providers: [[rdkit]], [[Knime]]<br />
* [http://zinc.docking.org/browse/catalogs/purchasable.php Compound vendors] and [http://zinc.docking.org/browse/catalogs/annotated.php Annotated catalog providers]<br />
* [[Contract Research Organizations]]<br />
* Transformative databases: [[DrugBank]], [[HMDB]], [[ChEMBL]].<br />
<br />
= Special pages for certain people = <br />
* [[:Category:Internal]] - ssh-level access for group members, lab visitors, collaborators<br />
* [[:Category:Sysadmin]] - cluster creator / software installer / system administrator<br />
* [[:Category:Developer]] - github user / software developer<br />
* [[:Category:Curator]] - database curator<br />
<br />
[[Category:Internal]]<br />
[[Category:FAQ]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Disk_Quotas&diff=8512Disk Quotas2015-04-27T16:55:18Z<p>Teague Sterling: Including shares</p>
<hr />
<div>Quota management on (new) XFS disks (including /nfs/work and /raidb):<br />
<br />
<br />
 xfs_quota -xc 'limit bsoft=1t bhard=1.5t teague' /raidb<br />
 xfs_quota -xc report /raidb<br />
 xfs_quota -xc help<br />
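When several users need limits at once, a dry-run generator can print the xfs_quota commands for review before they are piped to a shell on the fileserver (the function name and the 1t/1.5t limits are illustrative assumptions):

```shell
# Print one xfs_quota limit command per user for the given filesystem.
# Review the output, then pipe it to sh on the fileserver to apply.
gen_quota_cmds() {
    fs="$1"; shift
    for user in "$@"; do
        echo "xfs_quota -xc 'limit bsoft=1t bhard=1.5t $user' $fs"
    done
}
```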
<br />
<br />
Also see the website: [https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/xfsquota.html]<br />
<br />
[[Category:Sysadmin]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=PuppetTricks&diff=8401PuppetTricks2015-03-31T17:52:13Z<p>Teague Sterling: /* On Server (within 60 seconds) */</p>
<hr />
<div>This page is a collection of tricks and tips for using Puppet to administer systems.<br />
<br />
The names '''master''', '''puppetmaster''', and '''foreman''' all refer (at the time of writing) to alpha. The name '''client''' refers to any machine that is maintained by puppet.<br />
<br />
<br />
== Regenerating a Certificate ==<br />
<br />
=== On Client ===<br />
 $ sudo service puppet stop<br />
$ sudo mv /var/lib/puppet/ssl /var/lib/puppet/ssl~<br />
$ puppet agent --no-daemonize --onetime --verbose --waitforcert=60<br />
<br />
=== On Server (within 60 seconds) ===<br />
$ sudo puppet cert clean <client hostname><br />
$ sudo puppet cert sign <client hostname><br />
'''OR''' if you wish to allow DNS aliases<br />
$ sudo puppet cert --allow-dns-alt-names sign <client hostname><br />
<br />
Note this can also be done through Foreman by going to the [https://foreman.ucsf.bkslab.org/smart_proxies/1-puppetmaster-cluster-ucsf-bkslab-org/puppetca Infrastructure -> Smart Proxies -> Puppetmaster -> Certificates page]<br />
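The two server-side variants above can be wrapped in a small dry-run helper that prints the commands to execute on the master (cert_cmds and its "alt" flag are illustrative, not an existing tool):

```shell
# Print the server-side certificate commands for a client hostname.
# Pass "alt" as the second argument to allow DNS alt names when signing.
cert_cmds() {
    host="$1"
    echo "puppet cert clean $host"
    if [ "$2" = "alt" ]; then
        echo "puppet cert --allow-dns-alt-names sign $host"
    else
        echo "puppet cert sign $host"
    fi
}
```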
<br />
=== On Client ===<br />
The previous run should finish without errors (errors are in purple). It should then be possible to run `sudo puppet agent -t` without any waiting or errors.</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=PuppetTricks&diff=8400PuppetTricks2015-03-31T17:51:02Z<p>Teague Sterling: /* On Client */</p>
<hr />
<div>This page is a collection of tricks and tips for using Puppet to administer systems.<br />
<br />
The names '''master''', '''puppetmaster''', and '''foreman''' all refer (at the time of writing) to alpha. The name '''client''' refers to any machine that is maintained by puppet.<br />
<br />
<br />
== Regenerating a Certificate ==<br />
<br />
=== On Client ===<br />
 $ sudo service puppet stop<br />
$ sudo mv /var/lib/puppet/ssl /var/lib/puppet/ssl~<br />
$ puppet agent --no-daemonize --onetime --verbose --waitforcert=60<br />
<br />
=== On Server (within 60 seconds) ===<br />
$ sudo puppet cert clean <client hostname><br />
$ sudo puppet cert sign <client hostname><br />
'''OR''' if you wish to allow DNS aliases<br />
$ sudo puppet cert --allow-dns-alt-names sign <client hostname><br />
<br />
Note this can also be done through Foreman by going to the [https://foreman.ucsf.bkslab.org/smart_proxies/1-puppetmaster-cluster-ucsf-bkslab-org/puppetca Infrastructure -> Smart Proxies -> Puppetmaster -> Certificates page]<br />
<br />
=== On Client ===<br />
The previous run should finish without errors (errors are in purple). It should then be possible to run `sudo puppet agent -t` without any waiting or errors.</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=PuppetTricks&diff=8399PuppetTricks2015-03-31T17:50:52Z<p>Teague Sterling: Created page with "This page is a collection of tricks and tips for using Puppet to administer systems. The names '''master''', '''puppetmaster''', and '''foreman''' all refer to (at the time o..."</p>
<hr />
<div>This page is a collection of tricks and tips for using Puppet to administer systems.<br />
<br />
The names '''master''', '''puppetmaster''', and '''foreman''' all refer (at the time of writing) to alpha. The name '''client''' refers to any machine that is maintained by puppet.<br />
<br />
<br />
== Regenerating a Certificate ==<br />
<br />
=== On Client ===<br />
 $ sudo service puppet stop<br />
$ sudo mv /var/lib/puppet/ssl /var/lib/puppet/ssl~<br />
$ puppet agent --no-daemonize --onetime --verbose --waitforcert=60<br />
<br />
=== On Server (within 60 seconds) ===<br />
$ sudo puppet cert clean <client hostname><br />
$ sudo puppet cert sign <client hostname><br />
'''OR''' if you wish to allow DNS aliases<br />
$ sudo puppet cert --allow-dns-alt-names sign <client hostname><br />
<br />
Note this can also be done through Foreman by going to the [https://foreman.ucsf.bkslab.org/smart_proxies/1-puppetmaster-cluster-ucsf-bkslab-org/puppetca Infrastructure -> Smart Proxies -> Puppetmaster -> Certificates page]<br />
<br />
=== On Client ===<br />
The previous run should finish without errors (errors are in purple). It should then be possible to run `sudo puppet agent -t` without any waiting or errors.</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7893Backups2014-06-03T20:46:00Z<p>Teague Sterling: /* Quarterly backups, starting first day of quarter */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster 2: dalet:/srv/home/ ('''/nfs/home''')<br />
<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: bet:/srv/store ('''/nfs/store''')<br />
* Cluster 2: bet:/srv/soft ('''/nfs/soft''')<br />
* Cluster 2: shin:/db ('''/nfs/db''')*<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
* Open-E DSS requires special backup consideration<br />
<br />
== Never backed up (to tape) == <br />
* Cluster 2: bet:/srv/work ('''/nfs/work''')<br />
* Laptops<br />
* Desktops (Backups can be configured locally)<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
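Purely as an illustration of the rotation described above (the real rotation is managed by hand, and the week-to-savepack mapping here is an assumption), the four-savepack cycle can be expressed as:

```shell
# Map an ISO week number (1-52) to one of four rotating savepacks.
savepack_for_week() {
    week="$1"
    echo $(( (week - 1) % 4 + 1 ))
}
```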
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7892Backups2014-06-03T20:45:38Z<p>Teague Sterling: /* Weekly backups, starting Friday night */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster 2: dalet:/srv/home/ ('''/nfs/home''')<br />
<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: shin:/db ('''/nfs/db''')*<br />
* Cluster 2: bet:/srv/store ('''/nfs/store''')<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
* Open-E DSS requires special backup consideration<br />
<br />
== Never backed up (to tape) == <br />
* Cluster 2: bet:/srv/work ('''/nfs/work''')<br />
* Laptops<br />
* Desktops (Backups can be configured locally)<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7891Backups2014-06-03T20:45:25Z<p>Teague Sterling: /* Quarterly backups, starting first day of quarter */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster2: dalet:/srv/home/ ('''/nfs/home''')<br />
<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: shin:/db ('''/nfs/db''')*<br />
* Cluster 2: bet:/srv/store ('''/nfs/store''')<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
* Open-E DSS requires special backup consideration<br />
<br />
== Never backed up (to tape) == <br />
* Cluster 2: bet:/srv/work ('''/nfs/work''')<br />
* Laptops<br />
* Desktops (Backups can be configured locally)<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7890Backups2014-06-03T20:44:38Z<p>Teague Sterling: /* Never backed up */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster2: dalet:/srv/home/ ('''/nfs/home''')<br />
<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: shin:/db ('''/nfs/db''')*<br />
* Cluster 2: bet:/srv/store ('''/nfs/store''')<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
<br />
* *Open-E DSS requires special backup consideration<br />
<br />
== Never backed up (to tape) == <br />
* Cluster 2: bet:/srv/work ('''/nfs/work''')<br />
* Laptops<br />
* Desktops (Backups can be configured locally)<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7889Backups2014-06-03T20:44:01Z<p>Teague Sterling: /* Weekly backups, starting Friday night */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster2: dalet:/srv/home/ ('''/nfs/home''')<br />
<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: shin:/db ('''/nfs/db''')*<br />
* Cluster 2: bet:/srv/store ('''/nfs/store''')<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
<br />
* *Open-E DSS requires special backup consideration<br />
<br />
== Never backed up == <br />
* Cluster 2: /nfs/work<br />
* Laptops<br />
* Desktops<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7888Backups2014-06-03T20:43:45Z<p>Teague Sterling: /* Quarterly backups, starting first day of quarter */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster2: dalet:/srv/nfs/home/<br />
<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: shin:/db ('''/nfs/db''')*<br />
* Cluster 2: bet:/srv/store ('''/nfs/store''')<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
<br />
* *Open-E DSS requires special backup consideration<br />
<br />
== Never backed up == <br />
* Cluster 2: /nfs/work<br />
* Laptops<br />
* Desktops<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7887Backups2014-06-03T20:43:03Z<p>Teague Sterling: /* Weekly backups, starting Friday night */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster2: dalet:/srv/nfs/home/<br />
<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: shin:/nfs/db *<br />
* Cluster 2: bet:/nfs/store<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
<br />
* *Open-E DSS requires special backup consideration<br />
<br />
== Never backed up == <br />
* Cluster 2: /nfs/work<br />
* Laptops<br />
* Desktops<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Backups&diff=7886Backups2014-06-03T20:42:49Z<p>Teague Sterling: /* Quarterly backups, starting first day of quarter */</p>
<hr />
<div>= Policy = <br />
<br />
{{TOCright}}<br />
There are at least three ways to make backups: <br />
* a) We give you a 4TB portable drive, you keep it and write to it. Ask for one. <br />
* b) Use CrashPlan/rsync and/or DropBox/Sync/Box.com to create backups in the cloud.<br />
* c) We back up major cluster disks to tape, as follows:<br />
<br />
== Weekly backups, starting Friday night == <br />
* Cluster2: /nfs/home/<br />
== Quarterly backups, starting first day of quarter == <br />
* Cluster 2: shin:/nfs/db *<br />
* Cluster 2: bet:/nfs/store<br />
* Cluster 2: aleph:/var/lib/libvirt/images<br />
* Cluster 2: tet:/var/lib/libvirt/images<br />
<br />
* *Open-E DSS requires special backup consideration<br />
<br />
== Never backed up == <br />
* Cluster 2: /nfs/work<br />
* Laptops<br />
* Desktops<br />
* No other user files are backed up, ever, unless by special request and agreed by email exchange.<br />
<br />
== Tape rotation and length of historical data ==<br />
* We have four savepacks (sets of tapes, in a suitcase) that we rotate for /nfs/home. <br />
* We set aside one of these savepacks quarterly, on the first backup of the quarter, as an offsite backup.<br />
* Thus at any time there is a weekly backup made starting Friday night for the last four weeks, and a quarterly backup for the last four quarters.<br />
* We also pull savepacks out of rotation from time to time to provide additional long-term backups.<br />
<br />
= Note =<br />
At time of writing, this policy is only implemented on [[Cluster 2]]. Select backups of [[Cluster 0]] will follow soon.<br />
<br />
= Restore = <br />
To have files restored from backup, please write to the [[sysadmin]]s, stating the file or directory name, the approximate time at which the version you want restored existed, and when that copy was deleted or damaged.<br />
<br />
= See Also = <br />
* [[Disk space policy]]<br />
<br />
Return to [[:Category:Sysadmin | system administrator's guide]].<br />
<br />
[[Category:Sysadmin]]<br />
[[Category:Internal]]<br />
[[Category:Policy]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Disk_space_policy&diff=7885Disk space policy2014-06-03T17:08:52Z<p>Teague Sterling: /* Changing quotas from hard limits to soft limits and fixing * bullet point */</p>
<hr />
<div>This is the lab disk space policy and applies to those with ssh access to our cluster only. This policy currently applies to [[cluster 2]]. It will apply to [[cluster 0]] with caveats by June 1, 2014. <br />
We ask you to observe the following disk usage policy for your account on the cluster. We think this policy will cover 90% of the users, 90% of the time. We do understand that research can have unanticipated needs, and we are willing to work with you to create a plan that works for everyone.<br />
<br />
{{TOCright}}<br />
<br />
= Checking Disk Usage =<br />
You can check your disk usage on the cluster NFS drives (excluding /nfs/db) by running the '''quota''' command on any cluster machine.<br />
<br />
Example Output:<br />
<br />
[teague@gimel ~]$ quota -s<br />
Disk quotas for user teague (uid 42001): <br />
Filesystem blocks quota limit grace files quota limit grace<br />
nfs_home:/export/home/<br />
321M 500G 512G 472 0 0 <br />
nfs_work:/export/work/<br />
114M 500G 512G 3 0 0 <br />
nfs_store:/export/store/<br />
4 500G 512G 1 0 0<br />
<br />
= Treatment by disk type =<br />
When you log in to a cluster computer, you are in your home directory. This would normally be /nfs/home/<userid>. Lab tradition is to have a ~/code/ directory for all software and a ~/work/ directory for all of your research. Your home directory is backed up weekly, and has a quota of 500 GB. Keeping the quota at this level will allow us to provide a high performance backup system that runs regularly and completes in a timely fashion.<br />
<br />
== Docking jobs ==<br />
If you require more than your quota for ~/work/, we will create a special directory for you, e.g. /nfs/work/<userid>, which we suggest you symlink to ~/work/. We expect most lab members engaged in multiple docking projects will require space on /nfs/work. /nfs/work will never be backed up. We recommend you repatriate key files from time to time from /nfs/work/<userid> to e.g. ~/worksave/. We can help you write scripts to automate this. You are also welcome to make your own supplementary backups to a USB drive on your workstation or laptop. We have USB drives you may have for this purpose.<br />
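A minimal sketch of such a repatriation script, assuming the key files can be matched by a pattern (here *.txt; the pattern, function name, and paths are illustrative, not lab policy):

```shell
# Copy files matching a pattern from un-backed-up scratch space into a
# backed-up directory, preserving the relative layout.
repatriate() {
    src="$1"; dest="$2"
    (cd "$src" && find . -name '*.txt') | while IFS= read -r f; do
        mkdir -p "$dest/$(dirname "$f")"
        cp "$src/$f" "$dest/$f"
    done
}
```

For example, repatriate /nfs/work/<userid>/project ~/worksave/project would copy only the matched key files, leaving bulky transient output behind.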
<br />
Historical note: in the past, we attempted to back up nearly every file on every disk, including millions of transient docking job output files. This was unsatisfactory: the backup process made computers slow, and it was hard to tell what was actually backed up. By backing up only /nfs/home/<userid>, we are forcing you to choose the key input files from which you can re-create your output. You may also wish to use /nfs/store/<userid> to retain important results. But we want to get away from the idea of backing up every file on every disk, particularly files of transient existence and marginal value.<br />
<br />
== Dockable database files ==<br />
If you create dockable database files that persist and are larger than your quota, ask us to create dedicated space for you, e.g. /nfs/db/<userid>. The hallmarks of database files are that they are large, often used by more than one person, and persistent. Database files are backed up quarterly, beginning on the first day of the quarter.<br />
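Because '''quota''' output excludes /nfs/db, the simplest way to see how much database space you occupy is '''du'''. The /nfs/db/<userid> layout is an assumption; the demo below points at a scratch directory so it can run anywhere:<br />

```shell
# du sums everything under a directory; quota(1) on the cluster
# excludes /nfs/db, so use du to check database space.
# DB_DIR stands in for /nfs/db/<userid> (an assumed path).
DB_DIR=$(mktemp -d)
head -c 4096 /dev/zero > "$DB_DIR/ligands.db"   # 4 KiB stand-in database file

usage_kb=$(du -sk "$DB_DIR" | cut -f1)          # total KiB under DB_DIR
echo "$DB_DIR uses ${usage_kb}K"
```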
<br />
<br />
[[Category:Policy]]<br />
[[Category:Internal]]<br />
<br />
== Large datasets, completed projects ==<br />
If you have large files from the synchrotron or elsewhere, or a project has finished but you want to keep it online, ask us to create dedicated space for you, e.g. /nfs/store/<userid>. Storage files are backed up quarterly beginning on the first day of the quarter. <br />
<br />
If after using /nfs/db, /nfs/work and /nfs/store to offload from /nfs/home/ you still need more space, please work with the [[sysadmin]]s to help you.<br />
<br />
= Workstation and Laptop =<br />
When you log in to your workstation, you may be in a local directory on the workstation. Our policy is to never back up workstations. You may back up your own workstation, and you may make additional backups of your data on the cluster by writing to USB disks mounted on your desktop or laptop. We offer as a parting gift from the lab two multi-TB drives to which you may copy your files. Make one copy to take with you and one copy for us to keep safe for you. Label them clearly in ink. Then, delete the files from the server as you leave the lab, provided the project has been published.<br />
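A parting backup of the kind described above can be sketched as follows. The archive name is an illustrative choice, and temporary directories stand in for your data and the mounted USB drive so the sketch runs anywhere:<br />

```shell
# $src stands in for the data you want to keep; $usb for the mounted
# USB drive (e.g. /media/usb-backup - an assumed mount point).
src=$(mktemp -d)
usb=$(mktemp -d)
echo "thesis figures" > "$src/figures.txt"

# One dated, compressed archive per copy; make one to take with you
# and a second, identical one for the lab to keep safe.
archive="$usb/home-backup-$(date +%F).tar.gz"
tar -czf "$archive" -C "$src" .

tar -tzf "$archive"   # list the contents to verify the copy
```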
<br />
= Summary = <br />
We back up home directories weekly, and selected database, crystallography and other archived files quarterly. We do not back up /nfs/work files, and we do not back up workstations or laptops. To back up files from /nfs/work, you must repatriate the key files to your home directory, whence they will be backed up as part of normal procedures. To back up a laptop or desktop, use USB disks, which we can give you.<br />
<br />
Our backup systems cannot keep up with the growth in disk space and the voracious appetite for disk of group members. This is a pragmatic policy that, while requiring a little organization, is more likely to deliver restored files you can actually rely on.<br />
<br />
As always, we welcome [[feedback]].<br />
<br />
{| class="wikitable"<br />
|-<br />
! Name !! Location !! Quota !! Backup? !! Usage <br />
|-<br />
| Home || /nfs/home/<userid> || 500 GB || Weekly || General use - not large database and not docking runs. <br />
|-<br />
| Work || /nfs/work/<userid> || 500GB* || Never || Docking runs or calculations that produce large output<br />
|- <br />
| DB || /nfs/db/<userid> || 2TB* || Quarterly || Database files, often persistent, rarely modified<br />
|-<br />
| Store || /nfs/store/<userid> || 500GB* || Quarterly || Crystallography files or other archived files, rarely modified.<br />
|-<br />
| Desktop || /home/<userid> || N/A || Never || Copy critical files from desktop to home directory on server.<br />
|-<br />
| Laptop || / || N/A || Never || Copy critical files from laptop to home directory on server.<br />
|}<br />
* * More space may be requested from the [[sysadmin]]s.<br />
<br />
= See Also = <br />
* [[Backups]]<br />
<br />
<br />
<br />
[[Category:Internal]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Disk_space_policy&diff=7884Disk space policy2014-05-30T17:16:37Z<p>Teague Sterling: </p>
<hr />
<div>This is the lab disk space policy and applies to those with ssh access to our cluster only. This policy currently applies to [[cluster 2]]. It will apply to [[cluster 0]] with caveats by June 1, 2014. <br />
We ask you to observe the following disk usage policy for your account on the cluster. We think this policy will cover 90% of the users, 90% of the time. We do understand that research can have unanticipated needs, and we are willing to work with you to create a plan that works for everyone.<br />
<br />
{{TOCright}}<br />
<br />
= Checking Disk Usage =<br />
You can check your disk usage on the cluster NFS drives (excluding /nfs/db) by running the '''quota''' command on any cluster machine.<br />
<br />
Example Output:<br />
<br />
[teague@gimel ~]$ quota -s<br />
Disk quotas for user teague (uid 42001): <br />
Filesystem blocks quota limit grace files quota limit grace<br />
nfs_home:/export/home/<br />
321M 500G 512G 472 0 0 <br />
nfs_work:/export/work/<br />
114M 500G 512G 3 0 0 <br />
nfs_store:/export/store/<br />
4 500G 512G 1 0 0<br />
<br />
= Treatment by disk type =<br />
When you log in to a cluster computer, you are in your home directory. This would normally be /nfs/home/<userid>. Lab tradition is to have a ~/code/ directory for all software and a ~/work/ directory for all of your research. Your home directory is backed up weekly, and has a quota of 500 GB. Keeping the quota at this level will allow us to provide a high performance backup system that runs regularly and completes in a timely fashion.<br />
<br />
== Docking jobs ==<br />
If you require more than your quota for ~/work/, we will create a special directory for you, e.g. /nfs/work/<userid>, which we suggest you symlink to ~/work/. We expect most lab members engaged in multiple docking projects will require space on /nfs/work. /nfs/work will never be backed up. We recommend you repatriate key files from /nfs/work/<userid> to e.g. ~/worksave/ from time to time. We can help you write scripts to automate this. You are also welcome to make your own supplementary backups to a USB drive on your workstation or laptop. We have USB drives you may use for this purpose.<br />
<br />
Historical note: in the past, we attempted to back up nearly every file on every disk, including millions of docking job output files that had only a transient existence. This resulted in slow computers - due to the backup process - and little clarity about what was and was not backed up. By backing up only /nfs/home/<userid> we are forcing you to choose the key input files from which you can re-create your output. You may also wish to use /nfs/store/<userid> to retain important results. But we want to get away from the idea of backing up every file on every disk, particularly those of transient existence and marginal value.<br />
<br />
== Dockable database files ==<br />
If you create dockable database files that persist and are larger than your quota, ask us to create dedicated space for you, e.g. /nfs/db/<userid>. The hallmarks of database files are that they are large, often used by more than one person, and persistent. Database files are backed up quarterly, beginning on the first day of the quarter.<br />
<br />
<br />
[[Category:Policy]]<br />
[[Category:Internal]]<br />
<br />
== Large datasets, completed projects ==<br />
If you have large files from the synchrotron or elsewhere, or a project has finished but you want to keep it online, ask us to create dedicated space for you, e.g. /nfs/store/<userid>. Storage files are backed up quarterly beginning on the first day of the quarter. <br />
<br />
If after using /nfs/db, /nfs/work and /nfs/store to offload from /nfs/home/ you still need more space, please work with the [[sysadmin]]s to help you.<br />
<br />
= Workstation and Laptop =<br />
When you log in to your workstation, you may be in a local directory on the workstation. Our policy is to never back up workstations. You may back up your own workstation, and you may make additional backups of your data on the cluster by writing to USB disks mounted on your desktop or laptop. We offer as a parting gift from the lab two multi-TB drives to which you may copy your files. Make one copy to take with you and one copy for us to keep safe for you. Label them clearly in ink. Then, delete the files from the server as you leave the lab, provided the project has been published.<br />
<br />
= Summary = <br />
We back up home directories weekly, and selected database, crystallography and other archived files quarterly. We do not back up /nfs/work files, and we do not back up workstations or laptops. To back up files from /nfs/work, you must repatriate the key files to your home directory, whence they will be backed up as part of normal procedures. To back up a laptop or desktop, use USB disks, which we can give you.<br />
<br />
Our backup systems cannot keep up with the growth in disk space and the voracious appetite for disk of group members. This is a pragmatic policy that, while requiring a little organization, is more likely to deliver restored files you can actually rely on.<br />
<br />
As always, we welcome [[feedback]].<br />
<br />
{| class="wikitable"<br />
|-<br />
! Name !! Location !! Quota !! Backup? !! Usage <br />
|-<br />
| Home || /nfs/home/<userid> || 500 GB || Weekly || General use - not large database and not docking runs. <br />
|-<br />
| Work || /nfs/work/<userid> || 512GB* || Never || Docking runs or calculations that produce large output<br />
|- <br />
| DB || /nfs/db/<userid> || 2TB* || Quarterly || Database files, often persistent, rarely modified<br />
|-<br />
| Store || /nfs/store/<userid> || 512GB* || Quarterly || Crystallography files or other archived files, rarely modified.<br />
|-<br />
| Desktop || /home/<userid> || N/A || Never || Copy critical files from desktop to home directory on server.<br />
|-<br />
| Laptop || / || N/A || Never || Copy critical files from laptop to home directory on server.<br />
|}<br />
* More space may be requested from the [[sysadmin]]s.<br />
<br />
= See Also = <br />
* [[Backups]]<br />
<br />
<br />
<br />
[[Category:Internal]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Disk_space_policy&diff=7883Disk space policy2014-05-30T17:01:33Z<p>Teague Sterling: /* Checking Disk Usage */</p>
<hr />
<div>This is the lab disk space policy and applies to those with ssh access to our cluster only. This policy currently applies to [[cluster 2]]. It will apply to [[cluster 0]] with caveats by June 1, 2014. <br />
We ask you to observe the following disk usage policy for your account on the cluster. We think this policy will cover 90% of the users, 90% of the time. We do understand that research can have unanticipated needs, and we are willing to work with you to create a plan that works for everyone.<br />
<br />
{{TOCright}}<br />
<br />
= Checking Disk Usage =<br />
You can check your disk usage on the cluster NFS drives (excluding /nfs/db) by running the *quota* command on any cluster machine.<br />
<br />
Example Output:<br />
<br />
[teague@gimel ~]$ quota -s<br />
Disk quotas for user teague (uid 42001): <br />
Filesystem blocks quota limit grace files quota limit grace<br />
nfs_home:/export/home/<br />
321M 500G 512G 472 0 0 <br />
nfs_work:/export/work/<br />
114M 500G 512G 3 0 0 <br />
nfs_store:/export/store/<br />
4 500G 512G 1 0 0<br />
<br />
= Treatment by disk type =<br />
When you log in to a cluster computer, you are in your home directory. This would normally be /nfs/home/<userid>. Lab tradition is to have a ~/code/ directory for all software and a ~/work/ directory for all of your research. Your home directory is backed up weekly, and has a quota of 500 GB. Keeping the quota at this level will allow us to provide a high performance backup system that runs regularly and completes in a timely fashion.<br />
<br />
== Docking jobs ==<br />
If you require more than your quota for ~/work/, we will create a special directory for you, e.g. /nfs/work/<userid>, which we suggest you symlink to ~/work/. We expect most lab members engaged in multiple docking projects will require space on /nfs/work. /nfs/work will never be backed up. We recommend you repatriate key files from /nfs/work/<userid> to e.g. ~/worksave/ from time to time. We can help you write scripts to automate this. You are also welcome to make your own supplementary backups to a USB drive on your workstation or laptop. We have USB drives you may use for this purpose.<br />
<br />
Historical note: in the past, we attempted to back up nearly every file on every disk, including millions of docking job output files that had only a transient existence. This resulted in slow computers - due to the backup process - and little clarity about what was and was not backed up. By backing up only /nfs/home/<userid> we are forcing you to choose the key input files from which you can re-create your output. You may also wish to use /nfs/store/<userid> to retain important results. But we want to get away from the idea of backing up every file on every disk, particularly those of transient existence and marginal value.<br />
<br />
== Dockable database files ==<br />
If you create dockable database files that persist and are larger than your quota, ask us to create dedicated space for you, e.g. /nfs/db/<userid>. The hallmarks of database files are that they are large, often used by more than one person, and persistent. Database files are backed up quarterly, beginning on the first day of the quarter.<br />
<br />
<br />
[[Category:Policy]]<br />
[[Category:Internal]]<br />
<br />
== Large datasets, completed projects ==<br />
If you have large files from the synchrotron or elsewhere, or a project has finished but you want to keep it online, ask us to create dedicated space for you, e.g. /nfs/store/<userid>. Storage files are backed up quarterly beginning on the first day of the quarter. <br />
<br />
If after using /nfs/db, /nfs/work and /nfs/store to offload from /nfs/home/ you still need more space, please work with the [[sysadmin]]s to help you.<br />
<br />
= Workstation and Laptop =<br />
When you log in to your workstation, you may be in a local directory on the workstation. Our policy is to never back up workstations. You may back up your own workstation, and you may make additional backups of your data on the cluster by writing to USB disks mounted on your desktop or laptop. We offer as a parting gift from the lab two multi-TB drives to which you may copy your files. Make one copy to take with you and one copy for us to keep safe for you. Label them clearly in ink. Then, delete the files from the server as you leave the lab, provided the project has been published.<br />
<br />
= Summary = <br />
We back up home directories weekly, and selected database, crystallography and other archived files quarterly. We do not back up /nfs/work files, and we do not back up workstations or laptops. To back up files from /nfs/work, you must repatriate the key files to your home directory, whence they will be backed up as part of normal procedures. To back up a laptop or desktop, use USB disks, which we can give you.<br />
<br />
Our backup systems cannot keep up with the growth in disk space and the voracious appetite for disk of group members. This is a pragmatic policy that, while requiring a little organization, is more likely to deliver restored files you can actually rely on.<br />
<br />
As always, we welcome [[feedback]].<br />
<br />
{| class="wikitable"<br />
|-<br />
! Name !! Location !! Quota !! Backup? !! Usage <br />
|-<br />
| Home || /nfs/home/<userid> || 500 GB || Weekly || General use - not large database and not docking runs. <br />
|-<br />
| Work || /nfs/work/<userid> || 512GB* || Never || Docking runs or calculations that produce large output<br />
|- <br />
| DB || /nfs/db/<userid> || 2TB* || Quarterly || Database files, often persistent, rarely modified<br />
|-<br />
| Store || /nfs/store/<userid> || 512GB* || Quarterly || Crystallography files or other archived files, rarely modified.<br />
|-<br />
| Desktop || /home/<userid> || N/A || Never || Copy critical files from desktop to home directory on server.<br />
|-<br />
| Laptop || / || N/A || Never || Copy critical files from laptop to home directory on server.<br />
|}<br />
* More space may be requested from the [[sysadmin]]s.<br />
<br />
= See Also = <br />
* [[Backups]]<br />
<br />
<br />
<br />
[[Category:Internal]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Disk_space_policy&diff=7882Disk space policy2014-05-30T17:01:02Z<p>Teague Sterling: </p>
<hr />
<div>This is the lab disk space policy and applies to those with ssh access to our cluster only. This policy currently applies to [[cluster 2]]. It will apply to [[cluster 0]] with caveats by June 1, 2014. <br />
We ask you to observe the following disk usage policy for your account on the cluster. We think this policy will cover 90% of the users, 90% of the time. We do understand that research can have unanticipated needs, and we are willing to work with you to create a plan that works for everyone.<br />
<br />
{{TOCright}}<br />
<br />
= Checking Disk Usage =<br />
You can check your disk usage on the cluster NFS drives (excluding /nfs/db) by running the `quota` command on any cluster machine.<br />
<br />
Example Output:<br />
<br />
[teague@gimel ~]$ quota -s<br />
Disk quotas for user teague (uid 42001): <br />
Filesystem blocks quota limit grace files quota limit grace<br />
nfs_home:/export/home/<br />
321M 500G 512G 472 0 0 <br />
nfs_work:/export/work/<br />
114M 500G 512G 3 0 0 <br />
nfs_store:/export/store/<br />
4 500G 512G 1 0 0 <br />
<br />
= Treatment by disk type =<br />
When you log in to a cluster computer, you are in your home directory. This would normally be /nfs/home/<userid>. Lab tradition is to have a ~/code/ directory for all software and a ~/work/ directory for all of your research. Your home directory is backed up weekly, and has a quota of 500 GB. Keeping the quota at this level will allow us to provide a high performance backup system that runs regularly and completes in a timely fashion.<br />
<br />
== Docking jobs ==<br />
If you require more than your quota for ~/work/, we will create a special directory for you, e.g. /nfs/work/<userid>, which we suggest you symlink to ~/work/. We expect most lab members engaged in multiple docking projects will require space on /nfs/work. /nfs/work will never be backed up. We recommend you repatriate key files from /nfs/work/<userid> to e.g. ~/worksave/ from time to time. We can help you write scripts to automate this. You are also welcome to make your own supplementary backups to a USB drive on your workstation or laptop. We have USB drives you may use for this purpose.<br />
<br />
Historical note: in the past, we attempted to back up nearly every file on every disk, including millions of docking job output files that had only a transient existence. This resulted in slow computers - due to the backup process - and little clarity about what was and was not backed up. By backing up only /nfs/home/<userid> we are forcing you to choose the key input files from which you can re-create your output. You may also wish to use /nfs/store/<userid> to retain important results. But we want to get away from the idea of backing up every file on every disk, particularly those of transient existence and marginal value.<br />
<br />
== Dockable database files ==<br />
If you create dockable database files that persist and are larger than your quota, ask us to create dedicated space for you, e.g. /nfs/db/<userid>. The hallmarks of database files are that they are large, often used by more than one person, and persistent. Database files are backed up quarterly, beginning on the first day of the quarter.<br />
<br />
<br />
[[Category:Policy]]<br />
[[Category:Internal]]<br />
<br />
== Large datasets, completed projects ==<br />
If you have large files from the synchrotron or elsewhere, or a project has finished but you want to keep it online, ask us to create dedicated space for you, e.g. /nfs/store/<userid>. Storage files are backed up quarterly beginning on the first day of the quarter. <br />
<br />
If after using /nfs/db, /nfs/work and /nfs/store to offload from /nfs/home/ you still need more space, please work with the [[sysadmin]]s to help you.<br />
<br />
= Workstation and Laptop =<br />
When you log in to your workstation, you may be in a local directory on the workstation. Our policy is to never back up workstations. You may back up your own workstation, and you may make additional backups of your data on the cluster by writing to USB disks mounted on your desktop or laptop. We offer as a parting gift from the lab two multi-TB drives to which you may copy your files. Make one copy to take with you and one copy for us to keep safe for you. Label them clearly in ink. Then, delete the files from the server as you leave the lab, provided the project has been published.<br />
<br />
= Summary = <br />
We back up home directories weekly, and selected database, crystallography and other archived files quarterly. We do not back up /nfs/work files, and we do not back up workstations or laptops. To back up files from /nfs/work, you must repatriate the key files to your home directory, whence they will be backed up as part of normal procedures. To back up a laptop or desktop, use USB disks, which we can give you.<br />
<br />
Our backup systems cannot keep up with the growth in disk space and the voracious appetite for disk of group members. This is a pragmatic policy that, while requiring a little organization, is more likely to deliver restored files you can actually rely on.<br />
<br />
As always, we welcome [[feedback]].<br />
<br />
{| class="wikitable"<br />
|-<br />
! Name !! Location !! Quota !! Backup? !! Usage <br />
|-<br />
| Home || /nfs/home/<userid> || 500 GB || Weekly || General use - not large database and not docking runs. <br />
|-<br />
| Work || /nfs/work/<userid> || 512GB* || Never || Docking runs or calculations that produce large output<br />
|- <br />
| DB || /nfs/db/<userid> || 2TB* || Quarterly || Database files, often persistent, rarely modified<br />
|-<br />
| Store || /nfs/store/<userid> || 512GB* || Quarterly || Crystallography files or other archived files, rarely modified.<br />
|-<br />
| Desktop || /home/<userid> || N/A || Never || Copy critical files from desktop to home directory on server.<br />
|-<br />
| Laptop || / || N/A || Never || Copy critical files from laptop to home directory on server.<br />
|}<br />
* More space may be requested from the [[sysadmin]]s.<br />
<br />
= See Also = <br />
* [[Backups]]<br />
<br />
<br />
<br />
[[Category:Internal]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Disk_space_policy&diff=7880Disk space policy2014-05-29T21:13:10Z<p>Teague Sterling: /* Summary */</p>
<hr />
<div>This is the lab disk space policy and applies to those with ssh access to our cluster only. This policy currently applies to [[cluster 2]]. It will apply to [[cluster 0]] with caveats by June 1, 2014. <br />
We ask you to observe the following disk usage policy for your account on the cluster. We think this policy will cover 90% of the users, 90% of the time. We do understand that research can have unanticipated needs, and we are willing to work with you to create a plan that works for everyone.<br />
<br />
{{TOCright}}<br />
<br />
= Treatment by disk type =<br />
When you log in to a cluster computer, you are in your home directory. This would normally be /nfs/home/<userid>. Lab tradition is to have a ~/code/ directory for all software and a ~/work/ directory for all of your research. Your home directory is backed up weekly, and has a quota of 500 GB. Keeping the quota at this level will allow us to provide a high performance backup system that runs regularly and completes in a timely fashion.<br />
== Docking jobs ==<br />
If you require more than your quota for ~/work/, we will create a special directory for you, e.g. /nfs/work/<userid>, which we suggest you symlink to ~/work/. We expect most lab members engaged in multiple docking projects will require space on /nfs/work. /nfs/work will never be backed up. We recommend you repatriate key files from /nfs/work/<userid> to e.g. ~/worksave/ from time to time. We can help you write scripts to automate this. You are also welcome to make your own supplementary backups to a USB drive on your workstation or laptop. We have USB drives you may use for this purpose.<br />
<br />
Historical note: in the past, we attempted to back up nearly every file on every disk, including millions of docking job output files that had only a transient existence. This resulted in slow computers - due to the backup process - and little clarity about what was and was not backed up. By backing up only /nfs/home/<userid> we are forcing you to choose the key input files from which you can re-create your output. You may also wish to use /nfs/store/<userid> to retain important results. But we want to get away from the idea of backing up every file on every disk, particularly those of transient existence and marginal value.<br />
<br />
== Dockable database files ==<br />
If you create dockable database files that persist and are larger than your quota, ask us to create dedicated space for you, e.g. /nfs/db/<userid>. The hallmarks of database files are that they are large, often used by more than one person, and persistent. Database files are backed up quarterly, beginning on the first day of the quarter.<br />
<br />
<br />
[[Category:Policy]]<br />
[[Category:Internal]]<br />
<br />
== Large datasets, completed projects ==<br />
If you have large files from the synchrotron or elsewhere, or a project has finished but you want to keep it online, ask us to create dedicated space for you, e.g. /nfs/store/<userid>. Storage files are backed up quarterly beginning on the first day of the quarter. <br />
<br />
If after using /nfs/db, /nfs/work and /nfs/store to offload from /nfs/home/ you still need more space, please work with the [[sysadmin]]s to help you.<br />
<br />
= Workstation and Laptop =<br />
When you log in to your workstation, you may be in a local directory on the workstation. Our policy is to never back up workstations. You may back up your own workstation, and you may make additional backups of your data on the cluster by writing to USB disks mounted on your desktop or laptop. We offer as a parting gift from the lab two multi-TB drives to which you may copy your files. Make one copy to take with you and one copy for us to keep safe for you. Label them clearly in ink. Then, delete the files from the server as you leave the lab, provided the project has been published.<br />
<br />
= Summary = <br />
We back up home directories weekly, and selected database, crystallography and other archived files quarterly. We do not back up /nfs/work files, and we do not back up workstations or laptops. To back up files from /nfs/work, you must repatriate the key files to your home directory, whence they will be backed up as part of normal procedures. To back up a laptop or desktop, use USB disks, which we can give you.<br />
<br />
Our backup systems cannot keep up with the growth in disk space and the voracious appetite for disk of group members. This is a pragmatic policy that, while requiring a little organization, is more likely to deliver restored files you can actually rely on.<br />
<br />
As always, we welcome [[feedback]].<br />
<br />
{| class="wikitable"<br />
|-<br />
! Name !! Location !! Quota !! Backup? !! Usage <br />
|-<br />
| Home || /nfs/home/<userid> || 500 GB || Weekly || General use - not large database and not docking runs. <br />
|-<br />
| Work || /nfs/work/<userid> || 512GB* || Never || Docking runs or calculations that produce large output<br />
|- <br />
| DB || /nfs/db/<userid> || 2TB* || Quarterly || Database files, often persistent, rarely modified<br />
|-<br />
| Store || /nfs/store/<userid> || 512GB* || Quarterly || Crystallography files or other archived files, rarely modified.<br />
|-<br />
| Desktop || /home/<userid> || N/A || Never || Copy critical files from desktop to home directory on server.<br />
|-<br />
| Laptop || / || N/A || Never || Copy critical files from laptop to home directory on server.<br />
|}<br />
* More space may be requested from the [[sysadmin]]s.<br />
<br />
= See Also = <br />
* [[Backups]]<br />
<br />
<br />
<br />
[[Category:Internal]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Disk_space_policy&diff=7879Disk space policy2014-05-29T21:12:49Z<p>Teague Sterling: /* Summary */</p>
<hr />
<div>This is the lab disk space policy and applies to those with ssh access to our cluster only. This policy currently applies to [[cluster 2]]. It will apply to [[cluster 0]] with caveats by June 1, 2014. <br />
We ask you to observe the following disk usage policy for your account on the cluster. We think this policy will cover 90% of the users, 90% of the time. We do understand that research can have unanticipated needs, and we are willing to work with you to create a plan that works for everyone.<br />
<br />
{{TOCright}}<br />
<br />
= Treatment by disk type =<br />
When you log in to a cluster computer, you are in your home directory. This would normally be /nfs/home/<userid>. Lab tradition is to have a ~/code/ directory for all software and a ~/work/ directory for all of your research. Your home directory is backed up weekly, and has a quota of 500 GB. Keeping the quota at this level will allow us to provide a high performance backup system that runs regularly and completes in a timely fashion.<br />
== Docking jobs ==<br />
If you require more than your quota for ~/work/, we will create a special directory for you, e.g. /nfs/work/<userid>, which we suggest you symlink to ~/work/. We expect most lab members engaged in multiple docking projects will require space on /nfs/work. /nfs/work will never be backed up. We recommend you repatriate key files from /nfs/work/<userid> to e.g. ~/worksave/ from time to time. We can help you write scripts to automate this. You are also welcome to make your own supplementary backups to a USB drive on your workstation or laptop. We have USB drives you may use for this purpose.<br />
<br />
Historical note: in the past, we attempted to back up nearly every file on every disk, including millions of docking job output files that had only a transient existence. This resulted in slow computers - due to the backup process - and little clarity about what was and was not backed up. By backing up only /nfs/home/<userid> we are forcing you to choose the key input files from which you can re-create your output. You may also wish to use /nfs/store/<userid> to retain important results. But we want to get away from the idea of backing up every file on every disk, particularly those of transient existence and marginal value.<br />
<br />
== Dockable database files ==<br />
If you create dockable database files that persist and are larger than your quota, ask us to create dedicated space for you, e.g. /nfs/db/<userid>. The hallmarks of database files are that they are large, often used by more than one person, and persistent. Database files are backed up quarterly, beginning on the first day of the quarter.<br />
<br />
<br />
[[Category:Policy]]<br />
[[Category:Internal]]<br />
<br />
== Large datasets, completed projects ==<br />
If you have large files from the synchrotron or elsewhere, or a project has finished but you want to keep it online, ask us to create dedicated space for you, e.g. /nfs/store/<userid>. Storage files are backed up quarterly beginning on the first day of the quarter. <br />
<br />
If after using /nfs/db, /nfs/work and /nfs/store to offload from /nfs/home/ you still need more space, please work with the [[sysadmin]]s to help you.<br />
<br />
= Workstation and Laptop =<br />
When you log in to your workstation, you may be in a local directory on the workstation. Our policy is to never back up workstations. You may back up your own workstation, and you may make additional backups of your data on the cluster by writing to USB disks mounted on your desktop or laptop. As a parting gift from the lab, we offer two multi-TB drives to which you may copy your files: make one copy to take with you and one copy for us to keep safe for you. Label them clearly in ink. Then, delete the files from the server as you leave the lab, provided the project has been published.<br />
<br />
= Summary = <br />
We back up home directories weekly, and selected database, crystallography and other archived files quarterly. We do not back up /nfs/work files, workstations, or laptops. To back up files from /nfs/work, repatriate the key files to your home directory, whence they will be backed up as part of normal procedures. To back up a laptop or desktop, use USB disks, which we can give you.<br />
<br />
Our backup systems cannot keep up with the growth in disk space and group members' voracious appetite for disk. This is a pragmatic policy that, while requiring a little organization, is more likely to deliver restored files you can actually rely on.<br />
<br />
As always, we welcome [[feedback]].<br />
<br />
{| class="wikitable"<br />
|-<br />
! Name !! Location !! Quota !! Backup? !! Usage <br />
|-<br />
| Home || /nfs/home/<userid> || 500 GB || Weekly || General use - not large database and not docking runs. <br />
|-<br />
| Work || /nfs/work/<userid> || 512 GB* || Never || Docking runs or calculations that produce large output<br />
|- <br />
| DB || /nfs/db/<userid> || 2 TB* || Quarterly || Database files, often persistent, rarely modified<br />
|-<br />
| Store || /nfs/store/<userid> || 1 TB* || Quarterly || Crystallography files or other archived files, rarely modified.<br />
|-<br />
| Desktop || /home/<userid> || N/A || Never || Copy critical files from desktop to home directory on server.<br />
|-<br />
| Laptop || / || N/A || Never || Copy critical files from laptop to home directory on server.<br />
|}<br />
* More space may be requested from the [[sysadmin]]s.<br />
<br />
= See Also = <br />
* [[Backups]]<br />
<br />
<br />
<br />
[[Category:Internal]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=ZINC_Novelty_Score&diff=5359ZINC Novelty Score2013-02-12T19:08:23Z<p>Teague Sterling: </p>
<hr />
<div>The ZINC Novelty Score (ZNS) is a statistic to express how unusual a molecule is compared to what is in ZINC. It is calculated automatically following a ZINC search in the new interface. The score is calculated as follows: <br />
<br />
ZNS = (1.0 - (Tc(ecfp4) + Tc(path))/2) * 100%<br />
<br />
Thus molecules that are already in ZINC have a Tc of 1.0 and a ZNS of 0%. Molecules that are related to, but distinct from, molecules in ZINC will have small ZNS scores, and molecules approach a ZNS of 100% when they have no features in common with any molecule in ZINC. <br />
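As a worked example, the score can be computed with a short awk sketch. The two Tanimoto coefficients below are made-up inputs for illustration, not real ZINC values:

```shell
# Hypothetical worked example of the ZNS formula.
tc_ecfp4=0.80
tc_path=0.60
zns=$(awk -v a="$tc_ecfp4" -v b="$tc_path" \
      'BEGIN { printf "%.1f", (1.0 - (a + b)/2) * 100 }')
echo "ZNS = ${zns}%"
```

Here the mean Tc is 0.70, giving a ZNS of 30.0%; a molecule identical to one in ZINC (both Tc = 1.0) would give 0.0%.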
<br />
There are five variants:<br />
<br />
* ZNS(target) : The novelty of the compound with respect to known (annotated) compounds for that target. <br />
<br />
* ZNS(target-pattern) : The novelty of the compound with respect to known (annotated) compounds matching a particular target pattern. <br />
<br />
* ZNS(*) : A special case of the above, this statistic asks: how novel is the compound compared to any compound with any ChEMBL annotation (10 uM or better)?<br />
<br />
* ZNS(): Novelty compared to all molecules in ZINC, whether they are commercially available or not. <br />
<br />
* ZPNS(): Commercially available novelty: how novel is this compound compared to what is on the market, as reflected in ZINC? Thus if a molecule is commercially available, its ZPNS() or ZINC Purchasable Novelty Score is 0%. A compound that is known, and has even been for sale in the past, may still have a high ZPNS if nothing like it is currently on the market, as reflected in ZINC. <br />
<br />
<br />
[[Category:Statistics]]<br />
[[Category:ZINC]]</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=Create_decoy_tables&diff=5282Create decoy tables2012-10-24T20:33:29Z<p>Teague Sterling: </p>
<hr />
<div>=== Build on Fastest DB Method ===<br />
These commands are available on zincdb1 with corresponding readme file.<br />
<br />
# mysql zinc_stage -p -e "DROP TABLE IF EXISTS decoyprot_tmp;"<br />
# mysql zinc_stage -p -e "CREATE TABLE decoyprot_tmp (smiles CHAR(255) NOT NULL, sub_id_fk INT(10) UNSIGNED NOT NULL, prot_id INT(10) UNSIGNED NOT NULL, net_charge INT(2) NOT NULL, n_h_donors INT(2) UNSIGNED NOT NULL, n_h_acceptors INT(2) UNSIGNED NOT NULL, rb INT(2) UNSIGNED NOT NULL, mwt DOUBLE NOT NULL, xlogP FLOAT(5,2), UNIQUE prot_id_idx (prot_id), INDEX decoyprot_idx (net_charge, n_h_donors, n_h_acceptors, rb, mwt, xlogP));"<br />
# mysql zinc_stage -p -e "INSERT INTO decoyprot_tmp (prot_id, smiles, sub_id_fk, net_charge, n_h_donors, n_h_acceptors, rb, mwt, xlogP) SELECT prot_id, smiles, p.sub_id_fk, net_charge, n_h_donors, n_h_acceptors, rb, mwt, xlogP FROM zinc8.protomer AS p RIGHT JOIN zinc8.catalog_item AS ci ON p.sub_id_fk=ci.sub_id_fk LEFT JOIN zinc8.catalog AS c ON ci.cat_id_fk=c.cat_id WHERE c.free = 1 AND c.purchasable in (1,2,4,5) ON DUPLICATE KEY UPDATE decoyprot_tmp.prot_id=decoyprot_tmp.prot_id;"<br />
# mysqldump -p<PASS> zinc_stage decoyprot_tmp | gzip -c | ssh zincdb6 "gzip -dc | mysql -p<PASS> zinc_stage"<br />
# '''On ZincDB6:''' mysql zinc8 -p -e "DROP TABLE IF EXISTS decoyprot; RENAME TABLE zinc_stage.decoyprot_tmp TO zinc8.decoyprot;"<br />
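The transfer step streams the dump through gzip and ssh rather than writing an intermediate file. A minimal local stand-in for that pipe structure (temp files here replace mysqldump and ssh, which this sandbox does not assume):

```shell
# Local stand-in for the dump-and-stream pattern: compress on the sending
# side, decompress on the receiving side, nothing written in between.
src=$(mktemp)
dst=$(mktemp)
printf 'INSERT INTO decoyprot_tmp VALUES (...);\n' > "$src"
gzip -c "$src" | gzip -dc > "$dst"
cmp -s "$src" "$dst" && echo "transfer intact"
```

The same shape scales to the real multi-GB table: the dump never touches local disk, only the receiving database.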
<br />
=== Old Manual Migration Method (In Case of Errors): ===<br />
<br />
Create Decoy Tables<br />
<br />
* 1: On zincdb4 (or zincdb6), in a screen session, run ~root/generate_decoyprot.sh (takes many hours)<br />
* 2: Shutdown zincdb4 mysqld<br />
* 3: Copy /var/lib/mysql/zinc8/decoyprot.* to zincdb2, zincdb3 somewhere<br />
* 4: Turn zincdb4 back on (This part may be possible without mysqld shutdown/startup, I'm just doing it for safety)<br />
<br />
On each zincdb1, 4, 6:<br />
* 1: Shutdown mysqld<br />
* 2: move the decoyprot.* files copied above into /var/lib/mysql/zinc8/<br />
* 3: Start mysqld<br />
<br />
[[Category:Internal]]<br />
<br />
<br />
=== March 20, 2010 version from Pascal (unformatted) ===<br />
<br />
Procedure is as follows:<br />
<br />
Run the decoyprot generation script<br />
zincdb4:/root/scripts/generate_decoyprot.sh (wait many hours)<br />
Shutdown mysql on zincdb4<br />
copy /var/lib/mysql/zinc8/decoyprot.* to zincdb2, zincdb3, to a temporary location. The copy currently takes 3-5 minutes<br />
Restart mysql on zincdb4<br />
<br />
On zincdb2, 3:<br />
Stop mysqld<br />
Move the decoyprot.* copies into /var/lib/mysql/zinc8/ (may take a few minutes even though it's a move)<br />
Make sure the copied decoyprot.* files are owned by mysql:mysql<br />
Start mysqld<br />
<br />
Validate that the tables work:<br />
<br />
select count(*) from zinc8.decoyprot;<br />
select * from zinc8.decoyprot limit 10;</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=User_talk:TBalius&diff=5222User talk:TBalius2012-10-12T23:23:52Z<p>Teague Sterling: Welcome!</p>
<hr />
<div>'''Welcome to ''DISI''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:Teague Sterling|Teague Sterling]] ([[User talk:Teague Sterling|talk]]) 19:23, 12 October 2012 (EDT)</div>Teague Sterlinghttp://wiki.docking.org/index.php?title=User:TBalius&diff=5221User:TBalius2012-10-12T23:23:52Z<p>Teague Sterling: Creating user page for new user.</p>
<hr />
<div>Trent Balius is a postdoc in the Shoichet Lab.</div>Teague Sterling