So you want to set up a lab


Here is how I would set up a computational lab, one that could join the Worldwide ZINC network.

Requirements and assumptions

This page describes setting up a full computational pharmacology research lab. We describe the minimum setup, plus options for expansion. The entry level assumes you can spend $20K on servers and $2K on a workstation. You'll need an acoustically insulated, air-conditioned room. It would help to have someone with system administration experience for about 100 hours (roughly 3 weeks) for the initial setup, and then 1 day per month for maintenance.

When you have all the software and hardware, you can get a docking lab up and running at a basic level in less than a day. If you do not have all the software and hardware, it will take longer. If you've never run a molecular docking lab, allow a week to learn the basics. Once you have a basic installation running, plan for a few days to get a queuing system working the way you like it. We may be able to help. Ask us, but please read this wiki first.

Physical level

We describe an entry level system that can be easily expanded.


Here is our recommendation of how to build a computer cluster that will work well for molecular docking and cheminformatics. This document can be used whether you already own hardware, or whether you are planning to buy new. All recommendations are the best we know as of Feb 2014. Things do change, but this advice should be ok through 2015.

About buying CPU and disk

We are currently buying CPU from Silicon Mechanics and Dell, and disk from Silicon Mechanics and HP. We recommend buying two different kinds of machines: head nodes, to which disk enclosures may be attached, and CPU nodes, which contain large numbers of cores. We are currently buying enclosures holding 12 SAS disks of 4 TB each, for 48 TB raw, at around $8000, or about 6 raw GB per dollar. Formatted as RAID6, this works out to 36 TB, or 4.5 formatted GB per dollar. We like the HP P822 high-performance RAID controller.

Compare this to what we were paying just a year ago in spring 2013: 25 TB for $10,000, or 2.5 GB per dollar unformatted. An amazing development in the last 12 months.
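The price comparison above is simple arithmetic, and a short sketch makes the units explicit. It uses only the capacities and prices quoted on this page and assumes decimal units (1 TB = 1000 GB); the function name `gb_per_dollar` is ours, for illustration only.

```python
# Storage price arithmetic using the figures quoted above.
# Assumption: decimal units, 1 TB = 1000 GB.

def gb_per_dollar(capacity_tb: float, price_usd: float) -> float:
    """Return capacity per dollar in GB/$ for a capacity given in TB."""
    return capacity_tb * 1000 / price_usd

raw_2014 = gb_per_dollar(48, 8000)        # 12 disks x 4 TB, raw
formatted_2014 = gb_per_dollar(36, 8000)  # same enclosure, RAID6 formatted
raw_2013 = gb_per_dollar(25, 10000)       # spring 2013 purchase, unformatted
improvement = raw_2014 / raw_2013         # year-over-year change in raw GB/$

print(f"2014 raw:        {raw_2014:.1f} GB/$")
print(f"2014 formatted:  {formatted_2014:.1f} GB/$")
print(f"2013 raw:        {raw_2013:.1f} GB/$")
print(f"raw improvement: {improvement:.1f}x")
```

Swapping in a current quote for capacity and price gives a quick way to compare vendors on the same footing.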

For CPU, we like the C6145 from Dell. For around $20,000 you get two machines in a 2U chassis, each with 64 cores, 256 GB of memory, and a pair of RAID-1 disks. A single 42U rack could hold 2560 cores and still have room for a switch. Of course, this would cost you $400,000, pull 28 kW and need 5 T of cooling. Amazing.
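The rack figures above can be checked with a few lines of arithmetic. This is a rough sketch using the per-chassis numbers quoted on this page; the 2U reserved for the switch is our assumption, chosen because it is what remains after 2560 cores are installed.

```python
# Rack capacity arithmetic for the 2U, two-machine chassis described above.
RACK_UNITS = 42
CHASSIS_UNITS = 2
MACHINES_PER_CHASSIS = 2
CORES_PER_MACHINE = 64
MEMORY_GB_PER_MACHINE = 256
PRICE_PER_CHASSIS_USD = 20_000
SWITCH_UNITS = 2  # assumption: space left free for a switch

chassis_count = (RACK_UNITS - SWITCH_UNITS) // CHASSIS_UNITS
machine_count = chassis_count * MACHINES_PER_CHASSIS
total_cores = machine_count * CORES_PER_MACHINE
total_memory_gb = machine_count * MEMORY_GB_PER_MACHINE
total_cost_usd = chassis_count * PRICE_PER_CHASSIS_USD
cost_per_core_usd = total_cost_usd / total_cores

print(f"chassis per rack: {chassis_count}")
print(f"cores per rack:   {total_cores}")
print(f"memory per rack:  {total_memory_gb} GB")
print(f"cost per rack:    ${total_cost_usd:,}")
print(f"cost per core:    ${cost_per_core_usd:.2f}")
```

The cost per core is a handy number to carry into comparisons with other node types or with cloud pricing.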

About setting up a network

Set up two networks: one public and one private. A managed switch with VPN support costs about $250.

Central services setup

To begin, you will either need 6 computers to host the central services, or you will need a hypervisor to host 6 VMs, or some mixture of the above. We recommend the hypervisor if you can bear it and the 6 physical computers if you can afford it (space, energy, money).

Hypervisor

We use (xxx), but any hypervisor should do, including VirtualBox and VMware, among many others.

Foreman

Foreman is the node creation and provisioning server. Here is how to set one up:

Authentication server

We use 389 Directory Server, but other authentication systems will work fine, including Kerberos.

Fileservers and NFS

We use XFS over NFS. We tend to hang several enclosures off a head node. We recommend SAS, which has finally come down in price, and RAID6 formatting. We tend to use enclosures that host 12 disks of 4TB each.
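For planning purposes, it helps to estimate how much usable space a RAID6 enclosure provides. The sketch below assumes the standard RAID6 layout, in which two disks' worth of capacity goes to parity. It also shows the decimal TB to binary TiB conversion, which is one plausible reason (our assumption, not something stated on this page) why a 12 x 4 TB enclosure shows up as roughly 36 TB once formatted; filesystem overhead also plays a part. The function names are ours.

```python
# Estimate usable capacity of a RAID6 array.
# Assumption: two disks' worth of space is consumed by parity.

def raid6_usable_tb(disks: int, disk_tb: float) -> float:
    """Usable capacity in decimal TB for a RAID6 array of equal disks."""
    if disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    return (disks - 2) * disk_tb

def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 1e12 / 2**40

usable_tb = raid6_usable_tb(12, 4)   # the enclosure size we tend to use
usable_tib = tb_to_tib(usable_tb)    # as many OS tools would report it

print(f"usable: {usable_tb:.0f} TB, or about {usable_tib:.1f} TiB")
```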

Portal and Security

We recommend setting up a portal and blocking all inbound access to all other computers. Use two portals at distinct geographical locations for added robustness.

CPU nodes

We like machines with 4 chips of 16 cores each (AMD), for 64 cores per machine. Two of these machines fit in a 2U enclosure, and each takes 256 GB of RAM. We recommend four 480 GB SSDs in a 2x2 RAID10 configuration if you can afford it.

Queuing system

We recommend a free version of Sun Grid Engine (SGE).
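Docking workloads usually split into many independent chunks, which maps naturally onto an SGE array job. As a minimal sketch, the helper below only builds the text of a job script; it does not submit anything. The job name `dock_demo` and the per-chunk command `./run_one_chunk.sh` are hypothetical placeholders, not part of any docking.org software.

```python
# Build the text of an SGE array-job script (no submission is performed).
# The job name and per-chunk command used below are hypothetical.

def make_array_job(name: str, n_tasks: int, command: str) -> str:
    """Return an SGE job script that runs `command` once per task ID."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    lines = [
        "#!/bin/bash",
        "#$ -S /bin/bash",        # shell used to run the job
        f"#$ -N {name}",          # job name shown in qstat
        "#$ -cwd",                # run in the submission directory
        f"#$ -t 1-{n_tasks}",     # array job: task IDs 1..n_tasks
        "#$ -j y",                # merge stderr into stdout
        f'{command} "$SGE_TASK_ID"',  # each task receives its own ID
    ]
    return "\n".join(lines) + "\n"

script = make_array_job("dock_demo", 100, "./run_one_chunk.sh")
print(script)
```

The printed text can be saved to a file and submitted with `qsub`. How many tasks run at once is then governed by your queue configuration rather than by the script.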

Maintaining your cluster

  • one
  • two

Operating System level

  • O/S: We recommend CentOS 6.3.
  • Foreman
  • DNS
  • Authentication
  • Perimeter security
  • SGE

Set up a database server

Add a new node to the cluster

How to spin up a new virtual machine

Add new disk to the cluster

Configure new disk

Deploy a workstation

Workstation Install

Middleware level

Set up middleware for a computational drug discovery lab.

Set up psql/rdkit from scratch

Server Application Software

Install DOCK 3.7

Install SEA

Install DOCK 6

Install ZINC

Install DOCK Blaster

ZINC

Here is how to set up ZINC from scratch on a new cluster.

  • create database
  • install software and web interface
  • test


Using git and github

Client Application Software

PyMol

Chimera

Knime

Marvin

InstantJChem

Operations

Monthly maintenance tasks.

Backups

Security Review

Software upgrades

CCP4

Chimera

Acquire and install docking.org software

dockenv

Configure docking.org software