Cluster 2

Revision as of 14:38, 23 April 2014

This page describes our new cluster at UCSF. The physical equipment in Cluster 0 will be subsumed into this cluster once Cluster 2 replicates all of Cluster 0's functions; we expect this to happen later in 2014.

Priorities and Policies

Equipment, names, roles

  • 512 cpu-cores for queued jobs and 128 cpu-cores for infrastructure, databases, management, and ad hoc jobs. 128 TB of high-quality disk, 32 TB of other disk
  • We expect this to grow to over 1200 cpu-cores and 200 TB in 2014 once Cluster 0 is merged with Cluster 2
  • Our policy is to have 4 GB RAM per cpu-core unless otherwise specified.
  • The Hebrew alphabet is used for physical machines, Greek for VMs. Functions (e.g. sgehead) are aliases (CNAMEs); see the lookup sketch after this list.
  • Cluster 2 is currently housed entirely in Rack 0, which is in Row 0, Position 4 of BH101 at 1700 4th St (Byers Hall). More racks will be added by July.
  • Central services are on aleph (an HP DL160G5) and bet (an HP xxxx).
  • CPU
    • 3 Silicon Mechanics Rackform nServ A4412.v4 units, each comprising 4 computers with 32 cpu-cores apiece, for a total of 384 cpu-cores.
    • 1 Dell C6145 with 128 cpu-cores.
    • An HP DL165G7 (24-way) serves as sgehead.
  • DISK
    • HP disks - new in 2014 - 40 TB RAID6 SAS
    • Silicon Mechanics NAS - new in 2014 - 76 TB RAID6 SAS
    • An HP DL160G5 and an MSA60 with 12 TB SAS - new in 2014.
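
The naming convention above is plain DNS: functional names are CNAMEs that point at the Hebrew- or Greek-named hosts, so resolving one shows which machine currently provides that function. A minimal lookup sketch in Python, assuming the .ucsf.bkslab.org names on this page resolve from wherever you run it; the canonical hosts returned are whatever DNS currently says, not something fixed here:

    import socket

    # Functional aliases (CNAMEs) mentioned on this page; the physical or VM
    # hosts they point to (Hebrew/Greek letter names) may change over time.
    for alias in ["sgehead.ucsf.bkslab.org", "portal.ucsf.bkslab.org"]:
        try:
            canonical, _aliases, addresses = socket.gethostbyname_ex(alias)
            print(alias, "->", canonical, addresses)
        except socket.gaierror as err:
            print(alias, "lookup failed:", err)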

Disk organization

  • shin (aka nas1), mounted as /nfs/db/ - 72 TB SAS RAID6
  • bet (aka happy), internal: /nfs/store and psql (temporary) - 10 TB SATA RAID10
  • elated on happy: /nfs/work only - 36 TB SAS RAID6
  • het (43), aka the former vmware2, with an MSA60, exports /nfs/home and /nfs/soft
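
On a cluster node the paths above appear as ordinary NFS mount points. A small sketch, assuming you run it on a machine that mounts them exactly at the paths listed; the server attributions come from the list above and are not re-verified by the code:

    import os

    # Mount points from the disk-organization list above, with the host
    # described as serving each one. Run on a cluster node to see which
    # of them are actually mounted there.
    mounts = {
        "/nfs/db": "shin (nas1)",
        "/nfs/store": "bet (happy)",
        "/nfs/work": "happy",
        "/nfs/home": "het (former vmware2, MSA60)",
        "/nfs/soft": "het (former vmware2, MSA60)",
    }
    for path, server in mounts.items():
        state = "mounted" if os.path.ismount(path) else "not mounted"
        print("%-11s %-30s %s" % (path, server, state))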

Special-purpose machines - all in .ucsf.bkslab.org

  • sgehead - we recommend you use this, in addition to your desktop, for most purposes, including launching jobs on the cluster (see the submission sketch after this list).
  • pgf - Fortran compiler
  • portal - secure access from outside
  • ppilot - Pipeline Pilot
  • shin, bet, and dalet are the three NFS servers. You should not need to log in.
  • mysql1 - general-purpose MySQL server (like the former scratch)
  • pg1 - general-purpose PostgreSQL server
  • fprint - fingerprinting server
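
Because sgehead is the recommended host for launching queued jobs, a minimal submission sketch follows. It assumes the queueing system behind sgehead is a Grid Engine variant (suggested by the host name, not stated explicitly on this page) and uses a hypothetical script name; run it from a login shell on sgehead:

    import subprocess
    import textwrap

    # Hypothetical one-line test job; replace with your real work and add
    # resource requests as needed. The #$ lines are Grid Engine directives.
    job = textwrap.dedent("""\
        #!/bin/sh
        #$ -S /bin/sh
        #$ -cwd
        #$ -j y
        hostname
    """)
    with open("hello_cluster.sh", "w") as fh:
        fh.write(job)

    # qsub prints the assigned job id; follow progress with qstat.
    result = subprocess.run(["qsub", "hello_cluster.sh"],
                            capture_output=True, text=True)
    print(result.stdout or result.stderr)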

About our cluster