Sysadmin idioms
[[Category:Sysadmin]]
[[Category:Idioms]]
Revision as of 17:33, 17 August 2017
= Cluster 2 =
See the progress of a RAID rebuild, e.g. on aleph:
 cat /proc/mdstat
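During a rebuild, the interesting line in <code>/proc/mdstat</code> is the one showing recovery/resync progress. A minimal sketch of filtering for it, run here against a made-up snapshot (the sample file and its numbers are invented; on a live box you would grep <code>/proc/mdstat</code> directly):

```shell
# Hypothetical snapshot of /proc/mdstat mid-rebuild (sample data, not from aleph)
cat > /tmp/mdstat.sample <<'EOF'
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/1] [U_]
      [===>.................]  recovery = 17.3% (169000000/976630336) finish=80.2min speed=167000K/sec
EOF

# Pull out just the progress line; on a real host: grep -E 'recovery|resync' /proc/mdstat
grep -E 'recovery|resync' /tmp/mdstat.sample
```

<code>watch -n 5 cat /proc/mdstat</code> refreshes the full view every five seconds if you want to babysit the rebuild.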
Fire up a VM (per sarah/matt):
 ssh to he as s_xxx
 sudo virsh vncdisplay phi    # shows the VNC port phi is running on (VNC port 0)
 sudo virsh edit phi          # open phi's config
 # search for passwd ( /passwd<ENTER> )
 # copy down the VNC password
 # :q!                        # exit vim
 exit                         # exit virsh
 exit                         # log out of he
 vncviewer he:<VNCPORT>       # e.g. vncviewer he:0
Enter the password, log in, and restart sshd, mysql, iptables, and network (if it can't ping).
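One detail worth remembering when connecting: <code>virsh vncdisplay</code> reports a display number, and by VNC convention display <code>:N</code> listens on TCP port 5900+N, so <code>vncviewer he:0</code> is really talking to port 5900. A tiny sketch of the arithmetic (the display number here is just the example value from above):

```shell
# VNC display :N listens on TCP port 5900+N (standard VNC convention)
display=0                       # e.g. from: sudo virsh vncdisplay phi  -> ":0"
port=$((5900 + display))
echo "$port"                    # the raw TCP port, if you ever need it for a tunnel
```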
= Cluster 0 =
Disc space panic (cluster 0):
 sudo /usr/sbin/repquota /raid1 | sort -nrk3 | head
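The <code>sort -nrk3 | head</code> part orders users by the third column (blocks used) in descending numeric order, so the worst offenders surface first. A self-contained sketch with made-up rows in repquota's user/flags/blocks/soft/hard layout (all names and numbers below are invented):

```shell
# Made-up rows in repquota's layout: user, flags, blocks-used, soft, hard
cat > /tmp/quota.sample <<'EOF'
adler  -- 120000 500000 600000
teague -- 980000 500000 600000
sarah  --  45000 500000 600000
EOF

# Numeric (-n) reverse (-r) sort on field 3 (-k3): largest consumer first
sort -nrk3 /tmp/quota.sample | head -n 1
```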
Save time as sysadmin C0
Use ~teague/Scripts/sshnodes.py to call ~teague/batch/mount-diva2.sh.
= So obvious as to no longer be worthy of being documented =
Clear errors on jobs:
 qstat -u adler | grep Eqw | cut -f 1 -d ' ' | xargs qmod -cj
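The pipeline greps for jobs in the Eqw (error) state, cuts the first space-delimited field (the job ID), and hands the IDs to <code>qmod -cj</code>. A sketch against made-up <code>qstat</code> rows (real qstat output may have leading whitespace, in which case the job ID is not the first <code>cut</code> field and the command needs adjusting):

```shell
# Made-up `qstat -u adler` rows: job-ID, prio, name, user, state, date
cat > /tmp/qstat.sample <<'EOF'
101 0.55 dock_a adler Eqw 08/17/2017
102 0.55 dock_b adler r   08/17/2017
103 0.55 dock_c adler Eqw 08/17/2017
EOF

# Job IDs of errored jobs; on the cluster these feed: | xargs qmod -cj
grep Eqw /tmp/qstat.sample | cut -f 1 -d ' '
```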
Start/restart ZINC15
 source env.csh
 zincserver.restart-backend.sh
Queue stuck? Try:
 qmod -c *lamed
 qmod -e *@lamed
Clean away old scratch files on nodes before your job starts (adler as the example user):
 find /scratch/adler -mindepth 1 -mtime +3 -exec rm -rvf {} \;
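Before running the <code>rm</code> variant it is worth previewing what matches: swap <code>-exec rm -rvf {} \;</code> for <code>-print</code>. A sketch against a throwaway directory (GNU <code>touch -d</code> is assumed for backdating; the paths and filenames are invented for the demo):

```shell
# Build a throwaway scratch dir with one stale and one fresh file
mkdir -p /tmp/scratch-demo/adler
touch -d '5 days ago' /tmp/scratch-demo/adler/stale.dat   # backdate (GNU touch)
touch /tmp/scratch-demo/adler/fresh.dat

# Dry run: same age filter as the idiom above, -print instead of -exec rm
# -mtime +3 matches entries last modified more than 3 whole days ago
find /tmp/scratch-demo/adler -mindepth 1 -mtime +3 -print
```

Only the stale file should be listed; once the output looks right, substitute the <code>-exec rm</code> form.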
Restart ZINC15
 cd /nfs/soft/www/apps/zinc15/zinc15-env/lib/python2.7/site-packages/zinc/data/models
 source /nfs/soft/www/apps/zinc15/zinc15-env/env.csh
 zincserver.restart-backend.sh
 zincserver.start-backend.sh
 killall -9 gunicorn