How to Install a Desktop on Cluster 2

Latest revision as of 23:37, 19 November 2018

Installation

CentOS 6

Install CentOS, then run the following commands as root:

wget http://yum.ucsf.bkslab.org/SETUP/desktop.sh

Change the hostname to the desired hostname, then run:

sh desktop.sh

Answer yes if the hostname is correct, and answer yes to add an ifcfg network-scripts file.

Once it’s done, DON’T HIT ENTER.
Enter the information into Foreman (new machine, etc.).
Then click on Infrastructure -> Smart Proxies.
Then click on certificates -> autosign entries -> new.
Enter the host (e.g. mia.desktop.ucsf.bkslab.org).
Then, on the desktop, hit Enter.
You should see the cert get signed.

If you add a repo or something to the machine in Foreman, it will update automatically (eventually), but if you need it done right away, on the desktop type:

puppet agent -t


Important Note: The desktop.sh script uses a deprecated path for the repodata. Because of that path, yum reports a PYCURL error for scl.repo. To fix it, open /etc/yum.repos.d/scl.repo with vim and change the baseurl line to:

baseurl=http://yum/centos/6/sclo/$basearch/sclo/
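
The same edit can be scripted. This is only a minimal sketch: the function name fix_scl_baseurl is hypothetical, and it assumes scl.repo holds a single repository section, since it rewrites every baseurl line in the file. If the file defines several repositories, edit it by hand instead.

```shell
#!/bin/sh
# Hypothetical helper: rewrite every baseurl= line in the given repo file.
# Assumption: the file contains only the one SCL repository section.
fix_scl_baseurl() {
    repo_file="$1"
    # Single quotes keep $basearch literal so that yum, not the shell, expands it.
    new_url='http://yum/centos/6/sclo/$basearch/sclo/'
    # -i.bak edits in place and keeps a backup copy next to the file.
    sed -i.bak "s|^baseurl=.*|baseurl=${new_url}|" "$repo_file"
}

# Usage (as root): fix_scl_baseurl /etc/yum.repos.d/scl.repo
```

The backup is written as scl.repo.bak in the same directory, so the original can be restored if the new path turns out to be wrong for a given machine.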

Most GPU machines will require the cuda and nvidia packages provided by Foreman. Sometimes the Puppet scripts do not provide the correct packages/drivers (possibly an error in the Puppet script). If so, you must install them manually (see GPU Issues in 'Troubleshooting installation issues' below).

CentOS 7

This section is a work in progress.

Get the CentOS 7 ISO from: http://isoredirect.centos.org/centos/7/isos/x86_64/CentOS-7-x86_64-DVD-1804.iso

On newer desktops, we have to turn off Secure Boot in the BIOS. There's a possibility the wired NIC won't show up. Once the install is complete, update the kernel. Updated kernels should be able to take advantage of the newer motherboard's features.

Packages that CentOS 6 had and that we will likely need:

zsh diffutils curl wget dos2unix unix2dos screen tmux vim-enhanced emacs fuse glusterfs autoconf automake binutils bison flex gcc gcc-c++ gettext libtool make patch pkgconfig redhat-rpm-config rpm-build byacc cscope ctags cvs diffstat doxygen elfutils gcc-gfortran git indent intltool patchutils rcs subversion swig systemtap ElectricFence ant babel bzr chrpath cmake cmake3 compat-gcc-34 compat-gcc-34-c++ compat-gcc-34-g77 cvs-inetd dejagnu expect gcc-gnat gcc-java gcc-objc gcc-objc++ imake jpackage-utils kdewebdev ksc libstdc++-docs mercurial mod_dav_svn nasm perltidy python-docs rpmdevtools rpmlint systemtap-sdt-devel systemtap-server

Libraries:

glibc glibc-devel glibc-static libffi libffi-devel openssl openssl-devel openssl-static libstdc++ libstdc++-devel libXScrnSaver

(compatibility to run DOCK): compat-db47

Parallelism:

openmpi mpich-3.2 mpich-3.2-devel

Compression: zlib zlib-devel zlib-static libzip libzip-devel bzip2-libs bzip2-devel p7zip p7zip-plugins

Math Libs: atlas atlas-devel atlas-sse2 atlas-sse3 blas blas-devel lapack lapack-devel hdf5 hdf5-devel hdf5-static hdf5-openmpi hdf5-openmpi-devel hdf5-openmpi-static eigen3-devel

GUI: freetype freetype-devel libpng libpng-devel libpng-static tk tk-devel tcl tcl-devel cairo cairo-devel wxGTK-devel wxGTK3-devel libotf

DB Libraries: mysql mysql-devel sqlite sqlite-devel postgresql postgresql-devel libdb4 libdb4-devel libdb4-utils

Dependencies: boost boost-python boost-regex boost-devel openbabel freeglut freeglut-devel

Languages: R R-devel R-core R-core-devel R-java R-java-devel python-lockfile

XML: libxml2 libxml2-devel

Auth: sssd sssd-client nfs-utils nss-pam-ldapd

Printers: hplip hplip-gui hpijs
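
To see which packages from the lists above are still absent after an install, rpm can be queried per package. The sketch below is only an illustration: missing_packages is a hypothetical name, and the package names in the usage line are taken from the Auth list as an example.

```shell
#!/bin/sh
# Hypothetical helper: print each named package that rpm reports as not installed.
missing_packages() {
    for pkg in "$@"; do
        # rpm -q exits non-zero when the package is not installed.
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Usage:
#   missing_packages sssd sssd-client nfs-utils nss-pam-ldapd
```

The output can be passed to yum install by hand. Some names in the lists may not exist in the CentOS 7 repositories, so a package reported as missing is not necessarily installable.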


Repos to install:

# Install repos
yum install -y epel-release
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm


Files to create: /etc/resolv.conf with the following contents:

search desktop.ucsf.bkslab.org compbio.ucsf.edu ucsf.bkslab.org bkslab.org ucsf.edu
nameserver <alpha>
nameserver 128.218.254.10
nameserver 128.218.254.40
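
Writing this file can be scripted. In the sketch below, write_resolv_conf is a hypothetical name, and the address for the <alpha> placeholder is left as a parameter because this page does not give it.

```shell
#!/bin/sh
# Hypothetical helper: write the resolver configuration shown above.
# $1 = output path
# $2 = address of the first nameserver (the <alpha> placeholder; not specified here)
write_resolv_conf() {
    out="$1"
    alpha="$2"
    cat > "$out" <<EOF
search desktop.ucsf.bkslab.org compbio.ucsf.edu ucsf.bkslab.org bkslab.org ucsf.edu
nameserver ${alpha}
nameserver 128.218.254.10
nameserver 128.218.254.40
EOF
}

# Usage (as root), substituting the real address for <alpha>:
#   write_resolv_conf /etc/resolv.conf <alpha>
```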

Packages:
autofs
nvidia-detect (elrepo)
kmod-nvidia (elrepo)
openbabel (epel)
libcurl-devel sqlite-devel (base) (for onenote)


Services to start:
systemctl start oddjobd (to enable homedir creation when a normal user logs in)
autofs

Packages to install:
libXScrnSaver - so nfs-soft chimera can run
python36 - for modern python!
python36-devel

autofs:
auto.master: Create a /nfs directory filled with symlinks that point to the /mnt/nfs/* mounts.

/etc/auto.master
/etc/auto.bks
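
The /nfs directory of symlinks could be built with a small helper like the one below. This is a sketch under assumptions: link_nfs_mounts is a hypothetical name, and the mount names are passed explicitly because autofs mount points may not appear in a directory listing until they are first accessed.

```shell
#!/bin/sh
# Hypothetical helper: for each mount name, create <link_dir>/<name> -> <mount_root>/<name>.
# $1 = mount root (e.g. /mnt/nfs), $2 = link directory (e.g. /nfs), rest = mount names
link_nfs_mounts() {
    mount_root="$1"
    link_dir="$2"
    shift 2
    mkdir -p "$link_dir"
    for name in "$@"; do
        # -s symbolic link, -f replace an existing link, -n do not descend into an existing link
        ln -sfn "$mount_root/$name" "$link_dir/$name"
    done
}

# Usage (as root; "soft" and "home" are placeholder mount names):
#   link_nfs_mounts /mnt/nfs /nfs soft home
```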

Testing to be done:
run software from nfs-soft (works: chimera)
autofs
create script to do all the above

Issues to fix: authentication/LDAP. /var/log/secure login entries show that pam_unix(gdm-password:auth) fails. pam_sss(gdm-password:auth) succeeds.

sudo authconfig --useshadow --passalgo=sha256 --enablelocauthorize --enablesssd --enablesssdauth --enablecache --enablepamaccess --enableldap --enableldapauth --enableldaptls --ldapserver="ldaps://ds.ucsf.bkslab.org" --ldapbasedn="dc=bkslab,dc=org" --disablekrb5 --enablemkhomedir --update


LDAP+TLS will not run unless you have a CA certificate in /etc/openldap/cacerts.
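
A quick way to confirm that a certificate is in place before debugging logins is to check that the directory is not empty. The helper name has_ca_cert is hypothetical. The check only looks for regular files and does not validate that they are usable CA certificates.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the directory exists and contains at least one regular file.
has_ca_cert() {
    cert_dir="$1"
    [ -d "$cert_dir" ] || return 1
    # find lists regular files directly inside the directory; head keeps the first one.
    # An empty result means no certificate file is present.
    [ -n "$(find "$cert_dir" -maxdepth 1 -type f | head -n 1)" ]
}

# Usage:
#   has_ca_cert /etc/openldap/cacerts || echo "no CA certificate found"
```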

Troubleshooting installation issues

Kernel Panic During Installation/Boot on CentOS 6.8 USB Stick

A few older computers have an outdated BIOS and don't support installation from a USB stick. You can work around this by installing CentOS from a CD, or you can update the BIOS.

Exception Error During Installation

When installing CentOS from a thumb drive, the installation can fail with an unexpected exception error. The usual cause is LUKS encryption on the hard drive. In this case, the hard drive must be wiped prior to installation.

You can tell if your computer has LUKS encryption by running the lsblk command:

sh-4.1$ lsblk
NAME                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sr0                            11:0    1  1024M  0 rom   
sda                             8:0    0 149.1G  0 disk  
├─sda1                          8:1    0   500M  0 part  /boot
└─sda2                          8:2    0 148.6G  0 part  
 └─luks-1b6a5d85-ebb8-4e0f-bb90-911fbdf73956 (dm-0)
                             253:0    0 148.6G  0 crypt 
   ├─vg_lmfao-lv_root (dm-1) 253:1    0  59.8G  0 lvm   /
   ├─vg_lmfao-lv_swap (dm-2) 253:2    0   3.9G  0 lvm   [SWAP]     
   └─vg_lmfao-lv_home (dm-3) 253:3    0    85G  0 lvm   /home
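
The same check can be done without reading the tree by eye. The sketch below filters lsblk output for the crypt type shown in the TYPE column above. The name list_crypt_devices is hypothetical, and the sketch assumes the installed lsblk supports the -r, -n and -o options.

```shell
#!/bin/sh
# Hypothetical helper: read lsblk "NAME TYPE" lines on stdin and print the name
# of every device whose last field (the TYPE column) is "crypt".
list_crypt_devices() {
    awk '$NF == "crypt" { print $1 }'
}

# Usage: -r raw output without the tree drawing, -n no header, -o chosen columns
#   lsblk -rno NAME,TYPE | list_crypt_devices
```

No output means lsblk reported no crypt devices. The crypt type appears in the listing above for an opened volume, so a locked LUKS partition may not show up this way.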

To completely wipe a drive, boot from a thumb drive with the CentOS 6.8 LiveCD installed on it. This will allow you to execute commands on the hard drive that has a pre-existing OS on it. Once you have access to the terminal, become root and run this command on the hard drive you want to wipe:

[root@livecd centoslive]# dd if=/dev/zero of=/dev/<id_of_harddrive>

This will zero out all the data on the hard drive, completely wiping it. Use the lsblk command to discover the name of the drive you want to wipe. In the lsblk output above, the drive to wipe was my 160GB SSD named sda, so I executed the command:

[root@livecd centoslive]# dd if=/dev/zero of=/dev/sda

Authentication Failures when Logging in

Make sure the required yum packages are installed: sssd and nss-pam-ldapd. These two packages are getting excluded from the puppet runs for reasons I don't know yet. I'll look into it. For now, check with:

yum list sssd nss-pam-ldapd

If they are not installed, install them and run puppet again with:

puppet agent --test

Once done, the final step is to start/restart ldap. Do:

authconfig-tui

All the configurations should be in there already because puppet wrote the config files for ldap. Select next then okay to start/restart sssd with the ldap configurations.

NFS is Not Mounting Properly

Check that yum has installed the nfs-utils package with:

yum list nfs-utils

This package is necessary for nfs mounts. Install it if it is not there. This is another bizarre puppet hole. Will look into it.

Also make sure the rpcbind service is running.

service rpcbind status

If not, on CentOS 6:

service rpcbind start
chkconfig rpcbind on

or, on CentOS 7:

systemctl start rpcbind
systemctl enable rpcbind
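
Since the commands differ between CentOS 6 and CentOS 7, a wrapper can pick the right pair. This is a sketch only: start_and_enable is a hypothetical name, and it assumes that the presence of systemctl means the machine uses systemd.

```shell
#!/bin/sh
# Hypothetical helper: start a service now and enable it at boot.
# Uses systemctl when it is available (CentOS 7), otherwise service/chkconfig (CentOS 6).
start_and_enable() {
    svc="$1"
    if command -v systemctl >/dev/null 2>&1; then
        systemctl start "$svc" && systemctl enable "$svc"
    else
        service "$svc" start && chkconfig "$svc" on
    fi
}

# Usage (as root): start_and_enable rpcbind
```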

GPU Issues

Certain desktops have different GPUs, and the drivers that Puppet provides often do not work on a particular GPU. For example, the GeForce 9800 GT is incompatible with Nvidia driver 367.48 (which Puppet can sometimes install). If you do need to install manually, remove packages::nvidia from the host config on Foreman. Usually, if you have incompatible drivers, you can check dmesg | grep NVRM to see if there are any messages concerning driver compatibility. It will tell you what drivers are installed and what drivers the GPU actually needs. Otherwise, you can run the following commands to find which drivers fit your Nvidia graphics card. These commands require elrepo to be installed. Get elrepo here: http://elrepo.org/tiki/tiki-index.php or follow the elrepo rpm commands below.

To get elrepo:

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-6-8.el6.elrepo.noarch.rpm

To find compatible Nvidia drivers, install nvidia-detect (or update it if it is already installed) and then run it:

yum install nvidia-detect -OR- yum update nvidia-detect
nvidia-detect

A sample message in dmesg concerning incompatible graphics card and drivers (the GPU wants 340.xx legacy drivers but the 367.48 driver is installed):

[root@beatles ~]# dmesg | grep NVRM
NVRM: The NVIDIA GeForce 9800 GT GPU installed in this system is
NVRM:  supported through the NVIDIA 340.xx Legacy drivers. Please
NVRM:  visit http://www.nvidia.com/object/unix.html for more
NVRM:  information.  The 367.48 NVIDIA driver will ignore
NVRM:  this GPU.  Continuing probe...
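
The driver series named in such a message can be pulled out with sed. This sketch is tied to the wording of the sample above ("NVIDIA 340.xx Legacy"); other driver versions may phrase the message differently, and nvrm_legacy_series is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and print the legacy driver series
# from a line containing "NVIDIA <digits>.xx Legacy". Prints nothing if no line matches.
nvrm_legacy_series() {
    sed -n 's/.*NVIDIA \([0-9][0-9]*\)\.xx Legacy.*/\1/p' | head -n 1
}

# Usage:
#   dmesg | nvrm_legacy_series
```

For the sample message above this prints 340. Treat the result as a hint and confirm the package to install with nvidia-detect.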

If the wrong set of drivers is installed on your system, run these commands to uninstall the current drivers and install the ones reported by nvidia-detect:

yum remove <old drivers> (find the names of the old drivers by running yum list '*nvidia*' and looking at your installed packages)
yum install <name of package provided by nvidia-detect>