Hardware
Desktop machines
All desktop machines run OpenSuSE. See Installation instructions for OpenSuSE 13.1.
Clusters
Minotaur
- 38 nodes, each containing two 4-core processors (304 cores total). 8 GB memory per node.
Processor type: Intel Xeon E5472, 3.0 GHz.
- Jobs are scheduled via Torque/Maui. Notes on Torque.
Hydra
- 60 nodes, each containing two 6-core processors (720 cores total). 12 GB memory per node.
8 nodes (queue "fast", nodes h001-h008) have Intel Xeon X5690 3.47 GHz processors.
52 nodes (queue "default", nodes h009-h060) have Intel Xeon E5645 2.40 GHz processors.
- Jobs are scheduled via Torque/Maui. Notes on Torque.
- General Usage of Hydra
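As a rough illustration of how the two Hydra queues might be selected, the sketch below prints a minimal Torque job script. It is an assumption-laden example, not the lab's official template: the function name make_job_script, the job name example_job, and the command passed in are all hypothetical, and ppn=12 simply mirrors the 12 cores per node listed above. The Notes on Torque page remains the authoritative reference.

```shell
#!/bin/sh
# Sketch only: prints a Torque job script to stdout; nothing is submitted.
# make_job_script and example_job are hypothetical names used for illustration.
make_job_script() {
    # $1: queue ("fast" or "default" on Hydra)
    # $2: number of nodes
    # $3: command to run inside the job
    cat <<EOF
#!/bin/sh
#PBS -q $1
#PBS -l nodes=$2:ppn=12
#PBS -N example_job
cd \$PBS_O_WORKDIR
$3
EOF
}
```

For example, make_job_script fast 2 ./my_program > job.pbs writes a script that could then be submitted with qsub job.pbs (my_program being a placeholder for your own executable).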
Quest
- Jobs are scheduled via Torque/Moab. Notes on Torque.
- General Usage of Quest
Disk space, backups, and RAID storage
Disk space allocations and nightly backups
Each user has a home directory located on ariadne. This home directory is exported to all desktop machines, so that you see the same home filesystem on each machine. The drive is protected against hardware failure via a RAID-1 setup. Furthermore, each night all new or modified files on /home are written to tape (located in ariadne). This makes it important not to store temporary data in your home folder, as it would quickly fill up the tape. Since users tend to forget this, a quota system has been enabled on ariadne, restricting each user to 15 GB. To check how much space you are using, log on to ariadne and issue the command
quota -s
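If you want an early warning before hitting the limit, a small helper along the following lines may be useful. This is only a sketch: the function name quota_state and the 90% warning threshold are assumptions, the 15 GB limit is treated as 15 x 1024 x 1024 KB, and a figure obtained from du will not necessarily match what the quota system itself reports.

```shell
#!/bin/sh
# Hypothetical helper: classify a usage figure against the 15 GB home quota.
# LIMIT_KB assumes 1 GB = 1024 * 1024 KB; the 90% warning level is an assumption.
LIMIT_KB=$((15 * 1024 * 1024))

quota_state() {
    # $1: used space in KB
    used_kb=$1
    if [ "$used_kb" -ge "$LIMIT_KB" ]; then
        echo "over"
    elif [ "$used_kb" -ge $((LIMIT_KB * 9 / 10)) ]; then
        echo "warn"
    else
        echo "ok"
    fi
}

# Example use on ariadne (du can take a while on a large home directory):
#   quota_state "$(du -sk "$HOME" | cut -f1)"
```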
In addition, each user has significant additional storage on the scratch partitions. These drives are located in the different desktop machines and protected via RAID-1, but backups are your own responsibility. Note that these partitions are generally only mounted on the desktop machine that contains the corresponding drives. If you need a partition to be exported to a different machine, please ask.
Changing the nightly backup tape
- Press eject button on tape drive in ariadne.
- Take the tape cartridge out of the drive and put it in its box (should be on top of ariadne). Label the box. Give to Erik.
- Insert cleaning tape (on top of ariadne). It will work for less than a minute and then eject automatically.
- Put cleaning tape back in box on top of ariadne.
- Insert new DDS tape (find in cabinet). Leave empty box on top of ariadne.
- Erik: Update settings in /usr/local/lib/backup, namely position and tapenumber; update logfile.
Recovering data from the nightly backup tape
Log files of all nightly backup tapes are located on ariadne, in /usr/local/lib/backup. For privacy reasons, these logfiles are only accessible to root. Once the proper file to be recovered has been identified, insert the corresponding tape into the drive on ariadne and follow these steps (all to be executed as root):
- cd /
(if you change to a different directory, the recovered file will be placed relative to this directory)
- /usr/local/bin/tape-rewind
(or mtst -f /dev/nst0 rewind)
- mtst -f /dev/nst0 fsf <position>
(see the contents file in /usr/local/lib/backup for the position number)
- tar xzvf /dev/nst0 <full_file_name_without_leading_slash>
This step won't work unless you omit the leading slash; also note that you can specify multiple files, separated by spaces. The 'z' option is necessary because all nightly backups are compressed. For wildcards, use --wildcards and escape '*' and '?'. For example:
tar -x --wildcards -zvf /dev/nst0 \*datafiles\*
- /usr/local/bin/tape-rewoffl
(or mtst -f /dev/nst0 rewoffl)
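The recovery steps above can be sketched as a dry-run helper that only prints the commands, so they can be reviewed before being run as root. The function names strip_leading_slash and print_restore_commands are hypothetical, and the sketch assumes a single file name without spaces; it uses the mtst equivalents rather than the tape-rewind and tape-rewoffl wrappers.

```shell
#!/bin/sh
# Sketch only: prints the restore commands instead of running them.
TAPE=/dev/nst0   # tape device used in the steps above

strip_leading_slash() {
    # The member name given to tar must not start with a slash,
    # so remove one leading slash if present.
    printf '%s\n' "${1#/}"
}

print_restore_commands() {
    # $1: position number from the contents file in /usr/local/lib/backup
    # $2: full path of the file to recover
    position=$1
    member=$(strip_leading_slash "$2")
    echo "cd /"
    echo "mtst -f $TAPE rewind"
    echo "mtst -f $TAPE fsf $position"
    echo "tar xzvf $TAPE $member"
    echo "mtst -f $TAPE rewoffl"
}
```

For instance, print_restore_commands 3 /home/user/data.txt lists the five commands in order, with the tar line referring to home/user/data.txt (position 3 and the path being made-up values).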
Archiving data using the LTO tape drive
Checking RAID status
- Hydra
- OS is on software RAID (which spans /dev/sda and /dev/sdb). An overview is obtained via
cat /proc/mdstat
Detailed information via
mdadm --detail /dev/mdX
where X = 1, 5, 6, 7. Also see Setting up e-mail notifications for Linux Software RAID.
- /home
- /archive
- Minotaur
Web interface. Log in as root to head node and use opera.
- Ariadne
RAID-5 controller with 4 drives. Status can be checked by interrogating the controller:
/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL | less
In the 'Device Present' section, it is reported if any drives are critical or have failed, and what the state of the RAID is. More detailed information can also be found via
/opt/MegaRAID/MegaCli/MegaCli64 -LDPDInfo -aAll | less
Directly at the beginning (under 'Adapter #0') it should report 'State: Optimal'.
- Desktop machines, except pelops
Hardware RAID-1. The RAID status is reported upon reboot of a machine. Press Ctrl-C (when prompted) to enter the configuration utility. From within Linux, use (as root):
mpt-status -i 0
mpt-status -i 2
The second command only applies to machines with a second set of hard drives (achilles, agamemnon, nestor, poseidon).
To allow regular users to verify the RAID status, the mpt-status command has been added to sudo:
sudo mpt-status -i 0
sudo mpt-status -i 2
- Pelops: Software RAID (for OS and scratch partitions). See Hydra.
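For the software RAID machines (Hydra and pelops), the /proc/mdstat check can be scripted. The sketch below is one possible approach, not an established lab tool: the function name degraded_arrays is hypothetical, and it assumes arrays are named mdN and that a missing member shows up as an underscore in the status field (for example [U_] instead of [UU]).

```shell
#!/bin/sh
# Sketch only: list software RAID arrays that appear degraded.
degraded_arrays() {
    # Reads mdstat-formatted text on stdin and prints the name of every
    # array whose member status field contains an underscore.
    awk '
        /^md[0-9]+ :/ { name = $1; next }       # remember the current array
        /\[[U_]*_[U_]*\]/ { print name }        # status such as [U_] or [_U]
    '
}

# Example use:
#   degraded_arrays < /proc/mdstat
```

No output means that no array matched the degraded pattern; mdadm --detail still gives the full picture for a specific array.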
Printers
There are two black-and-white laser printers (PS-1 and PS-2) in the lab, both supporting double-sided printing. Currently, PS-1 is not working properly. To access PS-2 wirelessly, add a new printer in your OS with the address "luijten-ps2.ms.northwestern.edu". The default protocol should be "Line Printer Daemon - LPD".
Scanner
UPS
All our UPS units are manufactured by APC, and supported via apcupsd. Installation & configuration instructions:
- Make sure the apcupsd package is installed, see Installation instructions for OpenSuSE 13.1.
- Connect UPS unit to USB port of the corresponding machine.
- In /etc/apcupsd/apcupsd.conf edit these lines:
UPSCABLE usb
UPSTYPE usb
Also, comment out the DEVICE line.
- From command line, do
chkconfig apcupsd on
- Start the daemon manually:
apcupsd
- Test it:
apcaccess
This should produce extensive output regarding the UPS unit.
(Note: this command also works for regular users; in that case use /usr/sbin/apcaccess.)
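Since apcaccess prints one "NAME : value" pair per line, a single field can be pulled out with a short filter. The following is a sketch under that assumption; the function name ups_field is hypothetical, and STATUS and BCHARGE are given only as examples of fields the output typically contains.

```shell
#!/bin/sh
# Sketch only: extract one field from apcaccess-style "NAME : value" output.
ups_field() {
    # $1: field name (e.g. STATUS); the status report is read from stdin.
    awk -v key="$1" '
        {
            idx = index($0, ":")                 # split at the first colon
            if (idx == 0) next
            name = substr($0, 1, idx - 1)
            value = substr($0, idx + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }
    '
}

# Example use:
#   /usr/sbin/apcaccess | ups_field STATUS
```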