HPC @ SAO
S.G. Korzennik - SAO HPC Analyst
hpc@cfa
February 2013
SGK (hpc@cfa) HPC @ SAO February 2013 1 / 33
Outline
◮ Results of the survey
◮ What is HYDRA
◮ How to use HYDRA
◮ Answers to some survey questions
◮ Discussion: h/w, s/w, other
Introduction
What is HYDRA
◮ HYDRA is a Linux-based Beowulf cluster.
◮ Started at SAO a while back, managed by the CF.
◮ Moved to the Smithsonian's Data Center in Herndon, VA.
◮ Managed by SI's Office of Information Technology Operations (OITO/OCIO).
◮ Has grown from a 200+ to a 3,000+ core machine.
◮ Has become an SI-wide resource.
◮ The cluster is managed by DJ Ding, sys-admin (in Herndon, VA).
◮ Additional support for SAO: HPC analyst (0.25 FTE).
What is HYDRA: Hardware
◮ 296 compute nodes, distributed over 10 racks.
◮ Total of 3,116 compute cores (CPUs).
◮ All the nodes are interconnected on regular Ethernet (1 Gbps).
◮ Some nodes are on an InfiniBand (40 Gbps) fabric (856 cores on IB).
◮ Some 40 TB of public disk space (56% full).
◮ Comparable user-specific disk space (individual purchases).
◮ A parallel file system (60 TB), leveraging the IB fabric, is in the works.

CF/HPC web page: http://www.cfa.harvard.edu/cf/services/cluster
The hardware configuration is described on the HPC Wiki:
https://www.cfa.harvard.edu/twiki/bin/view/HPC/WebHome
What is HYDRA: Software
◮ The cluster is a Linux-based distributed cluster, running ROCKS.
◮ Uses the GRID ENGINE queuing system (aka SGE, OGE or GE).
◮ Access to the cluster is via 2 login nodes:
  ◮ hydra.si.edu, or
  ◮ hydra-login.si.edu
◮ From one of the login nodes:
  ◮ you submit jobs via the queuing system: qsub
  ◮ all jobs run in batch mode
  ◮ you do not start jobs interactively on any of the compute nodes;
  ◮ instead, you submit a script and request resources,
  ◮ and the GE selects the compute nodes and starts your job on that/these node(s).
◮ The login nodes are for normal interactive use only: editing, compiling, script writing, short tests.
What is HYDRA: Software
◮ Compilers (3)
  ◮ GNU: gcc, f77
  ◮ Intel: icc, icpc, ifort, including the Cluster Studio
  ◮ PGI (Portland Group): pgcc, pgCC, pgf77, pgf90
◮ Libraries
  ◮ MPI for all compilers, including InfiniBand support
  ◮ Math libraries that come w/ the compilers
  ◮ AMD math libraries
◮ Packages
  ◮ IDL, including 128 run-time licenses; GDL
  ◮ IRAF (v1.7)

If you need some specific software, or believe that some additional software would benefit the user community, let me know (hpc@cfa).
What is HYDRA: Documentation
◮ Primers
  ◮ How to compile
  ◮ How to submit jobs
  ◮ How to monitor your jobs
  ◮ How to use IDL & GDL on the cluster
  ◮ How to copy files to/from the cluster, and what disk(s) to use
◮ FAQs
  ◮ Queues
  ◮ Error Messages
  ◮ Compilers
  ◮ Disk Use
  ◮ S/W packages
◮ Man Pages for the Grid Engine Commands

https://www.cfa.harvard.edu/twiki/bin/view/HPC/WebHome
HYDRA: Connect, Support, Disks, Copy, Compile and Submit
◮ Connect to HYDRA
  ◮ Must ssh to one of HYDRA's login nodes from a trusted host.
  ◮ Accounts and directories are separate from CF/HEA.
  ◮ SD 931 enforced (incl. password expiration).
◮ Support for HYDRA
  ◮ DJ Ding - sys-admin (OITO/OCIO)
  ◮ SGK - HPC analyst (SAO)
  ◮ HPCC-L on si-listserv.si.edu (mailing list on SI's list-serv)
  ◮ Do not contact CF/HEA support (except for requesting an account).
  ◮ Contact the OCIO Help Desk (OCIOHelpDesk@si.edu) for password resets or other access issues.
◮ Configuration is different from the CF/HEA systems
  ◮ Must customize ∼/.cshrc, ∼/.bash_profile & ∼/.bashrc
  ◮ Look under ∼hpc
HYDRA: Connect, Support, Disks, Copy, Compile and Submit
◮ Disks on HYDRA
  ◮ /home - is not for data storage.
  ◮ Public space, on a first-come, first-served basis:
    ◮ /pool/cluster* (34 TB over 7 file systems, half used, no scrubber)
    ◮ /pool/temp* (8 TB over 2 file systems, 10% used, 14-day scrubber)
    ◮ /pool/cluster* is not for long-term storage.
  ◮ Parallel file system: under development (PVFS: 60 TB).
  ◮ User-specific storage: possible.
  ◮ Local disk space (on the compute nodes): uneven & discouraged.
  ◮ Not cross-mounted to CfA.
◮ Copy to/from HYDRA
  ◮ Use scp, sftp or rsync.
  ◮ Use rsync --bwlimit=1000 for large transfers (> 10 GB):

      rsync --bwlimit=1000 -azv * hydra:/pool/cluster2/user/mydata/.

  ◮ Serialize, or limit the number of, heavy I/O operations: cp, mv, rm, scp, rsync, ...
  ◮ Public keys (ssh-keygen) are OK: no password needed w/ ssh or rsync.
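Setting up a public key for password-less ssh/rsync is a one-time, two-step process. A minimal sketch, run from your trusted local host (the key type and login node name shown here are just one common choice):

```shell
# On your local (trusted) host: generate a key pair, if you don't already have one.
# Accept the default file name; a passphrase is optional but recommended.
ssh-keygen -t rsa

# Append your public key to ~/.ssh/authorized_keys on HYDRA
# (ssh-copy-id does the remote append for you; you type your password once here).
ssh-copy-id hydra.si.edu

# Subsequent ssh, scp and rsync connections no longer prompt for a password.
ssh hydra.si.edu hostname
```

This is what makes unattended rsync transfers (e.g. from a cron job) practical.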
HYDRA: Connect, Support, Disks, Copy, Compile and Submit
◮ Compilers (3): GNU, PGI, Intel
  ◮ Cannot mix and match compilers/libraries.
  ◮ The same 3 compilers are available on the CF-managed machines.
  ◮ MPI and IB support available for all 3; can be tricky.
  ◮ OpenMP available for all 3 (de facto h/w limit).
◮ Submit jobs
  ◮ qsub  : submit a job
  ◮ qstat : monitor job(s)
  ◮ qalter: change job resource(s)
  ◮ qdel  : delete job(s)
  ◮ qconf : query queue(s) configuration
  ◮ qacct : query queue(s) accounting (used resources)
◮ MPI, IB & OpenMP: must use
  ◮ the appropriate compiler flags and libs,
  ◮ the corresponding execs, scripts and queues.
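The job-control commands above fit together as a simple lifecycle. A sketch (the job id 4539322 and script name are illustrative; these commands only run on HYDRA, under Grid Engine):

```shell
qsub -N myjob myjob.job           # submit: GE replies with a job id, e.g. 4539322
qstat -u $USER                    # monitor: is the job queued (qw) or running (r)?
qalter -l s_cpu=12:00:00 4539322  # change the CPU-time request of a pending job
qdel 4539322                      # give up: remove the job from the queue
qacct -j 4539322                  # after completion: what resources did it actually use?
```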
HYDRA: Trivial example
hydra% pwd
/home/user/test
hydra% cat hello.c
#include <stdio.h>
int main() { printf("hello world!\n"); }
hydra% pgcc -o hello hello.c
hydra% cat hello.job
# using csh syntax (default)
echo hello.job started `date` in queue $QUEUE \
  with jobid=$JOB_ID on `hostname`
uptime
pwd
./hello
echo hello.job done `date`
hydra% ls
hello  hello.c  hello.job
HYDRA: Trivial example
hydra% qsub -cwd -j y -o hello.log -N hello hello.job
Your job 4539322 ("hello") has been submitted
hydra% qstat -u user
job-ID   prior  name   user  state  submit/start at      queue                     slots  ja-task-ID
4539322         hello  user  qw     01/10/2013 18:01:40                            1
hydra% qstat -u user
job-ID   prior  name   user  state  submit/start at      queue                     slots  ja-task-ID
4539322         hello  user  r      01/10/2013 18:01:53  sTz.q@compute-1-29.local  1
hydra% ls
hello  hello.c  hello.job  hello.log
hydra% cat hello.log
hello.job started Thu Jan 10 18:01:53 EST 2013 in queue sTz.q with jobid=4539322 on compute-1-29.local
 18:01:53 up 211 days, 29 min, 0 users, load average: 0.00, 0.00, 0.00
/home/user/test
hello world!
hello.job done Thu Jan 10 18:01:54 EST 2013
HYDRA: Too Many Choices, Queues and Limits
◮ Examples on the Wiki and in ∼hpc/tests.
◮ Typical job:
◮ Job array: a slew of cases using the same script.
◮ Optimization, namespace, I/O load, throughput & scalability.
◮ Monitor progress ⇒ trade-off.
◮ Best to request resource(s) rather than to hard-wire name(s).
◮ Consider checkpointing.
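A job array is worth a sketch: one script, one qsub, many tasks. A minimal example, assuming hypothetical input files input.1.dat ... input.100.dat and a user program ./process (GE sets $SGE_TASK_ID in each task, and expands $TASK_ID in the -o path):

```shell
# myarray.job -- one task per input file
#$ -cwd -j y -o myarray.$TASK_ID.log
#$ -N myarray
#
# each task picks its own input file from its task id
INPUT=input.$SGE_TASK_ID.dat
echo task $SGE_TASK_ID of job $JOB_ID processing $INPUT
./process $INPUT
```

submitted with:

  hydra% qsub -t 1-100 myarray.job

The queuing system then runs 100 instances of the script, each with its own $SGE_TASK_ID, scheduling them as slots free up.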
HYDRA: Too Many Choices, Queues and Limits
◮ Request resources with -l (repeat as many times as needed):

  request with        to get
  -l s_data=2G        a memory use limit (2 GB of memory use)
  -l s_vmem=2G        a virtual memory use limit (2 GB of virtual memory use)
  -l mem_free=2G      a host with free memory (2 GB of free memory)
  -l s_cpu=2:00:00    a cpu time limit (two hour cpu limit)
  -l s_rt=2:00:00     an elapsed time limit (two hour real-time limit)

◮ Embedded directives:

  hydra% qsub -cwd -j y -o hello.log -N hello \
      -l s_cpu=48:00:00 -l s_data=2G hello.job

can be simplified by embedding the flags near the top of the job file:

  # using csh syntax (default)
  #
  #$ -cwd -j y -o hello.log
  #$ -N hello
  #$ -l s_cpu=48:00:00 -l s_data=2G
  #
  echo hello.job started `date` in queue $QUEUE \
    with jobid=$JOB_ID on `hostname`
  ...

and using, for example:

  hydra% qsub -l s_cpu=24:00:00 -N short_hello hello.job

Flags added on the qsub command line overwrite the embedded value(s).
HYDRA: Too Many Choices, Queues and Limits
◮ Examples:
  ◮ The compiler, the script and which mpirun you use must match.
  ◮ More examples and details on the Wiki and in ∼hpc/tests.
HYDRA: Too Many Choices, Queues and Limits
hydra% cd ∼hpc/tests/mpi/pgi/tcp
hydra% cat hello.c
/* MPI hello world C Example */
#include <stdio.h>
#include <mpi.h>
int main (int argc, char *argv[])
{
  int rank, size;
  MPI_Init (&argc, &argv);                /* starts MPI */
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);  /* get current process id */
  MPI_Comm_size (MPI_COMM_WORLD, &size);  /* get number of processes */
  printf ("Hello world from process %d of %d\n", rank, size);
  MPI_Finalize ();
  return 0;
}
HYDRA: Too Many Choices, Queues and Limits
hydra% cat Makefile
#
# pgi location
PGI   = /share/apps/pgi
# pgi version
PGIx  = $(PGI)/linux86-64/12.5
# mpi location
MPICH = $(PGIx)/mpi/mpich
#
# flags
CFLAGS = -O
MFLAGS = -I$(MPICH)/include -I$(MPICH)/include/f90choice
#
# compiler/linker
CC    = pgcc $(CFLAGS) $(MFLAGS) $(IFLAGS)
MPICC = $(MPICH)/bin/mpicc $(CFLAGS)
#
# ---------------------------------------------------------------------------
#
hello: hello.o
	$(MPICC) -o $@ hello.o
hydra% make hello
pgcc -I/share/apps/pgi/linux86-64/12.5/mpi/mpich/include \
     -I/share/apps/pgi/linux86-64/12.5/mpi/mpich/include/f90choice -O -c -o hello.o hello.c
/share/apps/pgi/linux86-64/12.5/mpi/mpich/bin/mpicc -o hello hello.o
HYDRA: Too Many Choices, Queues and Limits
hydra% cat hello.job
#
#$ -cwd -j y
#$ -N hello -o hello.log
#$ -pe mpich 8
#
echo `date` job=$JOB_NAME started on `hostname` in $QUEUE with id=$JOB_ID
echo NSLOTS=$NSLOTS
echo $TMPDIR/machines
cat -n $TMPDIR/machines
set MPICH = $PGI/linux86-64/12.5/mpi/mpich
$MPICH/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines hello
echo = `date` done
hydra% qsub hello.job
Your job 4580421 ("hello") has been submitted
hydra% qstat -u hpc
job-ID   prior  name   user  state  submit/start at      queue                     slots  ja-task-ID
4580421         hello  hpc   qw     01/14/2013 15:34:31                            8
hydra% qstat -u hpc
job-ID   prior  name   user  state  submit/start at      queue                     slots  ja-task-ID
4580421         hello  hpc   r      01/14/2013 15:34:38  sTN.q@compute-3-32.local  8
HYDRA: Too Many Choices, Queues and Limits
hydra% cat hello.log
compute-3-32
compute-3-32
compute-3-32
compute-3-32
compute-3-32
compute-5-10
compute-5-10
compute-5-10
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
+ Mon Jan 14 15:34:39 EST 2013 job=hello started on compute-3-32.local in sTN.q with id=4580421
NSLOTS=8
/tmp/4580421.1.sTN.q/machines
     1  compute-3-32
     2  compute-3-32
     3  compute-3-32
     4  compute-3-32
     5  compute-3-32
     6  compute-5-10
     7  compute-5-10
     8  compute-5-10
Hello world from process 7 of 8
Hello world from process 5 of 8
Hello world from process 3 of 8
Hello world from process 1 of 8
Hello world from process 6 of 8
Hello world from process 4 of 8
Hello world from process 2 of 8
Hello world from process 0 of 8
= Mon Jan 14 15:34:44 EST 2013 done
HYDRA: Too Many Choices, Queues and Limits
hydra% qstat -g c
CLUSTER QUEUE  CQLOAD  USED  RES  AVAIL  TOTAL  aoACDS  cdsuE
all.q            0.09     0    0      8   3112       0   3104
lTN.q            0.09   124    0   2876   3112       0    112
lTNi.q           0.13     0    0    856    856       0      0
lTz.q            0.09    72    0   2928   3112       0    112
mTN.q            0.09     0    0   3000   3112       0    112
mTNi.q           0.13     0    0    856    856       0      0
mTz.q            0.09     1    0   3000   3116       0    116
sTN.q            0.09     0    0   3072   3112       0     40
sTNi.q           0.13     0    0    856    856       0      0
sTz.q            0.09     2    0   3096   3116       0     20
uTz.q            0.09    81    0   2928   3116       0    108
xPVFS.tq         0.00     0    0     96     96       0      0

◮ all.q: always disabled.
◮ *.tq: test queue(s), restricted access.
◮ ∼hpc/sbin/q+: a PERL wrapper around qstat; try ∼hpc/sbin/q+ -help.
HYDRA: Too Many Choices, Queues and Limits
hydra% qconf -srqs
{
   name         max_user_slots
   description  Limit slots/user for all queues
   enabled      TRUE
   limit        users {*} to slots=1400
}
{
   name         max_z_slots
   description  Limit slots/user in serial queues
   enabled      TRUE
   limit        users {*} queues {sTz.q} to slots=768
   limit        users {*} queues {mTz.q} to slots=384
   limit        users {*} queues {lTz.q} to slots=192
   limit        users {*} queues {uTz.q} to slots=48
}
{
   name         max_xTN_slots
   description  Limit slots/user for non-IB parallel queues (small/medium/long-T N)
   enabled      TRUE
   limit        users {*} queues {sTN.q} to slots=768
   limit        users {*} queues {mTN.q} to slots=384
   limit        users {*} queues {lTN.q} to slots=192
}
{
   name         max_xTNi_slots
   description  Limit slots/user for IB parallel queues (small/medium/long-T Ni)
   enabled      TRUE
   limit        users {*} queues {sTNi.q} to slots=384
   limit        users {*} queues {mTNi.q} to slots=192
   limit        users {*} queues {lTNi.q} to slots=96
}
HYDRA: Too Many Choices, Queues and Limits
hydra% qconf -sq \?Tz.q | egrep 'name|_rt|_cpu|_vmem'
hydra% qconf -sq \?TN.q | egrep 'name|_rt|_cpu|_vmem'

The per-queue limits (identical for the Tz and TN variants):

  qname          s_rt       h_rt       s_cpu      h_cpu      s_vmem  h_vmem
  sTz.q/sTN.q      6:00:00    6:15:00    3:00:00    3:15:00   16G     20G
  mTz.q/mTN.q     72:00:00   72:15:00   36:00:00   36:15:00   16G     20G
  lTz.q/lTN.q    864:00:00  864:15:00  432:00:00  432:15:00   16G     20G
  uTz.q           INFINITY   INFINITY   INFINITY   INFINITY   16G     20G

◮ CPU limits follow a 12× progression: 3h, 36h, 18d (432h).
◮ R/T (real-time) limit = 2 × CPU limit.
More on the Wiki
◮ More stuff and more details on the Wiki.
◮ RTFM, but . . .
◮ . . . if that fails, don't hesitate to email me (hpc@cfa).
◮ Limits can be reconfigured, if needed . . .
◮ . . . while maintaining fair use.
◮ ???
Answers to Some Survey Questions
Q7 If you have considered using HYDRA, why have you not used it yet?
Ibidem.
Indeed.
processing ALMA data. Eventually.
If it will run under Linux, go for it.
space available. In my quick look through the documentation, I didn’t see what was available for an individual user. A lot is now documented on the Wiki.
Try it. Hydra is stable, documented and supported.
Answers to Some Survey Questions
Q8 If you have stopped using HYDRA, why?
now only useful for embarrassingly parallel computing (multiple serial jobs). Limits are reasonable and can be increased upon request.
much faster. However, XSEDE allocations process can be a hassle and I really want to get code working on hydra efficiently, if it is possible, so can use these available ’in house’, i.e. SI resources. Zombie problem remains a mystery, seen only by a couple of users.
Answers to Some Survey Questions
Q9 What would you need to use HYDRA or make better use of HYDRA?
Wiki? Wiki!
if 3rd party software packages can be made available (e.g., CIAO). Wiki? Wiki! CIAO: if you can build it, you can use it.
Just go for it.
Elaborate?
Elaborate?
need significant storage and space for analysis of very large simulation data sets. Not sure if that is already part of the system. See above. Disk space is available, could be expanded.
a short time or a small number of cpus for a long time. I realise that this is due to prevent users to monopolise the system, but it should be possible to exceed the queues for a limited amount of time. See above. If efficient at 256 in the mTN.q (36h), and limited by the 192 in lTN.q (18d) things can be adjusted.
Answers to Some Survey Questions
Q10 What would you like to see as improvements to HYDRA or some other HPC resource?
Elaborate.
See above: available in the mTN.q...
first, then download them to hydra. Can we securely and directly download/upload data from outside to HYDRA? Not a problem: pull the data from HYDRA.
The End / Discussion