There are three kinds of compute nodes in the cluster: General Computing Nodes, GPU Nodes, and Petascale Data Analysis Facility (PDAF) Nodes. The current specifications for each type of node are as follows:
| General Computing Nodes | |
|---|---|
| Processors | Dual-socket, 8-core, 2.6GHz Intel Xeon E5-2670 (Sandy Bridge) |
| Memory | 64GB (4GB/core) (128GB memory optional) |
| Network | 10GbE (QDR InfiniBand optional) |
| Hard Drive | 500GB onboard (second hard drive or SSD optional) |
| Warranty | 3 years |

| GPU Nodes | |
|---|---|
| Host Processors | Dual-socket, Intel Xeon E5-26xx (clock rate and core count TBD) (Sandy Bridge) |
| GPUs | NVIDIA GeForce GTX680 or Tesla K20 (count TBD) |
| Memory | 32GB |
| Network | 10GbE (QDR InfiniBand optional) |
| Hard Drive | 500GB + 240GB SSD |
| Warranty | 3 years |

| PDAF (shared resource; pay-as-you-go only) | |
|---|---|
| Processors | 8-socket, 4-core AMD Shanghai Opteron |
| Memory | 512 GB |
| Network | 10 GbE |
(RCI will annually update the hardware choices for general computing and GPU condo purchasers, to stay abreast of technology/cost advances.)
Nodes with the QDR InfiniBand (IB) option will plug into 32-port IB switches, allowing up to 512 cores (32 nodes of 16 cores each) to communicate at full bisection bandwidth for low-latency parallel computing.
TSCC users will receive 100GB of home file storage and shared access to the >200TB Data Oasis Lustre high performance file system. (There is a 90-day purge policy on Data Oasis for normal usage.) Additional persistent storage can be mounted from lab file servers over the campus network or can be purchased from SDSC.
Each year, condo cluster participants receive an amount of cluster runtime proportional to the capabilities of their purchased nodes. For example, a participant that purchases eight general computing nodes will receive just under 1.1M Service Units (1 SU = 1 core-hour) of time, which represents 24x365 usage of 128 cores, allowing for 3% maintenance downtime. These core-hours can be used any time during the year on any of the computing nodes; however, core-hours that condo participants leave unused expire at the end of each year.
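The eight-node example can be checked with a few lines of Python. This is only a sketch of the arithmetic described above; the function and constant names are illustrative and not part of any TSCC tool, and the 16 cores per node come from the dual-socket, 8-core general computing node specification.

```python
CORES_PER_NODE = 16           # dual-socket, 8-core general computing node
HOURS_PER_YEAR = 24 * 365     # 24x365 usage, as in the text
MAINTENANCE_FRACTION = 0.03   # 3% maintenance downtime, as in the text


def annual_service_units(nodes, cores_per_node=CORES_PER_NODE,
                         maintenance_fraction=MAINTENANCE_FRACTION):
    """Return the yearly SU allocation (1 SU = 1 core-hour) for purchased nodes."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    cores = nodes * cores_per_node
    return cores * HOURS_PER_YEAR * (1.0 - maintenance_fraction)


# Eight nodes -> 128 cores; the result is a little under 1.1 million SUs.
print(round(annual_service_units(8)))
```

For eight nodes this gives roughly 1.09 million SUs, consistent with the "just under 1.1M" figure.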
Condo participants' jobs that require a number of cores less than or equal to their purchased nodes are guaranteed to start within eight hours of submission and can run for an unlimited amount of time. Jobs that extend into the cluster hotel nodes have a 72-hour time limit and share a queue with the jobs of pay-as-you-go users, while jobs that extend to other participants’ condo nodes have an eight-hour time limit. Condo participants may submit gleaning jobs to run on idle computing nodes. These jobs are not charged against the submitter's SU balance, but they may be terminated at any time by the scheduler if the nodes where they are running are needed to run higher-priority jobs.
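The run-time limits for condo jobs can be summarized as a small decision function. This is a hypothetical sketch of the policy as stated above, not the scheduler's actual configuration: the function name and parameters are invented for illustration, gleaning jobs are not modeled, and a job that extends onto other participants' condo nodes is assumed to get the eight-hour limit.

```python
from typing import Optional


def wall_clock_limit_hours(cores_requested: int,
                           purchased_cores: int,
                           uses_other_condo_nodes: bool = False) -> Optional[int]:
    """Return the run-time limit in hours, or None when the run time is unlimited."""
    if cores_requested < 1:
        raise ValueError("cores_requested must be at least 1")
    if cores_requested <= purchased_cores:
        # Fits within the participant's own purchased nodes: no time limit.
        return None
    if uses_other_condo_nodes:
        # Assumption: spilling onto other participants' condo nodes
        # triggers the eight-hour limit.
        return 8
    # Otherwise the job extends into the hotel nodes.
    return 72


print(wall_clock_limit_hours(64, 128))         # within purchased capacity
print(wall_clock_limit_hours(256, 128))        # extends into hotel nodes
print(wall_clock_limit_hours(256, 128, True))  # extends onto other condo nodes
```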
Because the capabilities and purpose of the GPU nodes differ significantly from the general computing nodes, SUs received for contributed GPU nodes and general computing nodes cannot be interchanged.
Pay-as-you-go (hotel) users’ jobs can only run on the hotel nodes. Initially there will be 40 general computing nodes (640 cores); additional nodes may be added based on demand. Hotel nodes are configured with 64 GB of memory and an IB interface. The general computing nodes will be allocated per core, allowing up to 16 jobs to run on each node simultaneously. Hotel jobs can also run on the large-memory PDAF nodes; the PDAF nodes will allow a maximum of two jobs per node (i.e., 256GB or 512GB of memory per job), meaning that use of either 16 or 32 cores will be charged to each job.
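The PDAF charging rule can be sketched as follows. The node size (32 cores, 512 GB) comes from the PDAF specification table; the function name is illustrative, and the rule that a request exceeding either half of the node is charged for the whole node is an assumption drawn from the two-jobs-per-node limit rather than a documented policy.

```python
PDAF_CORES = 32        # 8 sockets x 4 cores
PDAF_MEMORY_GB = 512   # total memory per PDAF node


def pdaf_charged_cores(cores_requested, memory_gb_requested):
    """Return the number of cores (16 or 32) charged for one PDAF job."""
    if cores_requested < 1 or memory_gb_requested <= 0:
        raise ValueError("request must be positive")
    if cores_requested > PDAF_CORES or memory_gb_requested > PDAF_MEMORY_GB:
        raise ValueError("request exceeds a single PDAF node")
    half_cores = PDAF_CORES // 2
    half_memory = PDAF_MEMORY_GB // 2
    # Assumption: a job that fits in half a node is charged for half of it;
    # anything larger in cores or memory is charged for the whole node.
    if cores_requested <= half_cores and memory_gb_requested <= half_memory:
        return half_cores
    return PDAF_CORES


print(pdaf_charged_cores(4, 200))   # small job, fits in half a node
print(pdaf_charged_cores(4, 300))   # needs more than 256GB of memory
```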
Participants who contribute to the cluster will have priority access to the nodes that they contribute. In addition, they have the option to run jobs on additional cluster nodes when available, effectively increasing their computing capability and flexibility. The TSCC allows participating researchers access to additional cycles during times of peak use, through a model that pools computing resources. During times of intense research, this approach provides participants with far more computational power than they would have if running only on their own hardware. It also supports running jobs at a higher core count than when restricted to their own nodes.
Condo purchasers have the option of taking possession of their nodes at any time; however, once equipment is removed, it cannot later be returned to TSCC.
After the expiration of the three-year warranty, condo participants may leave their contributed nodes in the TSCC for another year, as long as the nodes remain operational. At the end of four years, participants must take possession of or surplus their equipment.
The TSCC will run the CentOS v6.2 operating system. PGI and Intel compilers will be available, as will mvapich2 and openmpi. Over 50 additional software applications and libraries will be installed on the system, and the administrators will be happy to work with researchers to extend this set as time/costs allow.
This is a partial list, based on current software from Triton Resource. The TSCC software installation is still being assembled; once it is complete, this list will be updated with specifics for install location, version, topic area, license type, website documentation, and host node type.
| Package | Topic Area | Version | License Type | Package Home Page | User Install Location | Installed on: (L)ogin, (C)ompute, (B)oth |
|---|---|---|---|---|---|---|
| APBS (Adaptive Poisson-Boltzmann Solver) | Bioinformatics | 1.3 | BSD, MIT | APBS Home Page | /opt/apbs | C |
| BEAST | Bioinformatics, Phylogenetics | 1.7.4 | GNU LGPL | BEAST Home Page | /opt/beast | C |
| BLAT | Bioinformatics, Genetics | 35 | Free Non-commercial | BLAT User Guide | /opt/biotools/blat | B |
| bbFTP | Large Parallel File Transfer | 3.2.0 | GNU GPL | bbFTP Home Page | /opt/bbftpc | B |
| Bowtie Short Read Aligner | Bioinformatics | 0.12.9 | GNU GPL | Bowtie Home Page | /opt/biotools/bowtie | B |
| Burrows-Wheeler Aligner (BWA) | Bioinformatics | 0.6.2 | GNU GPL | BWA Home Page | /opt/biotools/bwa | B |
| Cilk | Parallel Programming | 5.4.6 | GNU GPL | Cilk Home Page | /opt/cilk | B |
| CP2K | Physics, Chemistry, Biology | 2.3 | GPL | CP2K Home Page | /opt/cp2k | B |
| CPMD | Molecular Dynamics | 3.15.3 | Free for noncommercial research | CPMD Home Page | /opt/cpmd/bin | B |
| DDT | Graphical Parallel Debugger | 3.2 | Licensed | DDT Home Page | /opt/ddt | B |
| FFTW | General | 3.3.3 | GNU GPL | FFTW Home Page | /opt/fftw/3.2.1/intel –or– /opt/fftw/3.2.1/pgi –or– /opt/fftw/3.2.1/gnu | B |
| FPMPI | MPI Programming | 2.2 | Licensed | FPMPI Home Page | TBD | TBD |
| GAMESS | Chemistry | 5.2012 | No-cost Site License | GAMESS Home Page | /opt/gamess | B |
| Genome Analysis Toolkit (GATK) | Bioinformatics | 2.3.9 | BSD Open Source | GATK Home Page | /opt/biotools/GenomeAnalysisTK | B |
| IDL | Visualization | 8.2.1 | Licensed | IDL Home Page | /opt/idl | B |
| IPython | Parallel Computing | 0.13.1 | BSD | IPython Home Page | /opt/ipython | B |
| matplotlib | Python Graphing Library | 1.1.1 | PSF (Python Software Foundation) | matplotlib Home Page | /opt/scipy/lib64/python2.4/site-packages | B |
| LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) | Molecular Dynamics Simulator | 12Jan13 | GPL | LAMMPS Home Page | /opt/lammps | C |
| MATLAB | Parallel Development Environment | 7.9 | Licensed | MATLAB Home Page | /home/beta/matlab.2011a –and– /home/beta/matlab_server_2012b | B |
| openmpi | Parallel Library | 1.6.3 | Generic | Open MPI Home Page | /opt/openmpi/intel/mx –or– /opt/openmpi/pgi/mx –or– /opt/openmpi/gnu/mx | B |
| NAMD | Molecular Dynamics, BioInformatics | 2.9 | Non-Exclusive, Non-Commercial Use | NAMD Home Page | /opt/namd | C |
| NCO | NetCDF Support | 4.2.3 | Generic | NCO Home Page | /opt/nco/intel –or– /opt/nco/pgi –or– /opt/nco/gnu | B |
| NetCDF | General | 4.2.1.1 | Licensed (free) | NetCDF Home | Environment Module | B |
| NoSE (Network Simulation Environment) | Networking | 1.2.1 | GNU GPL | NoSE Home | Perl Module | B |
| NumPy (Numerical Python) | Scientific Calculation | 1.6.2 | BSD | NumPy Home | /opt/scipy/lib64/python2.4/site-packages | B |
| NWChem | Chemistry | 6.1.1 | EMSL (free) | NWChem Home Page | /opt/nwchem | B |
| PyFITS | Astrophysics | 3.1.1 | BSD | PyFITS Home Page | /opt/scipy/lib64/python2.4/site-packages | B |
| Python | General Scripting | 2.7 | BSD | Python Home Page | /opt/python/bin | B |
| pytz | Python TimeZone Module | 2012j | MIT | PyTZ Home Page | /opt/scipy/lib64/python2.4/site-packages | B |
| R | Statistical Computing and Graphics | 2.15.2 | GNU GPL | R Home Page | /opt/R/bin | C |
| SciPy (Scientific Python) | Scientific Computing | 0.11.0rc1 | BSD | SciPy Home Page | /opt/scipy/lib64/python2.4/site-packages | B |
| TAU | Tuning and Analysis Utilities | 2.22.p1 | GNU GPL | TAU Home Page | /opt/tau/intel –or– /opt/tau/pgi –or– /opt/tau/gnu | B |

System Software

| Package | Topic Area | Version | License Type | Package Home Page | User Install Location | Installed on: (L)ogin, (C)ompute, (B)oth |
|---|---|---|---|---|---|---|
| Lustre | Scalable File System | 1.8 | GNU GPL | Lustre Home Page | /opt/lustre | B |
| Maui | Workload Scheduler | | GNU Lesser GPL | | /opt/maui | L |
| xen | N/A | TBD | Generic | xen Home Page TBD | /opt/? | B |
| CentOS | Operating System | 5.3 | Open Source | CentOS Home Page | N/A | B |
| Environment Modules | Environment Variable Management | 3.2.7 | GNU GPL | Environment Modules Home Page | /opt/modules | B |
| Ganglia | N/A | 2.5.7 | Open Source | Ganglia Home Page | /opt/ganglia | B |
| Nagios | N/A | 3.4.3 | Open Source | Nagios Home Page | /opt/nagios | L |
| mvapich2 | Message Passing Interface | 1.9a2 | Open Source | TBD | TBD –or– TBD –or– TBD | B |
| TORQUE | Resource Manager | 2.3.6 | Open Source | TORQUE Home Page | /opt/torque | B |
| Gold | Allocation Manager | 2.2.0.5 | Open Source | Gold Home Page | /opt/gold | L |

Compilers

| Package | Topic Area | Version | License Type | Package Home Page | User Install Location | Installed on: (L)ogin, (C)ompute, (B)oth |
|---|---|---|---|---|---|---|
| Java | Compiler | 1.6.0_07 | Generic | Java Home Page | /usr/bin/javac | B |
| PGI Compilers | C and Fortran Compilers | 12.10 | Licensed (flexlm) | PGI Compilers Home Page | /opt/pgi | B |
| Intel Compilers | C and Fortran Compilers | 2013.1.117 | Licensed (flexlm) | Intel Compilers Home Page | /opt/intel | B |
Users can install software in their home directories. User Support will also install software on request in a beta location. If interest is shared with other users, requested installations can become part of the core software repository. Please post your requests to the TSCC Mailing List.
The TSCC condo cost structure is based on condo participants purchasing their nodes, paying a one-time infrastructure fee for their pro rata share of the common networking and storage infrastructure, and then paying a modest annual operating expense that is supplemented by the campus RCI program. Pay-as-you-go hotel users purchase cycles that reflect the total cost-of-ownership, albeit leveraging the economies of scale afforded by TSCC.
For pay-as-you-go users, the cost for the general computing hotel nodes is $0.025 per SU, and the cost for the PDAF high-memory nodes is $0.025/SU but with a 16-core/256GB minimum per job. Hotel purchases carry a minimum purchase of $250.
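As a rough illustration of the hotel rates, the sketch below estimates what a job costs and how many SUs a purchase buys. The names are illustrative only. The 16-core minimum for PDAF jobs is applied as stated above, the $250 minimum is treated as applying to each purchase, and no other fees or rounding rules are modeled.

```python
HOTEL_RATE_PER_SU = 0.025   # USD per SU (1 SU = 1 core-hour)
MINIMUM_PURCHASE = 250.00   # USD, minimum hotel purchase
PDAF_MIN_CORES = 16         # minimum cores charged per PDAF job


def hotel_job_cost(cores, hours, pdaf=False):
    """Estimate the cost in USD of one hotel job."""
    if cores < 1 or hours < 0:
        raise ValueError("cores must be >= 1 and hours must be >= 0")
    charged_cores = max(cores, PDAF_MIN_CORES) if pdaf else cores
    return charged_cores * hours * HOTEL_RATE_PER_SU


def sus_for_purchase(dollars):
    """Return how many SUs a hotel purchase of the given size buys."""
    if dollars < MINIMUM_PURCHASE:
        raise ValueError("hotel purchases carry a $250 minimum")
    return dollars / HOTEL_RATE_PER_SU


print(hotel_job_cost(16, 10))            # one full general node for 10 hours
print(hotel_job_cost(4, 10, pdaf=True))  # small PDAF job, charged as 16 cores
print(sus_for_purchase(250))             # SUs bought by the minimum purchase
```

Under these assumptions the minimum purchase buys about 10,000 SUs, and a 4-core PDAF job costs the same as a 16-core one.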
For condo participants, the primary cost is purchasing the computing nodes ($3934 per node) and a one-time infrastructure fee of $920 per node to cover the costs of shared infrastructure such as interconnects, home file systems and the parallel file system. In addition, there is a modest IDC-bearing operations fee of $495/node/year, which will allow growth in operations and user services support as the size of the cluster increases. (Most of the operating costs for system administration, user support, software licensing, etc. are supplemented by the RCI program.)
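The per-node figures above can be combined into a simple budget estimate. This is a sketch with illustrative names. It excludes the indirect costs (IDC) applied to the operations fee, because no IDC rate is given here, and it assumes prices stay fixed over the period even though costs are subject to annual change.

```python
NODE_PRICE = 3934          # USD per general computing node
INFRASTRUCTURE_FEE = 920   # USD per node, one-time
OPERATIONS_FEE = 495       # USD per node per year, before IDC


def condo_cost(nodes, years):
    """Return the total direct cost in USD for a condo purchase, excluding IDC."""
    if nodes < 1 or years < 0:
        raise ValueError("nodes must be >= 1 and years must be >= 0")
    one_time = nodes * (NODE_PRICE + INFRASTRUCTURE_FEE)
    recurring = nodes * years * OPERATIONS_FEE
    return one_time + recurring


# Eight nodes kept for the three-year warranty period.
print(condo_cost(8, 3))
```

For eight nodes over three years this comes to $50,712 before IDC: $38,832 in one-time costs plus $11,880 in operations fees.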
Condo purchasers may purchase only general computing nodes at this time. A separate price will be determined for GPU nodes in the coming weeks. The current price structure for the condo expenses per node follows (costs and configurations are subject to change annually). The operations fee is supplemented by the RCI program, and pays for labor, software licensing, administration hardware, and colocation fees. It is anticipated that neither the node purchase nor the one-time infrastructure fee will bear indirect costs, while the annual operating fee will bear applicable IDC.
The TSCC is available to researchers from other UC campuses, other educational institutions and industry. Costs are competitive but higher than those cited above for UCSD researchers, because the rates for UCSD researchers are supplemented by the UCSD RCI program. Please contact us for information on the rate structure for your organization.