HiPerGator


The University of Florida supercomputer is a cluster that includes the latest generation of processors and offers nodes for memory-intensive computation. HiPerGator’s high-performance storage systems can be accessed through a variety of interfaces, including Globus, UFApps for Research, and other tools.

UFIT Research Computing maintains the cluster and its many parts, allowing researchers to focus on their research instead of hardware and software maintenance. UFIT Research Computing supports a significant number of widely used applications. Our staff is happy to evaluate and explore additional applications for UF’s research needs.

OVERVIEW

HiPerGator can be used by UF faculty and by faculty at colleges and universities in Florida for teaching and research using these options and procedures:

  1. For teaching a class, allocations are free and they last one semester. See classroom support for detailed instructions.
  2. For research, allocations can be purchased for periods ranging from three months to several years. The rates are listed on our price sheets.
  3. A three-month trial allocation at no cost may be requested to develop a course before teaching it and to explore the use of HPC for research. File a trial allocation request. After the trial ends, please work with UFIT Research Computing staff to find the best way forward for continuing use of HiPerGator.
  4. Colleges and departments can request a free three-month trial allocation, shared among all faculty in the unit, so that faculty can learn about HiPerGator capabilities and prepare to include HPC in their courses at no cost to individual faculty.

HiPerGator operations and infrastructure have run successfully on this model since 2013, with significant investment from the provost, the VP for research, and the CIO.

HiPerGator Configuration

Phase           Year       Cores                          RAM/core
HiPerGator AI   June 2025  63 NVIDIA DGX B200             18 GB
                           180 GB RAM per GPU
HiPerGator 4.0  June 2025  19,200 AMD EPYC 9655P Turin    8 GB
                           2.5 GHz (4.5 GHz Boost) Cores
                           600 NVIDIA L4 GPGPU
HiPerGator 3.0  Jan 2021   30,720 AMD EPYC 7702 Rome      8 GB
                           2.0 GHz Cores
HiPerGator 3.0  May 2021   9,600 AMD EPYC 75F3 Milan      8 GB
                           3.0 GHz Cores

Total of 70,320 cores
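As a quick arithmetic check on the configuration listed above, the following Python sketch sums the CPU core counts itemized for HiPerGator 4.0 and 3.0 and compares them with the stated total. The dictionary keys are illustrative labels chosen for this sketch, not official names, and the HiPerGator AI row is left out because the table gives no CPU core count for it.

```python
# CPU core counts copied from the configuration table above.
# Keys are illustrative labels chosen for this sketch.
listed_cores = {
    "HiPerGator 4.0 (Turin, June 2025)": 19_200,
    "HiPerGator 3.0 (Rome, Jan 2021)": 30_720,
    "HiPerGator 3.0 (Milan, May 2021)": 9_600,
}

stated_total = 70_320  # "Total of 70,320 cores" as printed on the page

listed_sum = sum(listed_cores.values())
difference = stated_total - listed_sum

print(f"Sum of listed rows: {listed_sum:,}")
print(f"Stated total:       {stated_total:,}")
print(f"Difference:         {difference:,}")
```

The listed rows sum to 59,520, which is consistent with the "approximately 60,000 cores" cited for HiPerGator 4th Gen in the updates further down. The stated total equals the two HiPerGator 3.0 rows plus the 30,000 HiPerGator 2nd Gen cores mentioned in the March 2025 update (30,720 + 9,600 + 30,000 = 70,320), so it may predate the 4.0 row. That reading is an inference from the numbers, not something the page states.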

  • HiPerGator 3.0 has:
    • 608 NVIDIA RTX 2080 Ti and RTX 6000 GPUs
    • 4 Petabytes (PB) of Blue fast storage

HiPerGator AI NVIDIA DGX A100 SuperPod

Cluster Information

  • 140 NVIDIA DGX A100 nodes
  • 17,920 AMD Rome cores
  • 1,120 NVIDIA Ampere A100 GPUs
  • 2.5 PB All-Flash storage
  • Over 200 HDR Infiniband and various Ethernet switches for connectivity
  • Double precision LinPack (HPL): 17.2 Petaflops
    • TOP500 June 2021: Ranked #22
    • Green500 June 2021: Ranked #2
  • AI Floating Point Operations: 0.7 Exaflops

Node Information

  • 2x AMD EPYC 7742 (Rome) 64-core processors with Simultaneous Multi-Threading (SMT) enabled, presenting 256 logical cores per node
  • 2TB RAM
  • 8x NVIDIA A100 80GB Tensor Core GPUs
  • NVSwitch technology that integrates all 8 A100 GPUs in a system with a unified memory space
  • 10x HDR IB non-blocking interfaces for inter-node communication
  • 2x 100GbE ethernet interfaces
  • 28TB NVMe local storage
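The cluster-level figures follow from the per-node figures, which a short Python sketch can confirm. All inputs are copied from the two lists above; the only added assumption is that SMT exposes two hardware threads per physical core, which is what makes 2 x 64 cores appear as 256. The per-GPU values are simple divisions of the reported benchmark totals, not separately measured numbers.

```python
# Figures copied from the Cluster Information and Node Information lists.
nodes = 140
cpus_per_node = 2        # 2x AMD EPYC 7742 (Rome)
cores_per_cpu = 64
threads_per_core = 2     # assumption: SMT exposes two threads per core
gpus_per_node = 8        # 8x NVIDIA A100 80GB

physical_cores_per_node = cpus_per_node * cores_per_cpu
logical_cores_per_node = physical_cores_per_node * threads_per_core
total_physical_cores = nodes * physical_cores_per_node
total_gpus = nodes * gpus_per_node

# Per-GPU share of the reported benchmark totals, in teraflops.
hpl_flops = 17.2e15      # double precision LinPack (HPL): 17.2 petaflops
ai_flops = 0.7e18        # AI floating point operations: 0.7 exaflops
hpl_tflops_per_gpu = hpl_flops / total_gpus / 1e12
ai_tflops_per_gpu = ai_flops / total_gpus / 1e12

print(f"Logical cores per node: {logical_cores_per_node}")
print(f"Physical cores, all nodes: {total_physical_cores:,}")
print(f"GPUs, all nodes: {total_gpus:,}")
print(f"HPL per GPU: {hpl_tflops_per_gpu:.2f} TFLOPS")
print(f"AI per GPU:  {ai_tflops_per_gpu:.0f} TFLOPS")
```

The results match the listed 256 cores per node, 17,920 AMD Rome cores, and 1,120 A100 GPUs, which shows that the 17,920 figure counts physical cores while the per-node figure counts SMT threads.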

NVIDIA's Ampere A100 GPU technology arrived at UF in the form of an NVIDIA SuperPOD. UF was the first university in the world to work with this technology. Visit the UF Artificial Intelligence Initiative website for more information.

The A100 technical specifications can be found on the NVIDIA A100 website, on the NVIDIA DGX A100 page, and on the NVIDIA Ampere developer blog.

For A100 benchmarking results, please see the HPCWire report.

June 2025 Update

UF received the first delivery of a DGX B200 SU in the world!

NVIDIA DGX B200 SuperPOD SU1

The full NVIDIA DGX B200 SuperPOD will consist of two scalable units (SUs). SU1 was delivered at the end of January and has undergone extensive testing since the installation was completed in early April. UFIT Research Computing is finalizing benchmarks and ensuring the system is ready for users.

The current plan is for SU1, including 31 servers and 248 B200 GPUs, to come online for users sometime next week. However, until the system is in full production, the B200 servers will run under an “early access” service level.

  • Maintenance on servers will be more frequent. While staff will try to avoid stopping running jobs, occasionally taking some or all servers offline may be necessary. We will not be able to announce all maintenance in advance.
  • Slurm scheduler settings for GPU use and priorities may change between early access and the final production state.
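Because scheduler settings may change between early access and production, it can help to keep them in one place rather than hard-coding them in every job script. The Python sketch below renders a Slurm batch script from a small settings object. It is a minimal illustration, not UFIT guidance: the partition name `example-gpu`, the GPU type label `b200`, and all resource defaults are hypothetical placeholders, and `GpuJobSettings` and `render_sbatch` are names invented for this example. Only standard `sbatch` options (`--job-name`, `--gres`, `--cpus-per-task`, `--mem`, `--time`, `--partition`) are used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GpuJobSettings:
    """Scheduler settings that may change; every default is a placeholder."""
    gpus: int = 1
    gpu_type: Optional[str] = None    # hypothetical type label, e.g. "b200"
    partition: Optional[str] = None   # hypothetical partition name
    cpus_per_task: int = 4
    mem: str = "32G"
    time_limit: str = "01:00:00"


def render_sbatch(job_name: str, command: str, settings: GpuJobSettings) -> str:
    """Return the text of a Slurm batch script for one GPU job."""
    # --gres takes "gpu:<count>" or "gpu:<type>:<count>".
    if settings.gpu_type:
        gres = f"gpu:{settings.gpu_type}:{settings.gpus}"
    else:
        gres = f"gpu:{settings.gpus}"

    directives = [
        f"--job-name={job_name}",
        f"--gres={gres}",
        f"--cpus-per-task={settings.cpus_per_task}",
        f"--mem={settings.mem}",
        f"--time={settings.time_limit}",
    ]
    if settings.partition:
        directives.append(f"--partition={settings.partition}")

    # All #SBATCH lines must come before the first command in the script.
    lines = ["#!/bin/bash"]
    lines += [f"#SBATCH {d}" for d in directives]
    lines += ["", command]
    return "\n".join(lines) + "\n"


# Placeholder values; replace them with the settings announced for the system.
early_access = GpuJobSettings(gpus=2, gpu_type="b200", partition="example-gpu")
script = render_sbatch("train-model", "python train.py", early_access)
print(script)
```

If the GPU type label or partition changes when the system enters production, only the `GpuJobSettings` instance needs updating; the generated scripts follow.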

New CPU/GPU servers

Another part of HiPerGator 4th Gen is the replacement of HiPerGator 2 servers and 2080 Ti and RTX 6000 GPUs after many years of service. Our vendor, Lenovo, has delivered our new system with 19,200 cores and 600 NVIDIA L4 GPUs. UFIT Research Computing staff are getting these servers connected, burned in, and ready for users. These resources will deliver refreshed CPU resources and a much more capable GPU to take on many common AI workloads and provide graphics capabilities. The current plan is for these servers to come online for users by the end of June.  

Decommissioning the A100 and 2080 Ti servers    

Part of the cost of the HiPerGator 4th Gen upgrades was offset by trading in the older hardware. We are working with our vendors to coordinate dates that align with bringing new hardware online with minimal disruption of service availability. However, depending on the dates vendors pick up the components, we may have some periods with limited GPU availability.

The current plan is for the 2080 Ti and RTX 6000 GPUs to remain in production until the L4 GPUs come online towards the end of June. The vendors may want these sooner, reducing GPU availability until we can get the L4s into production. The remaining A100s will be removed from service on June 24, making space for the delivery of SU2 of the NVIDIA DGX B200 SuperPOD.

NVIDIA DGX B200 SuperPOD SU2

UFIT Research Computing anticipates the delivery of the second SU in early July. After installation, the system will be validated and hopefully turned over to our staff by the end of July. In August, we will need to combine and benchmark the full system with both SUs. That will require all B200 servers to be removed from user access, leaving only the L4s accessible to users during much of August.

Blue Storage Replacement

Our storage vendor, DDN, is finalizing the configuration of the new all-flash 11 PB Blue storage system that will replace the current 7.2 PB Blue storage. Once received, UFIT Research Computing staff will update users with more details of the transition plan to migrate data to the new system while minimizing interruption of system availability.

HiPerGator 4th Gen Production

We are still on track to have the full HiPerGator 4th Gen ready in early September, with approximately 60,000 cores, 600 L4 GPUs, 504 B200 GPUs, and an 11 PB all-flash, high-performance, parallel file system. This system will continue to cement UF’s status as a leader in high-performance computing and AI capabilities.
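The B200 figures quoted in this update can be cross-checked with a few lines of Python. The inputs are the SU1 counts (31 servers, 248 GPUs), the 63 DGX B200 systems in the configuration table, and the 504 B200 GPUs planned for the full system. The SU2 values are derived by subtraction; the page does not state them, so treat them as an inference.

```python
# Counts quoted on this page.
su1_servers = 31
su1_gpus = 248
total_b200_servers = 63   # "63 NVIDIA DGX B200" in the configuration table
total_b200_gpus = 504     # planned for the full HiPerGator 4th Gen

# GPUs per server implied by the SU1 figures.
gpus_per_server, remainder = divmod(su1_gpus, su1_servers)
assert remainder == 0, "SU1 GPU count is not a whole multiple of servers"

# SU2 is not itemized on the page; derive it by subtraction.
su2_servers = total_b200_servers - su1_servers
su2_gpus = total_b200_gpus - su1_gpus

print(f"GPUs per DGX B200 server: {gpus_per_server}")
print(f"Inferred SU2 size: {su2_servers} servers, {su2_gpus} GPUs")
print(f"Totals consistent: {total_b200_servers * gpus_per_server == total_b200_gpus}")
```

The figures are mutually consistent: 8 GPUs per server across 63 servers gives 504 GPUs, which implies 32 servers and 256 GPUs for SU2.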

March 2025 Update

The UF Information Technology (UFIT) Data Center has been an especially busy place for the past few months! On Jan. 22, two semi-trucks made it through the snow and ice that blanketed the South to deliver the first NVIDIA DGX B200 servers anywhere to the University of Florida. Since then, teams from UFIT, Mark III, and NVIDIA have been working to install and certify the first of two NVIDIA DGX B200 SuperPOD scalable units (SUs). This is the first of several steps before the system can be put into production.

UFIT expects Lenovo to deliver a 19,000-core system with 600 NVIDIA L4 GPUs in April to replace 30,000 CPU cores from HiPerGator 2nd Gen (2015) and 600 NVIDIA 2080 Ti GPUs from 2019. DataDirect Networks (DDN) will also deliver a new storage system to replace the Blue storage with 11 PB of all-flash storage. UFIT Research Computing staff will continue to install the SuperPOD and configure the new CPU cores, GPUs, and Blue storage through July.

Image: Dr. Deumens installs an NVIDIA B200 system.

UFIT estimates that users will begin to have some access to the new systems in July. There will be times when the systems will not be available due to maintenance and full-system testing. Around that time, the remaining A100 DGX servers will be removed to make room for the second DGX B200 SuperPOD SU delivery and installation.

As the new Blue storage becomes ready for production, UFIT will migrate data from the current Blue storage to the new system, possibly requiring some periods of downtime in July and August.

Our goal is to have the second SU of the B200 SuperPOD ready for production by the end of August. At that point, UFIT will need to use the whole system to run benchmarks and ensure all components are ready for production.

By September 2025, HiPerGator 4th Gen is planned to be ready for production with approximately 60,000 cores, 600 L4 GPUs, 504 B200 GPUs, and an 11 PB all-flash high-performance parallel file system. This system will continue to cement UF’s status as a leader in high-performance computing and AI capabilities.