RACS - Donations

Supermicro Liquid Cooled GPU Workstation

The Supermicro Liquid Cooled GPU Workstation is an impressive piece of equipment that helps users overcome computing limits, relocate processing, and bring capabilities back on premises. At the same time, this hardware gives the world a look at what proper liquid cooling can do for the performance of the hardware technologies we are already purchasing. The Research and Academic Computing Service (RACS) group here in the College of Earth, Oceans and Atmospheric Sciences (CEOAS) has worked closely with Mark III Systems, NVIDIA and Supermicro to test this new hardware in real-world research work and see how it holds up when processing big data related to managing our world and climate change.

Through the testing, it became clear to the RACS group that the liquid cooled workstation addressed several major areas of need for our research work.

  • Edge computing was a major reason for testing this hardware, and the RACS group worked with the US Forest Service (USFS) to test edge processing in the Northwest forests of the United States. The full blog can be found on the Mark III Systems page under "Liquid Cooled Workstation for Large-Scale Edge Data Science". This machine allowed the USFS to train and run large models at the edge, reducing data ingest and increasing visibility.
  • Groups using the cloud have reduced or removed their on-premises server rooms and infrastructure, making it hard to bring back high-end technologies to save on cloud costs. GPU machines currently have high costs or limited availability on cloud providers. A machine that runs on standard power and has cooling needs that a normal office space can meet allows groups to bring workloads back on premises without the need for that infrastructure. We found that the liquid cooled machine could train the same large models while processing larger amounts of data than machines that cost much more and were less configurable.
  • Cost to performance - The workstation offers high performance similar to a DGX-type system at a lower cost, with a better cost model for development (2+2 NVLink). We have published a blog with Mark III Systems about the increased performance when using liquid cooling, entitled “Liquid Cooled Workstation: Advantageous for Big Ocean Simulations, Exceptional for Small Tasks”.
  • Tools that need both graphical and non-graphical interaction worked very well on the liquid cooled workstation. Researchers often use tools like Jupyter or RStudio, which provide a semi-graphical interaction layer on top of non-graphical processing. Because the system has direct graphical output, it served well for data analysis that includes visual output such as plots or dynamic simulations.

This system provided our group with a crucial look at liquid cooling, since we have used standard air cooling in our server rooms and computing hardware for as long as we have done computational research. Liquid cooling would be an important step forward for computational science that runs processors and accelerators over long periods of time to answer scientific questions. In general, future server farms and infrastructures will need liquid cooling to increase performance within the same physical spaces while recovering heat that can be used for other energy purposes. We see Supermicro as a leader in the liquid cooling space, as they have a very large portfolio of liquid cooled servers and rack solutions. This will be important as we move to the new NVIDIA Grace Blackwell technology, which requires liquid cooling.