The way forward for CPU- and GPU-based servers
Server chips are once again hot news, with Nvidia paying around $40bn for ARM and access to its extensive developer community. Fears that the move to 7nm production has stalled appear not to have diminished chip and server companies' relentless rollout of their latest performance systems. In pursuit of higher compute, chip wattage is rising, and within the cabinets things are getting hotter.
Thermal requirements are expanding in tightly packed data centres, whether traditional warehouse-style sites or at the Edge. It is becoming clear to many compute-dense communities, such as fintech and research and development, that current air-based cooling techniques in technology spaces cannot handle the thermal capacity required to maintain an effective environment. With its ability to deliver cooling direct to the chip, chassis-level liquid cooling is becoming the leading viable option.
Architecture advances for server-side platforms such as Intel's Purley, along with its 'Ice Lake' derivative code-named 'Whitley,' and AMD's Zen 2 platform, make keeping up with the cadence of x86-based SKUs a challenge.
The growing requirement for parallel processing architectures has expanded the market opportunities for smaller but fast-growing GPU developers. The battle is no longer only between the two chip titans, Intel and AMD. GPU-based servers are now taking the fight to the dominant architecture, with the promise of a single GPU server delivering all the processing power of multiple CPU-based units. Nvidia is the recognised market leader, and with its ARM acquisition the competition is set to intensify.
The GPU-based server market is growing. According to the Semiconductor Industry Association (SIA) (www.semiconductors.org), R&D spending reached $40bn in 2019. Of the $117bn spent last year, use in computers represented just 28.4% of the total global semiconductor market.
The stakes are clearly high, and there exists a never-ending demand for faster processing in high-performance computing, AI, IoT, and the Edge. These uses are no longer exceptional; they are becoming the norm. As they move into the centre ground, they are certain to drive up power densities from the chip to the rack.
As was reported by nextplatform.com: “Over the past decade, server CPUs have been getting hotter and hotter and top-bin parts running full bore will soon be as searing as a GPU or FPGA accelerator.”
Today, the wattage range on server microprocessors is somewhere between 165 and 225 watts. However, some already forecast this will double in the next few years. And when considering what all this power means for rack density in data centres, today's 20kW racks will triple, with HPC racks pulling 60kW. Some think even this is an underestimation.
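As a rough illustration of how per-chip wattage translates into rack density, the back-of-envelope sketch below uses assumed figures (a dual-socket 1U server with a nominal 150W for memory, storage, and fans; these are not figures from the article or any specific vendor):

```python
# Back-of-envelope rack power estimate under assumed server configurations.
def rack_power_kw(servers_per_rack: int, cpus_per_server: int,
                  cpu_watts: float, other_watts: float) -> float:
    """Total rack draw in kW for a rack of identical servers."""
    per_server = cpus_per_server * cpu_watts + other_watts
    return servers_per_rack * per_server / 1000.0

# A rack of 33 dual-socket servers at today's 225W top-bin parts:
today = rack_power_kw(servers_per_rack=33, cpus_per_server=2,
                      cpu_watts=225, other_watts=150)   # ~19.8kW

# The same rack if CPU wattage doubles, as some forecast:
future = rack_power_kw(servers_per_rack=33, cpus_per_server=2,
                       cpu_watts=450, other_watts=150)  # ~34.7kW

print(f"today ~{today:.1f}kW, future ~{future:.1f}kW per rack")
```

Even before accounting for GPUs or accelerators, doubling CPU wattage alone pushes a fully populated rack well past the capacity that room-level air cooling is typically designed for.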
More watts mean more heat to be removed, and removing that heat is now reaching an inflection point. Can air-cooled environments efficiently and effectively cool the increasingly powerful servers being deployed in mainstream applications to support growing global data traffic? Blowing cool air onto a processor drawing 400 watts is simply not going to do the job. Those familiar with data centre layout are quick to see the hazard of hot spots and heat spikes emerging, quickly followed by diminishing server performance or, in the worst-case scenario, tripping the server power and causing an outage. According to the Uptime Institute's latest survey (https://uptimeinstitute.com/2020-data-center-industry-survey-results), outages are getting bigger and more frequent.
Additional server fans, or even fan trays within the cabinet, divert power from the processors to cooling, and in performance-critical systems may reduce the number of servers available in the rack in order to maintain the cooling capacity of the room.
A more efficient and effective cooling solution is needed, and liquid cooling is the compelling option.

Chassis-level liquid cooling for GPU servers cuts cooling energy at high densities. Iceotope has demonstrated how a high-powered 3U GPU server can be cooled via chassis-based, immersive liquid cooling. The system is configured with two Cascade Lake Xeon 8276 processors (28 cores each), 128GB of 2933 ECC RDIMM memory, and six Nvidia Tesla V100 GPUs. The chassis-based immersive cooling allows for near-silent operation, with over 95% of the heat removed via fluid.
Iceotope's patented solutions are leading the advances in precision delivery of chassis-level liquid cooling for all server types. They improve data centre efficiency and PUE by making a higher proportion of data centre energy available to the computing load. And by reducing cooling energy consumption and using less water than traditional solutions, they take a major stride towards more environmentally friendly operations.
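PUE (Power Usage Effectiveness) is the ratio of total facility energy to the energy delivered to IT equipment, with 1.0 as the theoretical ideal. The sketch below shows the arithmetic with purely illustrative figures (the loads and overheads are assumptions, not Iceotope data):

```python
# PUE = total facility energy / IT equipment energy; 1.0 is the ideal.
def pue(it_kw: float, overhead_kw: float) -> float:
    """Power Usage Effectiveness for a given IT load and cooling/overhead load."""
    return (it_kw + overhead_kw) / it_kw

# Assumed example: a 1MW IT load with 500kW of cooling and other overhead,
# versus the same load where more efficient cooling cuts overhead to 100kW.
air_cooled = pue(it_kw=1000, overhead_kw=500)     # 1.5
liquid_cooled = pue(it_kw=1000, overhead_kw=100)  # 1.1

print(air_cooled, liquid_cooled)
```

The point of the metric: at the same total facility power, a lower PUE means more of the energy bill is doing computing rather than moving air or chilling water.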
Today's hot-running CPUs and GPUs have already pushed air cooling to its absolute limit. The future of cool servers and racks is liquid.