Microsoft's new H200 v5 series VMs for Azure aim to supercharge GPU performance
Date:
Mon, 07 Oct 2024 19:26:00 +0000
Description:
New VM series drastically improves performance and cost efficiency for LLM inferencing
FULL STORY ======================================================================
Microsoft has announced the launch of new Azure virtual machines (VMs) aimed specifically at ramping up cloud-based AI supercomputing capabilities.
The new H200 v5 series VMs are now generally available for Azure customers
and will enable enterprises to contend with increasingly cumbersome AI workload demands.
Harnessing the new VM series, users can supercharge foundation model training and inferencing capabilities, the tech giant revealed.
Scale, efficiency and performance
In a blog post, Microsoft said the new VM series is already being put to use by a raft of customers and partners to drive AI capabilities.
"The scale, efficiency, and enhanced performance of our ND H200 v5 VMs are already driving adoption from customers and Microsoft AI services, such as Azure Machine Learning and Azure OpenAI Service," the company said.
Among these is OpenAI, which is harnessing the new VM series to drive research and development and fine-tune ChatGPT for users, according to Trevor Cai, OpenAI's head of infrastructure.
"We're excited to adopt Azure's new H200 VMs," he said. "We've seen that H200 offers improved performance with minimal porting effort, we are looking forward to using these VMs to accelerate our research, improve the ChatGPT experience, and further our mission."
Under the hood of the H200 v5 series
Azure H200 v5 VMs are "architected with Microsoft's systems approach to enhance efficiency and performance," the company said, and include eight Nvidia H200 Tensor Core GPUs.
Microsoft said this addresses a growing gap for enterprise users with regard to compute power.
"With GPUs growing in raw computational capabilities at a faster rate than attached memory and memory bandwidth, this has created a bottleneck for AI inferencing and model training," the tech giant said.
"The Azure ND H200 v5 series VMs deliver a 76% increase in High Bandwidth Memory (HBM) to 141GB and a 43% increase in HBM Bandwidth to 4.8 TB/s over the previous generation of Azure ND H100 v5 VMs," Microsoft said in its announcement.
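As a quick sanity check on those figures, the previous-generation baseline can be backed out from the quoted percentages. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the numbers quoted in the announcement, read as per-GPU values.

```python
def implied_baseline(new_value: float, pct_increase: float) -> float:
    """Return the old value implied by a new value and a percentage increase."""
    return new_value / (1 + pct_increase / 100)

# Figures quoted in the announcement for the H200-based VMs.
h200_hbm_gb = 141      # HBM capacity, GB
h200_bw_tbs = 4.8      # HBM bandwidth, TB/s

# Baseline implied for the previous generation (ND H100 v5).
prev_hbm_gb = implied_baseline(h200_hbm_gb, 76)
prev_bw_tbs = implied_baseline(h200_bw_tbs, 43)

print(f"Implied previous HBM capacity:  {prev_hbm_gb:.1f} GB")
print(f"Implied previous HBM bandwidth: {prev_bw_tbs:.2f} TB/s")
```

The implied baseline works out to roughly 80 GB and a little under 3.4 TB/s, which suggests the quoted percentages are internally consistent.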
This increase in HBM bandwidth enables GPUs to access model parameters
faster, helping reduce overall application latency, which is a critical
metric for real-time applications such as interactive agents.
Additionally, the new VM series can accommodate more complex large language models (LLMs) within the memory of a single machine, the company said. This improves performance and lets users avoid the costly overhead of running distributed applications across multiple VMs.
Better management of GPU memory for model weights and batch sizes is also a key differentiator for the new VM series, Microsoft believes.
Current GPU memory limitations directly affect throughput and latency for LLM-based inference workloads, and create additional costs for enterprises.
By drawing upon a larger HBM capacity, the H200 v5 VMs are capable of supporting larger batch sizes, which Microsoft said drastically improves GPU utilization and throughput compared to previous iterations.
"In early tests, we observed up to 35% throughput increase with ND H200 v5 VMs compared to the ND H100 v5 series for inference workloads running the LLAMA 3.1 405B model (with world size 8, input length 128, output length 8, and maximum batch sizes 32 for H100 and 96 for H200)," the company said.
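The link between HBM capacity and the maximum batch sizes in that test can be roughed out with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not Microsoft's methodology. It assumes 8-bit weights (1 byte per parameter), which the article does not state, and it ignores activations, framework overhead and the per-sequence size of the key-value cache. The 80 GB figure for the previous generation is derived from the quoted 76% increase.

```python
GPUS_PER_VM = 8            # from the article: eight GPUs per VM (world size 8)
H200_HBM_GB = 141          # from the article
H100_HBM_GB = 80           # implied by the quoted 76% increase (141 / 1.76)
PARAMS_BILLIONS = 405      # LLAMA 3.1 405B
BYTES_PER_PARAM = 1        # ASSUMPTION: 8-bit weights; not stated in the article

def headroom_gb(hbm_per_gpu_gb: float) -> float:
    """Memory left across the VM after loading model weights, in GB."""
    total_gb = hbm_per_gpu_gb * GPUS_PER_VM
    # 1e9 parameters x bytes per parameter = decimal gigabytes
    weights_gb = PARAMS_BILLIONS * BYTES_PER_PARAM
    return total_gb - weights_gb

h100_headroom = headroom_gb(H100_HBM_GB)
h200_headroom = headroom_gb(H200_HBM_GB)
headroom_ratio = h200_headroom / h100_headroom
batch_ratio = 96 / 32      # maximum batch sizes quoted for the test

print(f"H100 VM headroom: {h100_headroom:.0f} GB")
print(f"H200 VM headroom: {h200_headroom:.0f} GB")
print(f"Headroom ratio: {headroom_ratio:.2f} vs batch-size ratio: {batch_ratio:.2f}")
```

Under these assumptions the free memory grows by about three times, in line with the jump in maximum batch size from 32 to 96. With 16-bit weights the model would not fit in the previous generation's memory at all, so the estimate depends heavily on the precision assumed.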
======================================================================
Link to news story:
https://www.techradar.com/pro/microsofts-new-h200-v5-series-vms-for-azure-aim-to-supercharge-gpu-performance
--- Mystic BBS v1.12 A47 (Linux/64)
* Origin: tqwNet Technology News (1337:1/100)