Amazon SageMaker HyperPod expands support to G7e and r5d.16xlarge instances

Amazon SageMaker HyperPod now supports G7e and r5d.16xlarge instances, expanding the instance options available for training and deploying large-scale AI/ML models. G7e instances deliver higher inference performance, while r5d.16xlarge is suited to memory-intensive tasks.

Amazon SageMaker HyperPod now adds support for G7e and r5d.16xlarge instances. Purpose-built for developing, training, and deploying large-scale foundation models, SageMaker HyperPod provides a robust and efficient environment with built-in fault tolerance, automated cluster recovery, and optimized distributed training libraries, significantly reducing the complexity of managing large AI/ML infrastructure.

G7e instances, equipped with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, deliver up to 2.3 times better inference performance than the previous-generation G6e instances, letting you serve more requests per second at lower latency. With up to 768 GB of total GPU memory, they can host larger language models or run multiple models on a single endpoint. G7e instances are well suited to deploying large language models (LLMs), agentic AI, multimodal generative AI, and physical AI models. They are also cost-effective for single-node fine-tuning or training of NLP, computer vision, and smaller generative AI models, offering up to 1.27 times the TFLOPs and up to four times the GPU-to-GPU bandwidth of G6e instances.

SageMaker HyperPod also now supports the r5d.16xlarge instance, which provides 64 vCPUs, 512 GB of memory, and 4 x 600 GB of NVMe SSD instance storage, powered by Intel Xeon Platinum 8000 series processors with a sustained all-core turbo frequency of up to 3.1 GHz. This configuration is well suited to distributed training data preprocessing (for example with frameworks such as Ray), large-scale feature engineering, and running memory-intensive orchestration services alongside GPU compute.
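As a rough sketch of how the two instance types could be combined in one cluster, the snippet below assembles a `CreateCluster` request for the SageMaker API using boto3, pairing a G7e GPU group with an r5d.16xlarge group for preprocessing. The G7e size name (`ml.g7e.48xlarge`), group names, S3 lifecycle-script location, and role ARN are illustrative assumptions, not values confirmed by this announcement.

```python
def build_cluster_request(cluster_name: str, role_arn: str) -> dict:
    """Assemble a SageMaker CreateCluster request (as a plain dict) that
    pairs a G7e GPU instance group with an r5d.16xlarge CPU group.
    Instance type names and lifecycle paths are assumptions for illustration."""
    lifecycle = {
        # Hypothetical S3 prefix holding the HyperPod lifecycle scripts.
        "SourceS3Uri": "s3://my-bucket/hyperpod/lifecycle/",
        "OnCreate": "on_create.sh",
    }
    return {
        "ClusterName": cluster_name,
        "InstanceGroups": [
            {
                # GPU workers for inference or single-node fine-tuning.
                "InstanceGroupName": "gpu-workers",
                "InstanceType": "ml.g7e.48xlarge",  # assumed size name
                "InstanceCount": 2,
                "LifeCycleConfig": lifecycle,
                "ExecutionRole": role_arn,
            },
            {
                # Memory-heavy group for data preprocessing and orchestration.
                "InstanceGroupName": "cpu-preprocessing",
                "InstanceType": "ml.r5d.16xlarge",
                "InstanceCount": 1,
                "LifeCycleConfig": lifecycle,
                "ExecutionRole": role_arn,
            },
        ],
    }
```

Given valid AWS credentials, quota, and a real IAM role, the request could then be submitted with `boto3.client("sagemaker").create_cluster(**build_cluster_request(...))`.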

G7e instances are available in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Asia Pacific (Tokyo) regions. The r5d.16xlarge instances are available in all regions where Amazon SageMaker HyperPod is offered.