SageMaker Training Plans can now extend existing capacity commitments

Amazon SageMaker AI now allows users to extend their GPU capacity commitments for Training Plans, ensuring uninterrupted access for prolonged AI workloads without reconfiguration.

Amazon SageMaker AI has added an extension option to Training Plans, letting users lengthen an existing reserved GPU capacity commitment without reconfiguring their workloads. The update is aimed at AI workloads that need more time than originally planned, so that they keep access to the capacity already reserved for them.

SageMaker Training Plans let users reserve GPU capacity in clusters of up to 64 instances for specific time periods. With the new extension capability, users can prolong a plan in 1-day increments for up to 14 days, or in 7-day increments for up to 182 days (26 weeks). Extensions can be purchased through the SageMaker console or the API.
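The increment rules above can be sketched as a small pre-check before requesting an extension. This is only an illustration under stated assumptions: it reads the rules as "any whole number of days from 1 to 14, or a multiple of 7 days up to 182", and it assumes the extension begins exactly when the current plan ends. The function and constant names are hypothetical and are not part of any SageMaker API.

```python
from datetime import datetime, timedelta

# Limits taken from the announcement; the names are illustrative only.
MAX_DAILY_EXTENSION_DAYS = 14    # 1-day increments allowed up to here
WEEKLY_STEP_DAYS = 7             # longer extensions move in 7-day steps
MAX_WEEKLY_EXTENSION_DAYS = 182  # 26 weeks


def is_valid_extension(days: int) -> bool:
    """Return True if `days` fits the assumed reading of the increment rules."""
    if days < 1:
        return False
    if days <= MAX_DAILY_EXTENSION_DAYS:
        return True  # any whole number of days from 1 to 14
    # Beyond 14 days, only multiples of 7 up to 182 are accepted.
    return days <= MAX_WEEKLY_EXTENSION_DAYS and days % WEEKLY_STEP_DAYS == 0


def extended_end_time(current_end: datetime, days: int) -> datetime:
    """Compute the new end time, assuming the extension starts at `current_end`."""
    if not is_valid_extension(days):
        raise ValueError(f"{days} days is not an allowed extension length")
    return current_end + timedelta(days=days)
```

For example, `is_valid_extension(10)` and `is_valid_extension(21)` both pass under this reading, while `is_valid_extension(15)` does not, because 15 exceeds the daily range and is not a multiple of 7. The console or API remains the authority on which durations are actually offered.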

Once an extension is purchased, the AI workload keeps running on the reserved capacity with no manual reconfiguration. Projects therefore continue without interruption, in line with SageMaker AI’s goal of delivering cost-effective training plans that fit within users’ timelines and budgets.

After creating and purchasing a training plan, SageMaker automatically allocates the necessary infrastructure and executes the AI workloads on the allocated compute resources, eliminating the need for manual intervention. For detailed information on instance availability across different AWS Regions, users are encouraged to refer to the SageMaker AI pricing page.

For more details on how to extend training plans, users can consult the Amazon SageMaker Training Plans User Guide.