Amazon SageMaker inference endpoints now compatible with OpenAI APIs
Amazon SageMaker Inference now supports APIs compatible with OpenAI, allowing seamless integration with existing tools by simply changing an endpoint URL. This feature is available in multiple global regions.
Amazon SageMaker Inference has introduced support for APIs compatible with OpenAI, enabling users to seamlessly connect to SageMaker endpoints using familiar tools and frameworks such as the OpenAI SDK, LangChain, and Strands Agents. This integration is straightforward, requiring only a change in the endpoint URL without the need for custom integration code, SDK wrappers, or rewrites.
With this update, users do not need to modify their API format or authentication methods. By simply updating the endpoint URL, existing SDK calls, streaming processes, and framework integrations will continue to function without interruption. This enhancement allows users to select their preferred GPU instances, maintain data within their own Virtual Private Cloud (VPC), operate any open-source or fine-tuned model, and utilize auto-scaling policies tailored to their specific workloads.
Authentication is efficiently managed through existing AWS credentials, which include automatic token refresh, minimizing additional management tasks in production environments. This feature is currently available in several regions, including US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Mumbai), Asia Pacific (Jakarta), Europe (Ireland), Europe (Frankfurt), South America (São Paulo), Asia Pacific (Tokyo), Asia Pacific (Seoul), Europe (London), Asia Pacific (Singapore), Asia Pacific (Sydney), and Canada (Central).
For further details and to begin utilizing this capability, interested users can consult the launch blog or explore the SageMaker Inference documentation.