Amazon SageMaker Rate Limits
Amazon SageMaker exposes a control-plane API (CreateTrainingJob, CreateEndpoint, etc.) that is subject to AWS API throttling per account and region, plus a runtime InvokeEndpoint surface whose throughput scales with the underlying instance count and instance type. Endpoint-specific quotas (concurrent invocations, payload size, timeout) are configurable. Service Quotas governs the maximum number and type of instances per account.
Amazon SageMaker Rate Limits is the machine-readable rate-limit profile for Amazon SageMaker on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 5 rate-limit definitions, with units of varies, requests_per_second, bytes, seconds, and count.
The profile also defines 4 backoff/retry policies and documents response codes for the throttled, quotaExceeded, and serviceUnavailable conditions.
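As an illustration of what a backoff/retry policy of this kind can look like on the client side, the sketch below retries a throttled call using exponential backoff with full jitter. It is not taken from the profile: ThrottledError, backoff_delays, call_with_backoff, and every numeric default are hypothetical names and values chosen for the example, and a real client would map the throttled, quotaExceeded, and serviceUnavailable responses onto the exception it retries.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttled, quotaExceeded, or serviceUnavailable response."""


def backoff_delays(max_retries=5, base=0.5, cap=8.0, rng=random.random):
    """Yield one delay per retry, drawn from [0, min(cap, base * 2**attempt)]."""
    for attempt in range(max_retries):
        # Full jitter: scale the exponential ceiling by a random factor in [0, 1).
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_backoff(call, max_retries=5, base=0.5, cap=8.0, sleep=time.sleep):
    """Run call(); on ThrottledError wait and retry, re-raising once retries run out."""
    delays = backoff_delays(max_retries, base, cap)
    while True:
        try:
            return call()
        except ThrottledError:
            delay = next(delays, None)
            if delay is None:
                # All retries used: surface the last throttling error to the caller.
                raise
            sleep(delay)
```

Passing sleep and rng as parameters keeps the sketch testable without real waiting. In practice the AWS SDKs already ship their own retry handling, so a wrapper like this would mainly matter for custom HTTP clients.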
Tagged areas include Rate Limiting, Machine Learning, and SageMaker.