Lamini Rate Limits
The Lamini Platform meters inference and tuning usage per account. Usage is governed primarily by available credit (spend) on the On-Demand tier and by reserved GPU capacity on the Enterprise tier. Concurrent inference requests and tuning jobs are bounded by account capacity rather than by fixed, published per-minute request quotas. Specific numeric limits are not publicly documented and are not reconciled in this artifact.
Lamini Rate Limits is the machine-readable rate-limit profile for Lamini on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 4 rate-limit definitions, measured in requests, jobs, steps, and USD.
The profile also defines 2 backoff/retry policies and documents the response codes returned for throttled requests.
Tagged areas include AI, LLM, Fine-Tuning, Memory Tuning, and Inference.
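The backoff/retry policies mentioned above are not spelled out in this summary, so the following is only a minimal client-side sketch of one common approach: exponential backoff with full jitter. The status codes treated as throttling (429 and 503), the delay values, and the names `call_with_backoff`, `backoff_delay`, and `send` are all assumptions for illustration; none of them come from the Lamini profile or the Lamini client library.

```python
import random
import time
from typing import Callable, Tuple

# Assumption: 429 and 503 are common throttling conventions, not values
# taken from the Lamini rate-limit profile.
RETRYABLE_STATUS = {429, 503}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Upper bound in seconds for the wait before retry `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-throttled status or attempts run out.

    `send` is a hypothetical callable that performs one request and returns
    (status_code, body). The last response is returned either way.
    """
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE_STATUS:
            break
        # Full jitter: wait a random fraction of the exponential bound.
        sleep(rng() * backoff_delay(attempt, base, cap))
        status, body = send()
    return status, body
```

Because `sleep` and `rng` are injectable, the retry schedule can be exercised without real waiting or network calls. Since limits here are tied to account capacity and spend rather than per-minute quotas, a caller may also want to stop retrying once credit is exhausted, a case this sketch does not detect.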