Rate Limits

Per-endpoint rate limits, response headers, and retry guidance for both APIs.

Every Amigo API endpoint enforces a per-organization rate limit. When you exceed the limit, the API returns HTTP 429 Too Many Requests with a Retry-After header indicating how many seconds to wait before retrying.

Both APIs - Rate limits apply to both Classic API and Platform API endpoints.

Overview

| Header | Description |
| --- | --- |
| `X-RateLimit-Limit` | Maximum requests allowed in the current window |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | UTC epoch seconds when the window resets |
| `Retry-After` | Seconds to wait before retrying (present on 429 responses) |
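As a minimal sketch of how a client might read the headers listed above, the following Python helper collects them into a small record. The names `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical and are not part of either SDK; the helper only assumes the header names and value formats described in the table.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class RateLimitInfo:
    """Rate-limit state taken from one response (hypothetical helper type)."""
    limit: Optional[int]          # X-RateLimit-Limit
    remaining: Optional[int]      # X-RateLimit-Remaining
    reset_epoch: Optional[int]    # X-RateLimit-Reset, UTC epoch seconds
    retry_after: Optional[float]  # Retry-After, seconds (429 responses only)


def _to_number(value: Optional[str], cast: Callable):
    """Convert a header value, returning None if it is missing or malformed."""
    if value is None:
        return None
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        limit=_to_number(lowered.get("x-ratelimit-limit"), int),
        remaining=_to_number(lowered.get("x-ratelimit-remaining"), int),
        reset_epoch=_to_number(lowered.get("x-ratelimit-reset"), int),
        retry_after=_to_number(lowered.get("retry-after"), float),
    )
```

Any mapping of header names to string values can be passed in, for example the `headers` attribute of a response object from an HTTP client library.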

Rate Limits by Endpoint

Organization

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/organization/` | PUT | 5/min |
| `/v1/{org}/organization/` | POST | 10/min |
| `/v1/{org}/organization/` | DELETE | 5/min |
| `/v1/{org}/organization/user_dimensions/` | GET | 20/min |

Agent

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/organization/agent` | POST | 10/min |
| `/v1/{org}/organization/agent` | GET | 20/min |
| `/v1/{org}/organization/agent/{agent_id}/` | POST | 10/min |
| `/v1/{org}/organization/agent/{agent_id}/` | DELETE | 10/min |
| `/v1/{org}/organization/agent/{agent_id}/version` | GET | 20/min |

API Keys

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/organization/api_key/` | POST | 5/min |
| `/v1/{org}/organization/api_key/` | GET | 10/min |
| `/v1/{org}/organization/api_key/{api_key_id}/` | DELETE | 10/min |

Context Graphs (Service Hierarchical State Machines)

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/organization/service_hierarchical_state_machine` | POST | 10/min |
| `/v1/{org}/organization/service_hierarchical_state_machine` | GET | 20/min |
| `/v1/{org}/organization/service_hierarchical_state_machine/{id}/` | POST | 10/min |
| `/v1/{org}/organization/service_hierarchical_state_machine/{id}/version` | GET | 20/min |
| `/v1/{org}/organization/service_hierarchical_state_machine/{id}/` | DELETE | 10/min |

Users

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/user/` | POST | 50/min |
| `/v1/{org}/user/` | GET | 60/min |
| `/v1/{org}/user/search/` | GET | 50/min |
| `/v1/{org}/user/signin_with_api_key` | POST | 5/min |
| `/v1/{org}/user/{requested_user_id}` | POST | 50/min |
| `/v1/{org}/user/{requested_user_id}` | DELETE | 500/min |
| `/v1/{org}/user/{requested_user_id}/variable` | POST | 50/min |
| `/v1/{org}/user/{user_id}/memory` | GET | 40/min |
| `/v1/{org}/user/{user_id}/user_model` | GET | 60/min |

Services

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/service/` | GET | 50/min |
| `/v1/{org}/service/` | POST | 20/min |
| `/v1/{org}/service/{service_id}/` | POST | 20/min |
| `/v1/{org}/service/{service_id}/version_sets/{version_set_name}/` | PUT | 30/min |
| `/v1/{org}/service/{service_id}/version_sets/{version_set_name}/` | DELETE | 30/min |

Conversations

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/conversation/` | POST | 5/min |
| `/v1/{org}/conversation/` | GET | 15/min |
| `/v1/{org}/conversation/conversation_starter` | POST | 10/min |
| `/v1/{org}/conversation/{conversation_id}/finish/` | POST | 5/min |
| `/v1/{org}/conversation/{conversation_id}/interact` | POST | 15/min |
| `/v1/{org}/conversation/{conversation_id}/interaction/{interaction_id}/insights` | GET | 20/min |
| `/v1/{org}/conversation/{conversation_id}/interaction/{interaction_id}/recommend_responses` | POST | 20/min |
| `/v1/{org}/conversation/{conversation_id}/messages/` | GET | 20/min |
| `/v1/{org}/conversation/{conversation_id}/messages/{message_id}/source` | GET | 30/min |
| `/v1/{org}/conversation/{conversation_id}/tags/` | POST | 50/min |

Tools (Actions)

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/tool/` | POST | 20/min |
| `/v1/{org}/tool/` | GET | 50/min |
| `/v1/{org}/tool/amigo_tool_scaffold.tar.gz` | GET | 50/min |
| `/v1/{org}/tool/invocation` | GET | 50/min |
| `/v1/{org}/tool/invocation/search` | GET | 50/min |
| `/v1/{org}/tool/test` | POST | 1/min |
| `/v1/{org}/tool/{tool_id}` | POST | 20/min |
| `/v1/{org}/tool/{tool_id}` | DELETE | 20/min |
| `/v1/{org}/tool/{tool_id}/envvar` | POST | 20/min |
| `/v1/{org}/tool/{tool_id}/version` | POST | 10/min |
| `/v1/{org}/tool/{tool_id}/version` | GET | 50/min |
| `/v1/{org}/tool/{tool_id}/version/{versions}` | DELETE | 20/min |
| `/v1/{org}/tool/{tool_id}/version/{version}/invoke` | POST | 10/min |

Dynamic Behaviors

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/dynamic_behavior_set/` | POST | 200/min |
| `/v1/{org}/dynamic_behavior_set/` | GET | 500/min |
| `/v1/{org}/dynamic_behavior_set/search` | GET | 50/min |
| `/v1/{org}/dynamic_behavior_set/{id}/` | POST | 500/min |
| `/v1/{org}/dynamic_behavior_set/{id}/` | DELETE | 500/min |
| `/v1/{org}/dynamic_behavior_set/{id}/invocation/` | GET | 50/min |
| `/v1/{org}/dynamic_behavior_set/{id}/version/` | POST | 200/min |
| `/v1/{org}/dynamic_behavior_set/{id}/version/` | GET | 1000/min |

Metrics

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/metric/` | POST | 100/min |
| `/v1/{org}/metric/` | GET | 20/min |
| `/v1/{org}/metric/evaluate` | POST | 50/min |
| `/v1/{org}/metric/metric_evaluation_result` | GET | 10/min |
| `/v1/{org}/metric/search/` | GET | 20/min |
| `/v1/{org}/metric/{metric_id}/` | POST | 20/min |
| `/v1/{org}/metric/{metric_id}/` | DELETE | 100/min |

Roles & Permissions

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/role/` | POST | 20/min |
| `/v1/{org}/role/` | GET | 20/min |
| `/v1/{org}/role/temporary_permission_grant/` | POST | 100/min |
| `/v1/{org}/role/temporary_permission_grants/` | GET | 100/min |
| `/v1/{org}/role/{role_name}` | POST | 10/min |
| `/v1/{org}/role/{role_name}/assign` | POST | 1000/min |

Simulations

Personas

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/simulation/persona/` | POST | 100/min |
| `/v1/{org}/simulation/persona/` | GET | 50/min |
| `/v1/{org}/simulation/persona/search` | GET | 50/min |
| `/v1/{org}/simulation/persona/{id}/` | POST | 100/min |
| `/v1/{org}/simulation/persona/{id}/` | DELETE | 100/min |
| `/v1/{org}/simulation/persona/{id}/version/` | POST | 100/min |
| `/v1/{org}/simulation/persona/{id}/version/` | GET | 50/min |

Scenarios

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/simulation/scenario/` | POST | 500/min |
| `/v1/{org}/simulation/scenario/` | GET | 200/min |
| `/v1/{org}/simulation/scenario/search` | GET | 50/min |
| `/v1/{org}/simulation/scenario/{id}/` | POST | 500/min |
| `/v1/{org}/simulation/scenario/{id}/` | DELETE | 500/min |
| `/v1/{org}/simulation/scenario/{id}/version/` | POST | 500/min |
| `/v1/{org}/simulation/scenario/{id}/version/` | GET | 500/min |

Unit Tests

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/simulation/unit_test/` | POST | 500/min |
| `/v1/{org}/simulation/unit_test/` | GET | 200/min |
| `/v1/{org}/simulation/unit_test/search/` | GET | 50/min |
| `/v1/{org}/simulation/unit_test/{id}/` | POST | 500/min |
| `/v1/{org}/simulation/unit_test/{id}/` | DELETE | 500/min |

Unit Test Sets & Runs

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/simulation/unit_test_set/` | POST | 50/min |
| `/v1/{org}/simulation/unit_test_set/` | GET | 50/min |
| `/v1/{org}/simulation/unit_test_set/search/` | GET | 50/min |
| `/v1/{org}/simulation/unit_test_set/{id}` | POST | 50/min |
| `/v1/{org}/simulation/unit_test_set/{id}` | DELETE | 50/min |
| `/v1/{org}/simulation/unit_test_set_run/` | POST | 20/min |
| `/v1/{org}/simulation/unit_test_set_run/` | GET | 50/min |
| `/v1/{org}/simulation/unit_test_set_run/{id}/` | DELETE | 10/min |
| `/v1/{org}/simulation/unit_test_set_run/{id}/artifacts/` | GET | 100/min |

Webhook Destinations

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/webhook_destination/` | POST | 20/min |
| `/v1/{org}/webhook_destination/` | GET | 20/min |
| `/v1/{org}/webhook_destination/{id}` | POST | 20/min |
| `/v1/{org}/webhook_destination/{id}` | DELETE | 20/min |
| `/v1/{org}/webhook_destination/{id}/delivery` | GET | 5/min |
| `/v1/{org}/webhook_destination/{id}/rotate-secret` | POST | 20/min |

Admin

| Endpoint | Method | Rate Limit |
| --- | --- | --- |
| `/v1/{org}/admin/get_models/` | GET | 20/min |
| `/v1/{org}/admin/get_prompt_logs/` | GET | 20/min |
| `/v1/{org}/admin/sql_query` | POST | 6/min |

Handling Rate Limits

When your application receives a 429 response, use the Retry-After header to determine how long to wait. Combine this with exponential backoff for resilience.

Manual Implementation
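The following is a minimal sketch of a manual retry loop, not the SDKs' actual implementation. It prefers the server-provided `Retry-After` value and falls back to exponential backoff when the header is missing or unparseable. The function names `retry_delay` and `call_with_retry`, the 1-second base delay, the 5-attempt limit, and the `send` callable (which is assumed to return a `(status, headers, body)` tuple from whatever HTTP client you use) are all illustrative assumptions; the 60-second cap follows the best-practice guidance on this page.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Seconds to wait before retrying after the 0-based `attempt`."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                # Prefer the server-provided delay when it is usable.
                return max(0.0, float(value))
            except ValueError:
                break  # Unparseable value: fall back to backoff below.
    # Exponential backoff: base, 2x base, 4x base, ... capped at max_delay.
    return min(max_delay, base_delay * (2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        sleep(retry_delay(headers, attempt))
        response = send()
    return response
```

The `sleep` parameter exists only so the waiting behavior can be replaced in tests; in normal use the default `time.sleep` applies. If every attempt returns 429, the last 429 response is returned so the caller can decide how to surface the failure.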

SDK Automatic Retry Behavior

Both the Python SDK (amigo-python-sdk) and TypeScript SDK (@amigo-ai/sdk) handle 429 responses automatically:

  • The SDKs read the Retry-After header and wait the specified duration before retrying.

  • Retries use exponential backoff for consecutive 429 responses.

  • Network errors and 5xx server errors are also retried automatically.

Best Practices

  1. Respect Retry-After headers. Always use the server-provided delay rather than a fixed wait time.

  2. Use exponential backoff. If the Retry-After header is absent, double the delay on each consecutive 429 (capped at 60 seconds).

  3. Monitor X-RateLimit-Remaining. Proactively slow down requests as you approach the limit rather than waiting for a 429.

  4. Batch where possible. Some endpoints (such as tool invocation) accept multiple items in a single request, reducing the number of calls.

  5. Spread requests over time. Avoid bursting all requests at the start of a rate-limit window.

  6. Use the SDKs. The official Python and TypeScript SDKs handle 429 retries automatically with proper backoff.

  7. Design for the tightest limit on your critical path. Conversation creation and interaction are limited to 5 and 15 requests/min respectively, and a few endpoints are tighter still (tool testing allows 1 request/min). Architect your application to stay comfortably within these bounds.
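Practices 3 and 5 can be combined into a simple client-side pacing rule. The sketch below is one possible approach, assuming the limit behaves as a fixed window that ends at `X-RateLimit-Reset`: it spreads the remaining request budget evenly over the time left in the window. The function name `pacing_delay` is hypothetical.

```python
import time
from typing import Optional


def pacing_delay(remaining: int, reset_epoch: int,
                 now: Optional[float] = None) -> float:
    """Seconds to wait before the next request.

    `remaining` is the X-RateLimit-Remaining value and `reset_epoch` is the
    X-RateLimit-Reset value (UTC epoch seconds) from the latest response.
    """
    current = time.time() if now is None else now
    seconds_left = max(0.0, reset_epoch - current)
    if remaining <= 0:
        # Budget exhausted: wait until the window resets.
        return seconds_left
    # Spread the remaining requests evenly across the rest of the window.
    return seconds_left / remaining
```

For example, with 5 requests remaining and 60 seconds left in the window, this rule suggests waiting 12 seconds between requests instead of sending all 5 immediately.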
