AI Gateway for AI Agents and LLMs
Discover how Apache APISIX serves as an AI gateway with AI proxy, LLM load balancing, retry and fallback, token rate limiting, and security for efficient and reliable AI agents.

Transform APISIX into an AI Gateway with AI Plugins
Manage API and AI Traffic in One Gateway
To Keep Up with the Rapid Evolution of AI and LLMs
No Vendor Lock-in
Powered by Apache APISIX
100+ LLM and API management features
Powerful and Open-Source Plugins for LLM Load Balancing and Token Rate Limiting
All AI plugins are fully open-source, including multi-LLM load balancing, retry and fallback mechanisms, token rate limiting, content moderation, AI RAG, prompt decorator and auditing.


Multi-LLM Load Balancing
Supports multiple LLM providers (OpenAI, DeepSeek, Claude, Mistral, Gemini, etc.) to prevent vendor lock-in, while dynamically adjusting LLM weights based on latency, cost, and stability.
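To make the weighting idea concrete, here is a minimal sketch of smooth weighted round-robin selection across providers. It is not APISIX's implementation or configuration format; the `Upstream` class, the `pick` function, and the example weights are hypothetical names chosen for illustration. The weights are static here; adjusting them from latency, cost, or stability would amount to updating `weight` between picks.

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    """Hypothetical record for one LLM provider behind the gateway."""
    name: str
    weight: int       # relative share of traffic
    current: int = 0  # running score used by the selection algorithm


def pick(upstreams):
    """Smooth weighted round-robin: spreads picks evenly according to weight."""
    total = sum(u.weight for u in upstreams)
    for u in upstreams:
        u.current += u.weight
    chosen = max(upstreams, key=lambda u: u.current)
    chosen.current -= total
    return chosen.name


# Assumed example: send roughly three requests to "openai" for each one to "deepseek".
providers = [Upstream("openai", 3), Upstream("deepseek", 1)]
sequence = [pick(providers) for _ in range(8)]
print(sequence)
```

Over any window of four picks, the 3:1 weights yield three selections of the first provider and one of the second, interleaved rather than sent in bursts.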
Token Rate Limiting
Token usage can be rate-limited and throttled based on various dimensions such as Route, Service, Consumer, Consumer Group, or custom parameters. Supports both single-node and cluster-level rate limiting. Additionally, different rate-limiting strategies can be configured for each LLM.
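As an illustration of limiting by tokens rather than by request count, the sketch below keeps a fixed-window token budget per key, where the key could stand for a route, a consumer, or any custom dimension. The `TokenBudget` class and its parameters are hypothetical and do not reflect the plugin's actual options. It covers the single-node case only; a cluster-level limiter would need the counters in a shared store.

```python
class TokenBudget:
    """Fixed-window token budget per key (single node, in memory)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit            # max tokens allowed per window
        self.window = window_seconds  # window length in seconds
        self.usage = {}               # key -> (window_start, tokens_used)

    def allow(self, key, tokens, now):
        """Return True and record usage if `tokens` fits in the current window."""
        start, used = self.usage.get(key, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, used = now, 0
        if used + tokens > self.limit:
            self.usage[key] = (start, used)
            return False
        self.usage[key] = (start, used + tokens)
        return True


# Assumed example: 100 tokens per 60 seconds for each consumer.
budget = TokenBudget(limit=100, window_seconds=60)
first = budget.allow("consumer-a", 60, now=0)          # fits in the budget
second = budget.allow("consumer-a", 50, now=10)        # would exceed 100
after_reset = budget.allow("consumer-a", 50, now=61)   # new window
```

The clock is passed in as `now` so the behavior is easy to test; a real deployment would read the time from the system instead.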
AI RAG
Through RAG, LLMs can draw on the enterprise knowledge base to answer questions or generate content, improving the accuracy and domain relevance of the generated output while reducing LLM hallucinations.
Observability of Token Usage
Track token usage through access logs and observability components to prevent API abuse and avoid excessive billing.
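One way to picture this is a small aggregation over access-log entries. The sketch below assumes JSON log lines with `consumer`, `prompt_tokens`, and `completion_tokens` fields; these field names are hypothetical and the real log format depends on how logging is configured.

```python
import json
from collections import Counter


def tokens_per_consumer(log_lines):
    """Sum prompt and completion tokens per consumer from JSON log lines."""
    totals = Counter()
    for line in log_lines:
        entry = json.loads(line)
        totals[entry["consumer"]] += entry["prompt_tokens"] + entry["completion_tokens"]
    return totals


# Assumed sample entries, not real gateway output.
sample = [
    '{"consumer": "team-a", "prompt_tokens": 120, "completion_tokens": 80}',
    '{"consumer": "team-b", "prompt_tokens": 30, "completion_tokens": 10}',
    '{"consumer": "team-a", "prompt_tokens": 50, "completion_tokens": 50}',
]
usage = tokens_per_consumer(sample)
print(usage.most_common())
```

Totals like these can feed a dashboard or an alert threshold to flag a consumer whose usage grows unexpectedly.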
Retry and Fallback
Supports configurable LLM health checks, with automatic retries and fallback to other LLM services, ensuring service stability and quality.
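The retry-then-fallback behavior can be sketched as a loop over an ordered list of providers. This is an illustration under stated assumptions, not the gateway's code: `call_with_fallback`, the stub providers, and the `retries` parameter are hypothetical, and health checks are left out for brevity.

```python
def call_with_fallback(providers, prompt, retries=1):
    """Try each provider in order, retrying each one before moving to the next."""
    errors = []
    for name, call in providers:
        for attempt in range(retries + 1):
            try:
                return name, call(prompt)
            except Exception as exc:
                # Record the failure, then retry or fall through to the next provider.
                errors.append((name, attempt, str(exc)))
    raise RuntimeError(f"all providers failed: {errors}")


# Stub providers standing in for real LLM clients.
def always_down(prompt):
    raise ConnectionError("upstream unavailable")


def echo_backup(prompt):
    return f"backup answer to: {prompt}"


used, reply = call_with_fallback(
    [("primary", always_down), ("backup", echo_backup)], "hello"
)
```

Here the primary provider fails on both attempts, so the request is answered by the backup. If every provider fails, the caller receives a single error that lists each failed attempt.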
Security
Use plugins such as Prompt Guard, Prompt Decorator, Prompt Template, Content Moderation, and Logging & Auditing to help ensure the security and compliance of user inputs and LLM responses.
Multiple LLM Providers
APISIX AI Gateway supports multiple LLMs, including but not limited to OpenAI, DeepSeek, Claude, Mistral, and Gemini, ensuring your AI applications are adaptable to diverse scenarios.