# Guide
Welcome to the comprehensive Apex AI Proxy guide! This documentation will help you master all aspects of deploying, configuring, and using your personal AI gateway.
## Overview
Apex AI Proxy is a powerful solution that aggregates multiple AI service providers behind a unified OpenAI-compatible API. It runs on Cloudflare Workers and helps you overcome rate limits, maximize free quotas, and simplify your AI integrations.
## Getting Started

### New to Apex AI Proxy?
If you're just getting started, we recommend following this learning path:
1. Quick Start - Get up and running in 60 seconds
2. Installation - Detailed installation instructions
3. Configuration - Learn how to configure providers and models
4. Basic Usage - Make your first API calls
## Core Concepts
Understanding these concepts will help you get the most out of Apex AI Proxy:
### 🔄 Provider Aggregation
Apex AI Proxy acts as a smart router that distributes requests across multiple AI providers. This allows you to:
- Overcome individual provider rate limits
- Take advantage of multiple free tiers
- Ensure high availability through automatic failover
### 🔑 Unified Authentication
Instead of managing multiple API keys in your applications, you configure them once in the proxy and use a single service key for all requests.
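In practice, your application holds exactly one credential no matter how many providers sit behind the proxy. A minimal sketch (the URL and key below are placeholders for your own deployment):

```python
from openai import OpenAI

# The only credential your application ever sees is the proxy's service key;
# the real provider keys live in the proxy configuration.
client = OpenAI(
    base_url="https://your-proxy.workers.dev/v1",  # placeholder proxy URL
    api_key="your-service-api-key",                # single service key
)
```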
### 🤖 OpenAI Compatibility
All interactions use the familiar OpenAI API format, making integration seamless with existing codebases and tools.
## Architecture
```mermaid
graph TD
    A[Your Application] --> B[Apex AI Proxy]
    B --> C[Azure OpenAI]
    B --> D[DeepSeek]
    B --> E[Aliyun DashScope]
    B --> F[DeepInfra]
    B --> G[Custom Provider]

    subgraph "Cloudflare Workers"
        B
    end

    subgraph "AI Providers"
        C
        D
        E
        F
        G
    end
```
## Key Features

### 🆓 Completely Free
- Runs on Cloudflare Workers' free tier
- No server maintenance required
- No hidden costs or subscriptions
### 🔄 Smart Load Balancing
- Automatic request distribution
- Failover on provider errors
- Round-robin and weighted strategies
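A minimal sketch of the two strategies named above (illustrative only, not the proxy's actual source):

```python
import itertools
import random

providers = ["azure", "deepseek", "dashscope"]  # illustrative provider names

# Round-robin: each request goes to the next provider in turn.
round_robin = itertools.cycle(providers)
print(next(round_robin), next(round_robin), next(round_robin))  # azure deepseek dashscope

# Weighted: providers with larger weights receive proportionally more traffic.
weights = {"azure": 3, "deepseek": 2, "dashscope": 1}
picked = random.choices(list(weights), weights=list(weights.values()))[0]
print(picked)
```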
### 💰 Cost Optimization
- Maximize free quotas across providers
- Intelligent provider selection
- Usage tracking and optimization
### 🛡️ Robust Error Handling
- Graceful failover between providers
- Comprehensive error mapping
- Retry logic with exponential backoff
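For client code that wants extra resilience on top of the proxy, retry with exponential backoff is a common pattern. This is a generic sketch, not the proxy's internal implementation:

```python
import random
import time

def with_backoff(call, max_attempts=4, base_delay=0.5):
    """Retry `call`, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # 0.5s, 1s, 2s, ... plus a little jitter to avoid thundering herds
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))

# Usage: with_backoff(lambda: client.chat.completions.create(...))
```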
### 🌐 Multi-Protocol Support
- OpenAI Chat Completions API
- OpenAI Embeddings API
- Anthropic Messages API
- Next-gen `/v1/responses` API
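For example, the Anthropic Messages API can be reached through the proxy with the official `anthropic` SDK. The base URL, service key, and model name below are placeholders; the model must be one you have mapped in your configuration:

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="https://your-proxy.workers.dev",  # placeholder proxy URL
    api_key="your-service-api-key",             # the proxy's service key
)

message = client.messages.create(
    model="your-mapped-claude-model",           # placeholder model name
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```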
## Quick Reference

### Essential Commands
```bash
# Install dependencies
pnpm install

# Start development server
pnpm run dev

# Deploy to production
pnpm run deploy

# View logs
wrangler tail
```
### Basic Configuration
```javascript
// wrangler-config.js
const providerConfig = {
  azure: {
    base_url: 'https://your-resource.openai.azure.com/openai/deployments/your-deployment',
    api_keys: ['your-azure-key'],
  },
  deepseek: {
    base_url: 'https://api.deepseek.com/v1',
    api_keys: ['your-deepseek-key'],
  },
};

const modelProviderConfig = {
  'gpt-4o-mini': {
    providers: [
      { provider: 'azure', model: 'gpt-4o-mini' },
      { provider: 'deepseek', model: 'deepseek-chat' },
    ],
  },
};
```
### Basic Usage
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-proxy.workers.dev/v1",
    api_key="your-service-api-key",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
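The same client can also call the Embeddings API. The model name below is illustrative and must correspond to a model you configured in the proxy:

```python
emb = client.embeddings.create(
    model="text-embedding-3-small",  # illustrative; use a model you configured
    input="Apex AI Proxy routes requests across providers.",
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector
```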
## Documentation Sections

### 📚 Getting Started
Perfect for newcomers who want to understand the basics and get a first deployment running.
- Installation - Step-by-step setup instructions
- Configuration - Provider and model configuration
- Quick Start - 60-second deployment guide
### 🚀 Advanced Usage
Deep dive into advanced features and optimization techniques.
- Multiple API Keys - Scale beyond rate limits
- Load Balancing - Optimize request distribution
- Error Handling - Robust error management
- Monitoring - Track usage and performance
### 🔌 Integrations
Learn how to integrate with popular tools and frameworks.
- OpenAI SDK - Drop-in replacement setup
- Anthropic SDK - Claude integration
- Claude Code - Development tool integration
- Langchain - Framework integration
### 📊 API Reference
Complete API documentation with examples and best practices.
- Chat Completions - Text generation API
- Embeddings - Vector embeddings API
- Models - Available models endpoint
- Responses API - Next-gen response API
### 🔑 Provider Guides
Detailed setup instructions for each supported provider.
- Azure OpenAI - Enterprise OpenAI service
- DeepSeek - High-performance models
- Aliyun DashScope - Alibaba Cloud AI
- DeepInfra - Open-source model inference
## Best Practices

### 🎯 Configuration
- Use multiple providers for the same model to ensure redundancy
- Configure multiple API keys per provider to scale beyond rate limits
- Set up proper fallback chains for critical applications
### 🔐 Security
- Keep your configuration files secure and out of version control
- Use environment variables for sensitive data in production
- Regularly rotate your API keys
### 📈 Performance
- Monitor your usage patterns to optimize provider selection
- Use appropriate models for different use cases
- Implement proper caching strategies in your applications (see the sketch below)
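For the caching bullet above, a minimal client-side sketch might look like this. It assumes deterministic settings (`temperature=0`) so repeated prompts can safely reuse earlier answers; the helper name and keying scheme are illustrative:

```python
import hashlib

_cache: dict = {}

def cached_completion(client, model, prompt):
    """Return a cached answer for (model, prompt), calling the API on a miss."""
    key = hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()
    if key not in _cache:
        resp = client.chat.completions.create(
            model=model,
            temperature=0,  # keep outputs reproducible enough to cache
            messages=[{"role": "user", "content": prompt}],
        )
        _cache[key] = resp.choices[0].message.content
    return _cache[key]
```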
### 💰 Cost Management
- Take advantage of free tiers from multiple providers
- Monitor usage to avoid unexpected charges
- Use cheaper models for development and testing
## Community & Support

### 🤝 Getting Help
- **Documentation**: You're reading it! Search for specific topics
- **GitHub Issues**: Report bugs and request features
- **Discussions**: Ask questions and share ideas with the community
### 🌟 Contributing

We welcome contributions of all kinds! Head to the project's GitHub repository to report bugs, suggest features, or open a pull request.
## Next Steps
Choose your path based on your experience level:
### 🆕 New Users

1. Complete the Quick Start tutorial
2. Read about Configuration
3. Explore Provider Setup
### 🔧 Developers

1. Review the API Reference
2. Check out Integration Guides
3. Learn about Advanced Usage
### 🏢 Enterprise Users

1. Understand Security Best Practices
2. Set up Monitoring and Analytics
3. Plan your Multi-Provider Strategy
Ready to dive in? Start with our Quick Start guide and have your proxy running in under 60 seconds!