majiayu000
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 122 additions & 0 deletions
diff --git a/SDK_README.md b/SDK_README.md
Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,122 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Essential Commands
+
+### Development Commands
+- **Start development**: `make dev` or `cargo run` (auto-loads config/gateway.yaml)
+- **Build**: `cargo build --all-features` 
+- **Test**: `cargo test --all-features`
+- **Lint**: `cargo clippy --all-targets --all-features -- -D warnings`
+- **Format**: `cargo fmt --all`
+- **Quick start**: `make start` (fastest way to start the gateway)
+
+### Testing Commands
+- **All tests**: `make test`
+- **Unit tests only**: `make test-unit` 
+- **Integration tests**: `make test-integration`
+- **Test coverage**: `make test-coverage`
+- **Single test**: `cargo test <test_name> --all-features`
+
+### Development Services
+- **Start dev services**: `make dev-services` (starts PostgreSQL, Redis)
+- **Stop dev services**: `make dev-stop`
+- **Database migration**: `make db-migrate`
+- **Reset database**: `make db-reset`
+
+## Architecture Overview
+
+This is a **high-performance AI Gateway** written in Rust that provides OpenAI-compatible APIs with intelligent routing across 20+ AI providers.
+
+### Core Components
+
+**Gateway Architecture**: Modular, trait-based design with dependency injection
+- `src/core/` - Central orchestrator and business logic
+- `src/server/` - Actix-web HTTP server with middleware pipeline  
+- `src/auth/` - Multi-layered authentication (JWT, API keys, RBAC)
+- `src/core/providers/` - Pluggable provider system (OpenAI, Anthropic, Azure, Google, etc.)
+- `src/core/router/` - Intelligent routing with multiple strategies
+- `src/storage/` - Multi-backend storage (PostgreSQL, Redis, S3, Vector DB)
+- `src/monitoring/` - Observability (Prometheus, tracing, health checks)
+
+### Key Design Patterns
+- **Async-first**: All I/O is non-blocking using Tokio
+- **Trait-based abstractions**: Pluggable components via traits
+- **Error handling**: Comprehensive error types with context preservation
+- **Configuration**: Type-safe config models with Default implementations
+
+### Provider Integration
+- **Unified Provider trait**: Common interface for all AI providers
+- **Format conversion**: Automatic translation between OpenAI and provider-specific APIs
+- **Health monitoring**: Per-provider health checks and failover
+- **Cost calculation**: Built-in token counting and cost estimation
+
+### Request Flow
+1. HTTP Request → Authentication → Authorization → Router → Provider → Response
+2. Middleware pipeline handles auth, logging, metrics, and transformations
+3. Intelligent routing selects optimal provider based on health, latency, cost
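The routing step in the flow above can be sketched as follows. This is a minimal illustration, not the gateway's actual router: `ProviderStats`, its fields, `select_provider`, and the linear scoring formula are all assumptions made for the example. It drops unhealthy providers and then picks the one with the lowest weighted combination of latency and cost.

```rust
/// Hypothetical snapshot of one provider's state; the field names are
/// illustrative and not taken from the repository.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderStats {
    pub name: String,
    pub healthy: bool,
    pub avg_latency_ms: f64,
    pub cost_per_1k_tokens: f64,
}

impl ProviderStats {
    /// Lower is better: a weighted sum of latency and cost (assumed formula).
    fn score(&self, latency_weight: f64, cost_weight: f64) -> f64 {
        latency_weight * self.avg_latency_ms + cost_weight * self.cost_per_1k_tokens
    }
}

/// Returns the healthy provider with the lowest score, or `None` if no
/// provider is healthy.
pub fn select_provider(
    providers: &[ProviderStats],
    latency_weight: f64,
    cost_weight: f64,
) -> Option<&ProviderStats> {
    providers
        .iter()
        .filter(|p| p.healthy) // health acts as a hard filter, not a score term
        .min_by(|a, b| {
            a.score(latency_weight, cost_weight)
                .total_cmp(&b.score(latency_weight, cost_weight))
        })
}
```

Calling `select_provider(&stats, 1.0, 0.0)` ranks purely by latency, while `select_provider(&stats, 0.0, 1.0)` ranks purely by cost. Treating health as a filter rather than a score term keeps a fast but failing provider from ever being chosen.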
+
+## Configuration
+
+- **Main config**: `config/gateway.yaml` (auto-loaded by default)
+- **Example config**: `config/gateway.yaml.example`
+- **Environment variables**: Override config values with `${ENV_VAR}` syntax
+- **Config validation**: `make config-validate`
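To make the `${ENV_VAR}` override syntax concrete, here is one way such substitution could work. The function names `expand_placeholders` and `expand_from_env` are hypothetical, and the choice to leave unknown or unterminated placeholders untouched is an assumption; the gateway's real loader may behave differently (for example, by returning an error).

```rust
/// Replaces each `${NAME}` in `input` with the value returned by `lookup`.
/// Unknown names and unterminated placeholders are left in place unchanged
/// (an assumption made for this sketch).
pub fn expand_placeholders<F>(input: &str, lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy everything before the placeholder verbatim.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                match lookup(name) {
                    Some(value) => out.push_str(&value),
                    None => {
                        // Unknown variable: keep the placeholder as written.
                        out.push_str("${");
                        out.push_str(name);
                        out.push('}');
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // No closing brace: keep the remainder verbatim and stop.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Convenience wrapper that resolves names from the process environment.
pub fn expand_from_env(input: &str) -> String {
    expand_placeholders(input, |name| std::env::var(name).ok())
}
```

Passing the lookup as a closure keeps the substitution logic independent of the process environment, so it can be exercised with fixed values instead of real variables.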
+
+## Important Files
+
+- `src/main.rs` - Application entry point
+- `src/lib.rs` - Library entry point with core Gateway struct
+- `Cargo.toml` - Dependencies and features (use `--all-features` for development)
+- `Makefile` - All development commands and workflows
+- `config/gateway.yaml` - Main configuration file
+
+## Binaries
+
+- `gateway` (default) - Main gateway server
+- `google-gateway` - Specialized Google API gateway
+
+## Features
+
+The codebase uses Cargo features extensively:
+- **Storage**: `postgres`, `sqlite`, `redis`, `s3`
+- **Monitoring**: `metrics`, `tracing` 
+- **Advanced**: `vector-db`, `websockets`, `analytics`, `enterprise`
+- **Development**: Use `--all-features` flag for full functionality
+
+## Database & Storage
+
+- **Primary DB**: PostgreSQL with SQLx migrations
+- **Cache**: Redis for high-speed operations
+- **File storage**: S3-compatible object storage
+- **Vector DB**: Optional Qdrant integration for semantic caching
+
+## Testing Architecture
+
+- Unit tests in each module (`#[cfg(test)]`)
+- Integration tests in `tests/integration_tests.rs`
+- Postman collections for API testing
+- Mock implementations for external services
+
+## Common Development Patterns
+
+1. **Adding new providers**: Implement the `Provider` trait in `src/core/providers/`
+2. **New API endpoints**: Add routes in `src/server/routes/`
+3. **Authentication**: Extend auth modules in `src/auth/`
+4. **Configuration**: Update models in `src/config/models/`
+5. **Monitoring**: Add metrics in respective modules
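The first pattern, adding a provider by implementing a trait, might look roughly like this. The diff does not show the real `Provider` trait, so `ChatProvider`, `EchoProvider`, and `Registry` below are hypothetical, synchronous stand-ins; the actual trait is presumably async and has a richer signature.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the repository's `Provider` trait.
pub trait ChatProvider {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Toy provider that echoes the prompt; a real one would call a remote API.
pub struct EchoProvider;

impl ChatProvider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

/// Registry that stores providers behind trait objects, keyed by name.
#[derive(Default)]
pub struct Registry {
    providers: HashMap<String, Box<dyn ChatProvider>>,
}

impl Registry {
    /// Registers a provider under the name it reports.
    pub fn register(&mut self, provider: Box<dyn ChatProvider>) {
        self.providers.insert(provider.name().to_string(), provider);
    }

    /// Dispatches a prompt to the named provider, if it is registered.
    pub fn complete(&self, provider: &str, prompt: &str) -> Result<String, String> {
        self.providers
            .get(provider)
            .ok_or_else(|| format!("unknown provider: {provider}"))?
            .complete(prompt)
    }
}
```

With this shape, supporting a new backend means writing one `impl ChatProvider for ...` block and one `register` call; the dispatch code does not change.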
+
+## Docker & Deployment
+
+- **Docker build**: `make docker`
+- **Development stack**: `make docker-compose-dev`
+- **Production**: `make docker-compose`
+- **Kubernetes**: `make k8s-apply`
+
+## Performance Characteristics
+
+- **Throughput**: 10,000+ requests/second
+- **Latency**: <10ms routing overhead
+- **Memory**: ~50MB base footprint
+- **Architecture**: Fully async, connection pooling, zero-copy where possible
@@ -0,0 +1,181 @@
+# LiteLLM Unified SDK
+
+This is a unified LLM provider SDK built on the LiteLLM-RS project. It offers a simplified interface for interacting with multiple LLM providers.
+
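+## 🎯 Features
+
+- **Unified interface**: Use the same API to interact with OpenAI, Anthropic, Azure, and other providers
+- **Type safety**: Built on Rust's strong type system, with errors caught at compile time
+- **Simple configuration**: Flexible configuration options, with support for environment variables and config files
+- **High performance**: Built on the existing high-performance LiteLLM-RS architecture
+- **Easy to extend**: Modular design makes it easy to add new providers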
+
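+## 📁 File Structure
+
+```
+src/sdk/
+├── mod.rs              # Main SDK module
+├── simple_client.rs    # Simplified LLM client
+├── config.rs           # Configuration system
+├── types.rs            # Data type definitions
+├── errors.rs           # Error handling
+├── providers.rs        # Provider registry
+├── middleware.rs       # Middleware system (placeholder)
+├── cache.rs            # Cache system (placeholder)
+├── router.rs           # Routing system (placeholder)
+└── monitoring.rs       # Monitoring system (placeholder)
+```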
+
+## 🚀 Quick Start
+
+### 1. Basic Usage
+
+```rust
+use litellm_rs::sdk::*;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+    // Build the configuration
+    let config = ConfigBuilder::new()
+        .add_openai("openai", "your-api-key")
+        .add_anthropic("anthropic", "your-anthropic-key")
+        .default_provider("openai")
+        .build();
+    
+    // Create the client
+    let client = LLMClient::new(config)?;
+    
+    // Send a message
+    let messages = vec![
+        Message {
+            role: Role::User,
+            content: Some(Content::Text("Hello!".to_string())),
+            name: None,
+            tool_calls: None,
+        }
+    ];
+    
+    let response = client.chat(messages).await?;
+    println!("Response: {:?}", response);
+    
+    Ok(())
+}
+```
+
+### 2. Configuring from Environment Variables
+
+```rust
+use litellm_rs::sdk::*;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+    // Load the configuration from environment variables
+    // Requires OPENAI_API_KEY or ANTHROPIC_API_KEY to be set
+    let config = ClientConfig::from_env()?;
+    let client = LLMClient::new(config)?;
+    
+    // Use the client...
+    Ok(())
+}
+```
+
+### 3. Loading from a Config File
+
+```yaml
+# config.yaml
+default_provider: "openai"
+
+providers:
+  - id: "openai"
+    provider_type: "openai"
+    name: "OpenAI"
+    api_key: "${OPENAI_API_KEY}"
+    models: ["gpt-4", "gpt-3.5-turbo"]
+    enabled: true
+    weight: 1.0
+    rate_limit_rpm: 3000
+    rate_limit_tpm: 250000
+
+  - id: "anthropic"
+    provider_type: "anthropic"
+    name: "Anthropic"
+    api_key: "${ANTHROPIC_API_KEY}"
+    models: ["claude-3-opus-20240229"]
+    enabled: true
+    weight: 1.0
+
+settings:
+  timeout: 30
+  max_retries: 3
+  enable_logging: true
+```
+
+```rust
+let config = ClientConfig::from_file("config.yaml")?;
+let client = LLMClient::new(config)?;
+```
+
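+## 📊 Configuration Options
+
+### ClientConfig
+
+- `default_provider`: ID of the provider used by default
+- `providers`: List of provider configurations
+- `settings`: Global settings
+
+### ProviderConfig
+
+- `id`: Unique provider identifier
+- `provider_type`: Provider type (OpenAI, Anthropic, etc.)
+- `name`: Display name
+- `api_key`: API key
+- `base_url`: Custom API endpoint (optional)
+- `models`: List of supported models
+- `enabled`: Whether the provider is enabled
+- `weight`: Load-balancing weight
+- `rate_limit_rpm`: Request limit per minute
+- `rate_limit_tpm`: Token limit per minute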
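One plausible reading of the `enabled` and `weight` fields is weighted selection among enabled providers. The sketch below assumes that interpretation; `ProviderEntry` and `pick_weighted` are hypothetical names, and the SDK's router is still a placeholder, so nothing here reflects its eventual behavior. The `roll` argument stands in for a random number in `[0, 1)`, which keeps the function deterministic.

```rust
/// Hypothetical subset of the `ProviderConfig` fields listed above.
#[derive(Debug, Clone)]
pub struct ProviderEntry {
    pub id: String,
    pub enabled: bool,
    pub weight: f64,
}

/// Picks an enabled provider with probability proportional to its weight.
/// `roll` is expected to be in `[0, 1)`; in real use it would come from a
/// random number generator.
pub fn pick_weighted(entries: &[ProviderEntry], roll: f64) -> Option<&ProviderEntry> {
    // Disabled providers and non-positive weights never receive traffic.
    let candidates: Vec<&ProviderEntry> = entries
        .iter()
        .filter(|e| e.enabled && e.weight > 0.0)
        .collect();
    if candidates.is_empty() {
        return None;
    }
    let total: f64 = candidates.iter().map(|e| e.weight).sum();
    // Walk the cumulative weights until the scaled roll falls inside a bucket.
    let mut threshold = roll.clamp(0.0, 1.0) * total;
    for &entry in &candidates {
        if threshold < entry.weight {
            return Some(entry);
        }
        threshold -= entry.weight;
    }
    // Rounding error (or roll == 1.0) can fall through; use the last candidate.
    candidates.last().copied()
}
```

With weights 1.0 and 3.0, the first provider receives rolls in `[0, 0.25)` and the second receives the rest, so traffic splits roughly 25/75 under uniform rolls.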
+
+## 🔧 Supported Provider Types
+
+- `OpenAI`: OpenAI API
+- `Anthropic`: Anthropic Claude API
+- `Azure`: Azure OpenAI Service
+- `Google`: Google AI API
+- `Cohere`: Cohere API
+- `HuggingFace`: HuggingFace API
+- `Ollama`: Ollama local API
+- `AwsBedrock`: AWS Bedrock
+- `GoogleVertex`: Google Vertex AI
+- `Mistral`: Mistral AI
+
+## 🧪 Running the Example
+
+```bash
+# Set environment variables
+export OPENAI_API_KEY="your-openai-key"
+export ANTHROPIC_API_KEY="your-anthropic-key"
+
+# Run the example
+cargo run --example sdk_example
+```
+
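+## 🔮 Roadmap
+
+The SDK currently implements basic functionality. Planned additions:
+
+1. **Full provider integration**: Complete integration with the existing LiteLLM-RS provider system
+2. **Streaming responses**: Support for real streaming chat
+3. **Middleware system**: Request/response processing pipeline
+4. **Cache system**: Smart caching and semantic caching
+5. **Load balancing**: Multiple routing strategies
+6. **Monitoring metrics**: Detailed performance and usage metrics
+7. **Error retries**: Smart retries and circuit breaking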
+
+## 🤝 Contributing
+
+Issues and pull requests that improve this SDK are welcome!
+
+## 📄 License
+
+MIT License, the same as the main project.