A mock server for Large Language Model (LLM) APIs, supporting OpenAI, Ollama, Gemini and AWS Bedrock endpoints. Useful for testing and development without relying on actual LLM services.
- Clone this repository:

  ```shell
  git clone https://github.com/rancher-sandbox/rancher-ai-llm-mock.git
  cd rancher-ai-llm-mock
  ```

- Build and run with Go:

  ```shell
  go run ./cmd/main.go
  ```

  Or use Docker:

  ```shell
  docker build -t llm-mock .
  docker run -p 8083:8083 llm-mock
  ```

  Or use Helm:

  ```shell
  helm install llm-mock chart/llm-mock \
    --namespace your-namespace \
    --create-namespace
  ```
The server will start on `http://localhost:8083`.
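To sanity-check that the server is up from code, a small Go probe can be used. This is a sketch, not part of the project: it only assumes the default address above, and treats any HTTP response (even a 404) as "reachable".

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

const baseURL = "http://localhost:8083"

// probe reports whether an HTTP server answers at url.
// Any HTTP response, even a 404, counts as reachable.
func probe(url string) string {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return fmt.Sprintf("mock server not reachable at %s (is it running?)", url)
	}
	defer resp.Body.Close()
	return fmt.Sprintf("mock server responded at %s with HTTP %d", url, resp.StatusCode)
}

func main() {
	fmt.Println(probe(baseURL))
}
```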
You can control the mock responses using the `/v1/control` endpoints:
- `POST /v1/control/push`: Push a mock response onto the queue. Example body:

  ```json
  {
    "agent": "Rancher",
    "text": {
      "chunks": ["Hello", " world!"]
    },
    "tool": {
      "name": "example_tool",
      "args": [
        { "key": "value" }
      ]
    }
  }
  ```

  - The `args` field accepts either a single object or an array of objects (e.g. a confirmation request payload for multiple resources).
  - The next model API call will stream the text chunks as its response and use the tool for the MCP invocation.
  - The MCP tool must be one of the supported tools of the agent named in the request.
  - If fewer than two agents are configured in Rancher, the `agent` field must not be provided.

- `POST /v1/control/clear`: Clear the mock response queue.
If the response queue is empty, default mock responses will be used.
See the OpenAPI spec for full API details.