# OpenAI Compatible API OpenAI-compatible endpoints. See [OpenAI API Reference](https://platform.openai.com/docs/api-reference) for full documentation. ## Models **Endpoints:** `GET /v1/models`, `GET /v1/models/{id}` List available models or get a specific model by ID. ## Chat Completions **Endpoint:** `POST /v1/chat/completions` | Parameter | Type | Description | |--------------------------|--------------|-----------------------------------------------------| | `model` | String | Model ID | | `messages` | Array | Conversation messages | | `stream` | Boolean | Enable streaming | | `stream_options` | Object | Streaming options (`include_usage`) | | `temperature` | Float | Sampling temperature | | `max_completion_tokens` | Integer | Maximum tokens to generate | | `stop` | String/Array | Stop sequences | | `tools` | Array | Available tools/functions | | `tool_choice` | String/Object| Tool selection mode (auto, none, required) | | `parallel_tool_calls` | Boolean | Enable parallel tool execution | | `response_format` | Object | Response format (text, json_object, json_schema) | | `reasoning_effort` | String | Reasoning effort (none, minimal, low, medium, high, xhigh, max) | | `verbosity` | String | Output verbosity (low, medium, high) | ## Responses **Endpoint:** `POST /v1/responses` | Parameter | Type | Description | |------------------------|--------------|-----------------------------------------------------| | `model` | String | Model ID | | `input` | String/Array | Input text or conversation messages | | `stream` | Boolean | Enable streaming | | `temperature` | Float | Sampling temperature | | `max_output_tokens` | Integer | Maximum tokens to generate | | `tools` | Array | Available tools/functions | | `tool_choice` | String/Object| Tool selection mode (auto, none, required) | | `parallel_tool_calls` | Boolean | Enable parallel tool execution | | `instructions` | String | System instructions | | `reasoning` | Object | Reasoning configuration (`effort`, `summary`) | | `text` | Object | Text format and verbosity configuration | | `context_management` | Array | Context management (compaction with threshold) | | `include` | Array | Include options (e.g. `reasoning.encrypted_content`)| | `truncation` | String | Truncation mode (auto, disabled) | ## Embeddings **Endpoint:** `POST /v1/embeddings` | Parameter | Type | Description | |--------------------|------------- |---------------------------------| | `model` | String | Model ID | | `input` | String/Array | Text(s) to embed | | `encoding_format` | String | Encoding format (float, base64) | | `dimensions` | Integer | Output dimensions | ## Audio Speech (TTS) **Endpoint:** `POST /v1/audio/speech` | Parameter | Type | Description | |--------------------|--------|----------------------| | `model` | String | Model ID | | `input` | String | Text to synthesize | | `voice` | String | Voice ID | | `speed` | Float | Playback speed | | `response_format` | String | Audio format | | `instructions` | String | Voice instructions | ## Audio Transcriptions **Endpoint:** `POST /v1/audio/transcriptions` | Parameter | Type | Description | |-------------|--------|--------------------------------------------------| | `model` | String | Model ID | | `file` | File | Audio file | | `language` | String | Audio language | | `prompt` | String | Optional text to guide transcription style/vocab | ## Image Generation **Endpoint:** `POST /v1/images/generations` | Parameter | Type | Description | |--------------------|--------|--------------------| | `model` | String | Model ID | | `prompt` | String | Image description | | `response_format` | String | url or b64_json | ## Image Edit **Endpoint:** `POST /v1/images/edits` | Parameter | Type | Description | |--------------------|--------|---------------------| | `model` | String | Model ID | | `prompt` | String | Edit instructions | | `image` | File | Image to edit | | `response_format` | String | url or b64_json | --- # Anthropic Compatible API Anthropic-compatible endpoints. See [Anthropic API Reference](https://docs.anthropic.com/en/api/messages) for full documentation. ## Messages **Endpoint:** `POST /v1/messages` | Parameter | Type | Description | |-----------------------|--------------|------------------------------------------------| | `model` | String | Model ID | | `messages` | Array | Conversation messages | | `system` | String/Array | System prompt | | `max_tokens` | Integer | Maximum tokens to generate | | `stream` | Boolean | Enable streaming | | `temperature` | Float | Sampling temperature | | `top_p` | Float | Nucleus sampling | | `top_k` | Integer | Top-k sampling | | `stop_sequences` | Array | Stop sequences | | `tools` | Array | Available tools/functions | | `tool_choice` | Object | Tool selection (auto, any, tool, none) | | `metadata` | Object | Request metadata (`user_id`) | | `output_format` | Object | Structured output with JSON schema | | `thinking` | Object | Thinking configuration (type, budget_tokens) | | `context_management` | Object | Context management with compaction edits | ## Count Tokens **Endpoint:** `POST /v1/messages/count_tokens` | Parameter | Type | Description | |-------------|------------- |-----------------------| | `model` | String | Model ID | | `messages` | Array | Conversation messages | | `system` | String/Array | System prompt | | `tools` | Array | Tool definitions | --- # Gemini Compatible API Gemini-compatible endpoints. See [Gemini API Reference](https://ai.google.dev/api/generate-content) for full documentation. ## Generate Content **Endpoint:** `POST /v1beta/models/{model}:generateContent` | Parameter | Type | Description | |----------------------|--------|----------------------------------------------| | `contents` | Array | Conversation contents with parts | | `systemInstruction` | Object | System instruction content | | `tools` | Array | Available tools/functions | | `toolConfig` | Object | Tool configuration (mode: AUTO, ANY, NONE) | | `generationConfig` | Object | Generation parameters | | `safetySettings` | Array | Safety filter settings | **Generation Config:** | Parameter | Type | Description | |----------------------|---------|---------------------------------------------| | `stopSequences` | Array | Stop sequences | | `temperature` | Float | Sampling temperature | | `topP` | Float | Nucleus sampling | | `topK` | Integer | Top-k sampling | | `maxOutputTokens` | Integer | Maximum tokens to generate | | `candidateCount` | Integer | Number of candidates to generate | | `responseMimeType` | String | Response MIME type | | `responseSchema` | Object | JSON schema for structured output | | `responseJsonSchema` | Object | JSON schema (alternative format) | | `thinkingConfig` | Object | Thinking configuration (level, budget, includeThoughts) | ## Stream Generate Content **Endpoint:** `POST /v1beta/models/{model}:streamGenerateContent` Same parameters as Generate Content. Returns Server-Sent Events when `?alt=sse` query parameter is provided, otherwise returns a JSON array. ## Count Tokens **Endpoint:** `POST /v1beta/models/{model}:countTokens` | Parameter | Type | Description | |----------------------|--------|-----------------------------------| | `contents` | Array | Conversation contents with parts | | `systemInstruction` | Object | System instruction content | | `tools` | Array | Tool definitions | --- # Utility APIs ## Extract Extract text from files or URLs. **Endpoint:** `POST /v1/extract` | Parameter | Type | Description | |-------------|--------|-----------------------------------------| | `model` | String | Model/provider to use | | `file` | File | Document to extract text from | | `url` | String | URL to scrape and extract | | `schema` | JSON | JSON schema for structured extraction | **Headers:** | Header | Value | Description | |-----------|----------------------|-------------------------------------------| | `Accept` | `application/json` | Structured output with OCR details | ```bash # From file (text output) curl -X POST -F "file=@document.pdf" http://localhost:8080/v1/extract # From URL (text output) curl -X POST -F "url=https://example.com" http://localhost:8080/v1/extract # JSON output with OCR details (pages, blocks, polygons) curl -X POST -H "Accept: application/json" -F "file=@document.pdf" http://localhost:8080/v1/extract # With JSON schema for structured extraction curl -X POST -F "file=@document.pdf" -F 'schema={"type":"object","properties":{"name":{"type":"string"}}}' http://localhost:8080/v1/extract ``` ## Render Generate images from text descriptions. **Endpoint:** `POST /v1/render` | Parameter | Type | Description | |--------------|----------|-----------------------------| | `model` | String | Model/provider to use | | `prompt` | String | Text description to render | | `file` | File(s) | Optional reference images | ```bash curl -X POST -F "input=a sunset over mountains" http://localhost:8080/v1/render ``` ## Search Search for information. **Endpoint:** `POST /v1/search` (alias: `/v1/retrieve`) | Parameter | Type | Description | |-------------|--------|------------------------| | `model` | String | Model/provider to use | | `query` | String | Search query | ```bash curl -X POST -F "input=your query" http://localhost:8080/v1/search ``` ## Research Deep research on a topic. **Endpoint:** `POST /v1/research` | Parameter | Type | Description | |----------------|--------|--------------------------| | `model` | String | Model/provider to use | | `instructions` | String | Research topic/question | ```bash curl -X POST -F "input=explain quantum computing" http://localhost:8080/v1/research ``` ## Rerank Rerank texts by relevance to a query. **Endpoint:** `POST /v1/rerank` | Parameter | Type | Description | |-------------|---------|-----------------------------| | `model` | String | Model/provider to use | | `query` | String | Search query | | `texts` | Array | List of texts to rerank | | `limit` | Integer | Maximum results to return | ```bash curl -X POST -H "Content-Type: application/json" \ -d '{"query":"search term","texts":["text1","text2"],"limit":5}' \ http://localhost:8080/v1/rerank ``` ## Segment Split text into segments/chunks. **Endpoint:** `POST /v1/segment` | Parameter | Type | Description | |--------------------|---------|-------------------------------| | `text` | String | Text to segment | | `file` | File | File to extract and segment | | `url` | String | URL to scrape and segment | | `segment_length` | Integer | Target segment length | | `segment_overlap` | Integer | Overlap between segments | | `model` | String | Model/provider to use | ```bash curl -X POST -F "input=long text here" -F "segment_length=500" http://localhost:8080/v1/segment ``` ## Summarize Summarize text content. **Endpoint:** `POST /v1/summarize` | Parameter | Type | Description | |-------------|--------|---------------------------------| | `model` | String | Model/provider to use | | `input` | String | Text to summarize | | `file` | File | File to extract and summarize | | `url` | String | URL to scrape and summarize | ```bash curl -X POST -F "input=long article text" http://localhost:8080/v1/summarize ``` ## Translate Translate text or files to another language. **Endpoint:** `POST /v1/translate` | Parameter | Type | Description | |-------------|--------|------------------------| | `model` | String | Model/provider to use | | `input` | String | Text to translate | | `file` | File | File to translate | | `language` | String | Target language | ```bash curl -X POST -F "input=Hello world" -F "language=de" http://localhost:8080/v1/translate ``` ## Transcribe Transcribe audio files to text. **Endpoint:** `POST /v1/transcribe` | Parameter | Type | Description | |----------------|--------|--------------------------------------------------| | `model` | String | Model/provider to use | | `file` | File | Audio file to transcribe | | `language` | String | Audio language (optional) | | `instructions` | String | Optional text to guide transcription style/vocab | ```bash curl -X POST -F "file=@audio.mp3" http://localhost:8080/v1/transcribe ``` ## MCP Proxy Proxy requests to configured MCP (Model Context Protocol) servers. **Endpoint:** `POST /v1/mcp/{id}`, `POST /v1/mcp/{id}/*` Forwards requests to the MCP server identified by `{id}`.