Create a chat completion
Authentication
Request
The ID of the model to use for this request. See LLM Gateway Overview for available models.
When true, responses are streamed as server-sent events (SSE). Supported on OpenAI models only.
Controls which (if any) function is called by the model.
Specifies the format of the model’s response. Use this to constrain the model to output valid JSON matching a schema. Supported by OpenAI (GPT-4.1, GPT-5.x), Gemini, and Claude models. Not supported by gpt-oss models.
An array of fallback objects. Each object must include a model and can optionally override any field from the original request. If the primary model fails, the LLM Gateway tries each fallback in order until one succeeds. See Specify fallback models for more details.
Configuration for fallback behavior, including retry and depth settings. See Specify fallback models for more details.
Response
A copy of the original request, excluding prompt and messages.