Reasoning Models

Feature Overview

Reasoning models are advanced language models optimized for complex problem-solving and reasoning tasks. They improve answer accuracy by generating detailed reasoning steps (a chain of thought) before giving the final answer.

Typical Use Cases

Complex problem-solving: Suitable for scenarios that require step-by-step derivation and clear logical steps, such as mathematics and scientific reasoning.
Decision support systems: Provide detailed reasoning processes to support decision analysis and help users understand the logic behind decisions.
Education and training: Help users learn and understand complex knowledge by providing detailed derivation processes.

Installation and Preparation

Before using reasoning models, make sure the latest version of the OpenAI SDK is installed:

pip install -U openai

API Usage

Use reasoning models by calling the /chat/completions endpoint.

Request Parameters

max_tokens: Sets the maximum number of tokens in the model output.
temperature: Set between 0.5 and 0.7 (0.6 is recommended) to balance creativity and logical consistency.
top_p: Set to 0.95 (recommended).
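The recommended sampling settings can be kept in one place and passed to the request as keyword arguments. The following is a minimal sketch: `build_request_params` is a hypothetical helper (not part of the OpenAI SDK), and its defaults simply mirror the values recommended in this section.

```python
import warnings


def build_request_params(messages, model="deepseek/deepseek-r1",
                         max_tokens=4096, temperature=0.6, top_p=0.95):
    """Hypothetical helper: bundle the recommended sampling settings into kwargs."""
    # The range is a recommendation, so warn instead of rejecting the value.
    if not 0.5 <= temperature <= 0.7:
        warnings.warn("temperature is outside the recommended 0.5-0.7 range")
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }


params = build_request_params(
    [{"role": "user", "content": "Explain Newton's second law."}]
)
# With a configured client: client.chat.completions.create(**params)
```

Keeping the settings in a plain dictionary makes it easy to reuse them for both streaming and non-streaming requests by adding `stream=True` or `stream=False` at the call site.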

Example Request Code

Streaming Output Request

from openai import OpenAI

# Replace YOUR_API_KEY with your own key.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.highwayapi.ai/openai")
messages = [
    {"role": "user", "content": "Explain Newton's second law."}
]

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=messages,
    stream=True,
    max_tokens=4096
)

content = ""
reasoning_content = ""
for chunk in response:
    # Some chunks may carry no choices; skip them to avoid an IndexError.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        content += delta.content
    # reasoning_content is not a standard SDK field, so read it defensively.
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        reasoning_content += reasoning

print("Final answer:", content)
print("Reasoning process:", reasoning_content)

Non-Streaming Output Request

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[
        {"role": "user", "content": "What is the greenhouse effect? How can it be mitigated?"}
    ],
    stream=False,
    max_tokens=4096
)

message = response.choices[0].message
content = message.content
# reasoning_content is not a standard SDK field, so read it defensively.
reasoning_content = getattr(message, "reasoning_content", None)

print("Final answer:", content)
print("Reasoning process:", reasoning_content)

Context Management

The reasoning content returned by the model is not automatically appended to the next turn of the conversation. Users need to manage conversation history manually:

messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Continue explaining the solution."})
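The manual history update can be wrapped in a small function so that every turn is recorded the same way. This is a sketch only: `append_turn` is a hypothetical helper, and it follows the snippet above in storing just the final answer, not the reasoning content, in the history.

```python
def append_turn(messages, answer, next_user_message):
    """Hypothetical helper: record the assistant's final answer, then the next user turn.

    Only the final answer is stored; the reasoning content is left out of the history.
    """
    messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": next_user_message})
    return messages


history = [{"role": "user", "content": "Explain Newton's second law."}]
append_turn(
    history,
    "F = ma: the net force on a body equals its mass times its acceleration.",
    "Continue explaining the solution.",
)
# history now holds three messages in order: user, assistant, user
```

The updated `history` list can then be passed as `messages` in the next request.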

Supported Models

Billing

Billing is based on the number of input and output tokens.
For specific pricing standards and conversion rules, please check the model details page.
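As a rough illustration of token-based billing, the cost of one request can be estimated from its token counts. The sketch below makes several assumptions: `estimate_cost` is a hypothetical helper, prices are assumed to be quoted per million tokens, and the prices shown are placeholders rather than real rates. In a non-streaming response, the OpenAI SDK exposes the counts as `response.usage.prompt_tokens` and `response.usage.completion_tokens`.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Hypothetical helper: estimate request cost from token counts.

    Assumes separate per-million-token prices for input and output tokens.
    """
    total = (prompt_tokens * input_price_per_million
             + completion_tokens * output_price_per_million)
    return total / 1_000_000


# Placeholder prices for illustration only; check the model details page for real rates.
cost = estimate_cost(
    prompt_tokens=1_200,
    completion_tokens=3_500,
    input_price_per_million=1.0,
    output_price_per_million=2.0,
)
print(f"Estimated cost: {cost:.4f}")
```

How reasoning tokens are counted toward output tokens is not stated here, so the model details page remains the authoritative source.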

Notes and Best Practices

Do not add reasoning instructions in the system message. Instead, clearly specify the instructions directly in the user message.
For math problems, clearly state the requirement, such as: “Please reason step by step and clearly state the final answer.”
To prevent the model from skipping the reasoning stage, it is recommended to force the model to begin its output with a newline.
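The first two recommendations can be sketched as a message builder that sends no system message and places the reasoning instruction in the user message. `build_math_messages` and `REASONING_INSTRUCTION` are hypothetical names introduced for illustration; the instruction text is the one quoted above.

```python
REASONING_INSTRUCTION = "Please reason step by step and clearly state the final answer."


def build_math_messages(problem):
    """Hypothetical helper: append the reasoning instruction to the user message.

    No system message is created, in line with the recommendation above.
    """
    return [{"role": "user", "content": f"{problem}\n\n{REASONING_INSTRUCTION}"}]


math_messages = build_math_messages("Solve 3x + 5 = 20 for x.")
# math_messages can be passed as `messages` to client.chat.completions.create(...)
```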
