Model Capabilities
A large language model (LLM) is an AI model based on deep learning and natural language processing technologies. Trained on massive amounts of text data, it can understand, generate, and process human language. Its main capabilities include:- Text Generation Generates logically coherent text content based on context and adjusts the output style as needed.
- Language Understanding Accurately understands the meaning of input text and supports conversations that incorporate context.
- Text Translation Provides cross-language generation and understanding capabilities, enabling text translation between different languages.
- Knowledge Q&A Has a rich knowledge base and can answer questions in various fields such as culture, science, and history.
- Code Understanding and Generation Understands and generates code (such as Python, Java, C++, etc.), supports identifying code errors, and provides code suggestions.
- Text Classification and Summarization Understands complex sentences, performs information classification and extraction, and extracts key points from text for automatic summarization.
Model Selection
On JieKou AI, you can view the list of large language models supported by the platform and learn about each model’s basic introduction, pricing, and other information. Click a specific model to open its details page and try it online as needed. After fully experiencing models based on your specific tasks, you can compare model performance and choose the appropriate model.API Calls
JieKou AI provides API services compatible with the OpenAI API standard, making it easy to integrate into your existing applications.- ChatCompletion, supporting both streaming mode and standard mode.
- Completion, supporting both streaming mode and standard mode.
https://api.highwayapi.ai/openai, obtain and configure your API key, and update the model name as needed to access the large language model API service.
Code Examples
Python
Curl
Key Parameters
Basic Parameters
model: The model to call. You can view the list of large language models supported by the platform on JieKou AI.
Message Roles
Applies only to ChatCompletion.
messages: The input and output when interacting with a large model. Each message belongs to a role. Messages can help you obtain better outputs. You can try different approaches to achieve better results.
content: The message content.role: The role of the message author.system: Sets the AI role, telling the model the role or behavior it should adopt.user: The text entered by the user for the model.assistant: The response generated by the model. Users can also prefill examples to tell the model how it should respond to the current request.
name: Optional. Used to distinguish message authors with the same role.
Prompt
Applies only to Completion.
prompt: The prompt for generating a completion. It is the text information entered by the user for the large language model, used to clearly tell the model the problem to solve or the task to complete. It is also the foundation for the model to understand the requirements and generate relevant, accurate content.
Controlling Generation
Different parameter combinations can make the model generate content that better meets specific requirements. Text DiversityBothtemperatureandtop_pcan control the diversity of generated text. We recommend setting only one of them. The larger the value, the more diverse the generated text. The smaller the value, the more deterministic the generated text.
temperature: Sampling temperature, which adjusts the randomness of generated text.top_p: Nucleus sampling, which controls the cumulative probability of candidate words.top_k: Limits the number of candidate words.
presence_penalty: Presence penalty, which controls the degree of content repetition when the model generates text. If a Token has already appeared in the text, it will be penalized, causing the model to introduce more new Tokens.frequency_penalty: Frequency penalty, which controls how often certain words appear in the generated text. It penalizes Tokens each time they appear in the text, thereby reducing the probability of these Tokens appearing in future generation and preventing the model from repeatedly using the same Tokens.repetition_penalty: Repetition penalty value, used to suppress or encourage repetition.
Output Limits
max_tokens: The maximum number of Tokens returned in a single request. If the number of Tokens generated by the model exceeds the value ofmax_tokens, the truncated content will be returned.stream: Controls whether the output is streamed. For some models that produce a large amount of output, we recommend setting this to streaming output to prevent overly long output from causing a timeout.true: Streaming output, meaning output is returned as it is generated. The model returns a chunk each time it generates part of the content.false: Returns the result all at once after the model has generated all content.
stop: Stop sequence. When the text generated by the model contains the string set instop, the model stops outputting.