| Property | Default | Type | Description |
| --- | --- | --- | --- |
| max-tokens | (none) | Long | The maximum number of tokens that can be generated in the chat completion. |
| n | (none) | Long | How many chat completion choices to generate for each input message. Note that you are charged based on the number of generated tokens across all of the choices. Keep n at 1 to minimize costs. |
| presence-penalty | (none) | Double | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. |
| response-format | (none) | Enum | The format of the response. Possible values: 'text', 'json_object'. |
| seed | (none) | Long | If specified, the model platform makes a best effort to sample deterministically, so that repeated requests with the same seed and parameters return the same result. Determinism is not guaranteed. |
| stop | (none) | String | A CSV list of strings to pass as stop sequences to the model. |
| system-prompt | "You are a helpful assistant." | String | The system message of a chat. |
| temperature | (none) | Double | Controls the randomness or "creativity" of the output. Typical values are between 0.0 and 1.0. |
| top-p | (none) | Double | The probability cutoff for token selection (nucleus sampling). Usually either temperature or top-p is specified, but not both. |
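As a sketch of how these options might be set together, the fragment below uses a YAML configuration file. The `spring.ai.openai.chat.options` prefix is an assumption for illustration only; substitute whatever prefix your model provider's configuration actually uses:

```yaml
# Hypothetical prefix -- the actual path depends on the provider integration.
spring:
  ai:
    openai:
      chat:
        options:
          max-tokens: 256                 # cap the completion length
          n: 1                            # a single choice minimizes token costs
          temperature: 0.7                # typical range 0.0-1.0; usually omit if top-p is set
          stop: "END,DONE"                # CSV list of stop sequences
          system-prompt: "You are a helpful assistant."
```

Note that temperature and top-p both shape sampling randomness, which is why the table advises setting only one of them.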