Add adaptation for the surround-reconnaissance scenario

2026-01-08 15:44:38 +08:00
parent 3eba1f962b
commit 10c5bb5a8a
5441 changed files with 40219 additions and 379695 deletions

openai/resources/audio/speech.py

@@ -72,7 +72,7 @@ class Speech(SyncAPIResource):
         model:
             One of the available [TTS models](https://platform.openai.com/docs/models#tts):
-            `tts-1`, `tts-1-hd` or `gpt-4o-mini-tts`.
+            `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts`, or `gpt-4o-mini-tts-2025-12-15`.
         voice: The voice to use when generating the audio. Supported voices are `alloy`, `ash`,
             `ballad`, `coral`, `echo`, `fable`, `onyx`, `nova`, `sage`, `shimmer`, and
@@ -168,7 +168,7 @@ class AsyncSpeech(AsyncAPIResource):
         model:
             One of the available [TTS models](https://platform.openai.com/docs/models#tts):
-            `tts-1`, `tts-1-hd` or `gpt-4o-mini-tts`.
+            `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts`, or `gpt-4o-mini-tts-2025-12-15`.
         voice: The voice to use when generating the audio. Supported voices are `alloy`, `ash`,
             `ballad`, `coral`, `echo`, `fable`, `onyx`, `nova`, `sage`, `shimmer`, and
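
For reference, a minimal sketch of exercising the new TTS snapshot through this resource; the voice, input text, and output path are illustrative, not part of the change:

```python
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# `gpt-4o-mini-tts-2025-12-15` is the snapshot added by this commit.
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts-2025-12-15",
    voice="alloy",
    input="Hello from the new TTS snapshot.",
) as response:
    response.stream_to_file(Path("speech.mp3"))
```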

openai/resources/audio/transcriptions.py

@@ -91,8 +91,9 @@ class Transcriptions(SyncAPIResource):
             flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
         model: ID of the model to use. The options are `gpt-4o-transcribe`,
-            `gpt-4o-mini-transcribe`, and `whisper-1` (which is powered by our open source
-            Whisper V2 model).
+            `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `whisper-1`
+            (which is powered by our open source Whisper V2 model), and
+            `gpt-4o-transcribe-diarize`.
         chunking_strategy: Controls how the audio is cut into chunks. When set to `"auto"`, the server
             first normalizes loudness and then uses voice activity detection (VAD) to choose
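
A minimal sketch of a transcription call against the new mini snapshot, including the `chunking_strategy` described in this hunk; the audio file name is illustrative:

```python
from openai import OpenAI

client = OpenAI()

# `gpt-4o-mini-transcribe-2025-12-15` is the snapshot this hunk documents.
with open("meeting.m4a", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="gpt-4o-mini-transcribe-2025-12-15",
        file=audio_file,
        chunking_strategy="auto",  # server-side VAD chunking, per the docstring
    )

print(transcription.text)
```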
@@ -102,8 +103,9 @@ class Transcriptions(SyncAPIResource):
         include: Additional information to include in the transcription response. `logprobs` will
             return the log probabilities of the tokens in the response to understand the
             model's confidence in the transcription. `logprobs` only works with
-            response_format set to `json` and only with the models `gpt-4o-transcribe` and
-            `gpt-4o-mini-transcribe`.
+            response_format set to `json` and only with the models `gpt-4o-transcribe`,
+            `gpt-4o-mini-transcribe`, and `gpt-4o-mini-transcribe-2025-12-15`. This field is
+            not supported when using `gpt-4o-transcribe-diarize`.
         language: The language of the input audio. Supplying the input language in
             [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) (e.g. `en`)
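
The `include` constraint above in code form, as a hedged sketch (file name illustrative): `logprobs` is only accepted alongside `response_format="json"` and is rejected for `gpt-4o-transcribe-diarize`.

```python
from openai import OpenAI

client = OpenAI()

with open("sample.wav", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="gpt-4o-mini-transcribe-2025-12-15",
        file=audio_file,
        response_format="json",  # required for `logprobs`
        include=["logprobs"],    # not supported by `gpt-4o-transcribe-diarize`
    )

# Each entry carries a token and its log probability.
for item in transcription.logprobs or []:
    print(item.token, item.logprob)
```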
@@ -239,8 +241,9 @@ class Transcriptions(SyncAPIResource):
             flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
         model: ID of the model to use. The options are `gpt-4o-transcribe`,
-            `gpt-4o-mini-transcribe`, `whisper-1` (which is powered by our open source
-            Whisper V2 model), and `gpt-4o-transcribe-diarize`.
+            `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `whisper-1`
+            (which is powered by our open source Whisper V2 model), and
+            `gpt-4o-transcribe-diarize`.
         stream: If set to true, the model response data will be streamed to the client as it is
             generated using
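
A short sketch of the `stream=True` path mentioned here; the event type names follow the transcription streaming API (`transcript.text.delta` / `transcript.text.done`), and the file name is illustrative:

```python
from openai import OpenAI

client = OpenAI()

with open("sample.wav", "rb") as audio_file:
    stream = client.audio.transcriptions.create(
        model="gpt-4o-mini-transcribe-2025-12-15",
        file=audio_file,
        stream=True,  # yields server-sent events instead of one final object
    )
    for event in stream:
        if event.type == "transcript.text.delta":
            print(event.delta, end="", flush=True)
```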
@@ -261,9 +264,9 @@ class Transcriptions(SyncAPIResource):
         include: Additional information to include in the transcription response. `logprobs` will
             return the log probabilities of the tokens in the response to understand the
             model's confidence in the transcription. `logprobs` only works with
-            response_format set to `json` and only with the models `gpt-4o-transcribe` and
-            `gpt-4o-mini-transcribe`. This field is not supported when using
-            `gpt-4o-transcribe-diarize`.
+            response_format set to `json` and only with the models `gpt-4o-transcribe`,
+            `gpt-4o-mini-transcribe`, and `gpt-4o-mini-transcribe-2025-12-15`. This field is
+            not supported when using `gpt-4o-transcribe-diarize`.
         known_speaker_names: Optional list of speaker names that correspond to the audio samples provided in
             `known_speaker_references[]`. Each entry should be a short identifier (for
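
A hypothetical sketch of the diarization parameters documented above: the speaker names and reference clips are invented, and the base64 data-URL encoding of `known_speaker_references[]` is an assumption about the expected payload, not something this diff specifies.

```python
import base64

from openai import OpenAI

client = OpenAI()


def encode_clip(path: str) -> str:
    # Assumption: each reference is a short base64 data URL of sample audio.
    with open(path, "rb") as f:
        return "data:audio/wav;base64," + base64.b64encode(f.read()).decode()


with open("panel_discussion.wav", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="gpt-4o-transcribe-diarize",
        file=audio_file,
        response_format="diarized_json",  # assumption: diarized output format
        known_speaker_names=["alice", "bob"],
        known_speaker_references=[
            encode_clip("alice.wav"),
            encode_clip("bob.wav"),
        ],
    )
```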
@@ -346,8 +349,9 @@ class Transcriptions(SyncAPIResource):
             flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
         model: ID of the model to use. The options are `gpt-4o-transcribe`,
-            `gpt-4o-mini-transcribe`, `whisper-1` (which is powered by our open source
-            Whisper V2 model), and `gpt-4o-transcribe-diarize`.
+            `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `whisper-1`
+            (which is powered by our open source Whisper V2 model), and
+            `gpt-4o-transcribe-diarize`.
         stream: If set to true, the model response data will be streamed to the client as it is
             generated using
@@ -368,9 +372,9 @@ class Transcriptions(SyncAPIResource):
         include: Additional information to include in the transcription response. `logprobs` will
             return the log probabilities of the tokens in the response to understand the
             model's confidence in the transcription. `logprobs` only works with
-            response_format set to `json` and only with the models `gpt-4o-transcribe` and
-            `gpt-4o-mini-transcribe`. This field is not supported when using
-            `gpt-4o-transcribe-diarize`.
+            response_format set to `json` and only with the models `gpt-4o-transcribe`,
+            `gpt-4o-mini-transcribe`, and `gpt-4o-mini-transcribe-2025-12-15`. This field is
+            not supported when using `gpt-4o-transcribe-diarize`.
         known_speaker_names: Optional list of speaker names that correspond to the audio samples provided in
             `known_speaker_references[]`. Each entry should be a short identifier (for
@@ -535,8 +539,9 @@ class AsyncTranscriptions(AsyncAPIResource):
             flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
         model: ID of the model to use. The options are `gpt-4o-transcribe`,
-            `gpt-4o-mini-transcribe`, `whisper-1` (which is powered by our open source
-            Whisper V2 model), and `gpt-4o-transcribe-diarize`.
+            `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `whisper-1`
+            (which is powered by our open source Whisper V2 model), and
+            `gpt-4o-transcribe-diarize`.
         chunking_strategy: Controls how the audio is cut into chunks. When set to `"auto"`, the server
             first normalizes loudness and then uses voice activity detection (VAD) to choose
@@ -548,9 +553,9 @@ class AsyncTranscriptions(AsyncAPIResource):
         include: Additional information to include in the transcription response. `logprobs` will
             return the log probabilities of the tokens in the response to understand the
             model's confidence in the transcription. `logprobs` only works with
-            response_format set to `json` and only with the models `gpt-4o-transcribe` and
-            `gpt-4o-mini-transcribe`. This field is not supported when using
-            `gpt-4o-transcribe-diarize`.
+            response_format set to `json` and only with the models `gpt-4o-transcribe`,
+            `gpt-4o-mini-transcribe`, and `gpt-4o-mini-transcribe-2025-12-15`. This field is
+            not supported when using `gpt-4o-transcribe-diarize`.
         known_speaker_names: Optional list of speaker names that correspond to the audio samples provided in
             `known_speaker_references[]`. Each entry should be a short identifier (for
@@ -679,8 +684,9 @@ class AsyncTranscriptions(AsyncAPIResource):
             flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
         model: ID of the model to use. The options are `gpt-4o-transcribe`,
-            `gpt-4o-mini-transcribe`, `whisper-1` (which is powered by our open source
-            Whisper V2 model), and `gpt-4o-transcribe-diarize`.
+            `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `whisper-1`
+            (which is powered by our open source Whisper V2 model), and
+            `gpt-4o-transcribe-diarize`.
         stream: If set to true, the model response data will be streamed to the client as it is
             generated using
@@ -701,9 +707,9 @@ class AsyncTranscriptions(AsyncAPIResource):
         include: Additional information to include in the transcription response. `logprobs` will
             return the log probabilities of the tokens in the response to understand the
             model's confidence in the transcription. `logprobs` only works with
-            response_format set to `json` and only with the models `gpt-4o-transcribe` and
-            `gpt-4o-mini-transcribe`. This field is not supported when using
-            `gpt-4o-transcribe-diarize`.
+            response_format set to `json` and only with the models `gpt-4o-transcribe`,
+            `gpt-4o-mini-transcribe`, and `gpt-4o-mini-transcribe-2025-12-15`. This field is
+            not supported when using `gpt-4o-transcribe-diarize`.
         known_speaker_names: Optional list of speaker names that correspond to the audio samples provided in
             `known_speaker_references[]`. Each entry should be a short identifier (for
@@ -786,8 +792,9 @@ class AsyncTranscriptions(AsyncAPIResource):
             flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
         model: ID of the model to use. The options are `gpt-4o-transcribe`,
-            `gpt-4o-mini-transcribe`, `whisper-1` (which is powered by our open source
-            Whisper V2 model), and `gpt-4o-transcribe-diarize`.
+            `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `whisper-1`
+            (which is powered by our open source Whisper V2 model), and
+            `gpt-4o-transcribe-diarize`.
         stream: If set to true, the model response data will be streamed to the client as it is
             generated using
@@ -808,9 +815,9 @@ class AsyncTranscriptions(AsyncAPIResource):
         include: Additional information to include in the transcription response. `logprobs` will
             return the log probabilities of the tokens in the response to understand the
             model's confidence in the transcription. `logprobs` only works with
-            response_format set to `json` and only with the models `gpt-4o-transcribe` and
-            `gpt-4o-mini-transcribe`. This field is not supported when using
-            `gpt-4o-transcribe-diarize`.
+            response_format set to `json` and only with the models `gpt-4o-transcribe`,
+            `gpt-4o-mini-transcribe`, and `gpt-4o-mini-transcribe-2025-12-15`. This field is
+            not supported when using `gpt-4o-transcribe-diarize`.
         known_speaker_names: Optional list of speaker names that correspond to the audio samples provided in
             `known_speaker_references[]`. Each entry should be a short identifier (for
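
The `AsyncTranscriptions` overloads changed above mirror the sync surface; a minimal async sketch using the same new snapshot (file name illustrative):

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()


async def main() -> None:
    with open("sample.wav", "rb") as audio_file:
        transcription = await client.audio.transcriptions.create(
            model="gpt-4o-mini-transcribe-2025-12-15",
            file=audio_file,
        )
    print(transcription.text)


asyncio.run(main())
```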