Generate coherent, clear text from an input prompt.

:::tip
New to the Ark platform? For preparation steps such as obtaining an API Key and enabling models, see [Getting Started](/docs/82379/1399008).
:::

<span id="18cf565a"></span>
# Quick Start

<span aceTableMode="list" aceTableWidth="3,6"></span>
|Input |Output preview |
|---|---|
|```Plain|```Plain|\
|Please structure the following content: Ark is a large-model service platform launched by Volcengine. It provides end-to-end capabilities such as model training, inference, evaluation, and fine-tuning, with a focus on supporting the large-model ecosystem. Through a stable and reliable mutual-trust security scheme, Ark safeguards the model security of model providers and the information security of model users, accelerating the adoption of large-model capabilities across industries and helping both providers and users achieve new business growth.|# Ark Large-Model Service Platform: Structured Information|\
|```||\
| |## 1. Basic attributes|\
| |1. **Launched by**: Volcengine|\
| |2. **Positioning**: large-model service platform|\
| ||\
| |## 2. Core functions and services|\
| |Provides end-to-end large-model functions and services such as model training, inference, evaluation, and fine-tuning, with two core goals:|\
| |1. Supporting the large-model ecosystem|\
| |2. Driving large-model capabilities into a wide range of industries|\
| ||\
| |## 3. Security scheme|\
| |A stable and reliable mutual-trust security scheme protects both sides:|\
| |1. Model providers: model security|\
| |2. Model users: information security|\
| ||\
| |## 4. Value proposition|\
| |Helps model providers and users achieve new business growth, building a win-win commercial model around the large-model ecosystem.|\
| |```|\
| | |
```mixin-react
return (<Tabs>
<Tabs.TabPane title="Curl" key="vJUYohZMmf"><RenderMd content={`\`\`\`Bash
curl https://ark.cn-beijing.volces.com/api/v3/chat/completions \\
  -H "Content-Type: application/json" \\
  -H "Authorization: Bearer $ARK_API_KEY" \\
  -d '{
    "model": "doubao-seed-1-6-251015",
    "messages": [
        {"role": "user", "content": "Please structure the following content: Ark is a large-model service platform launched by Volcengine. It provides end-to-end capabilities such as model training, inference, evaluation, and fine-tuning, with a focus on supporting the large-model ecosystem. Through a stable and reliable mutual-trust security scheme, Ark safeguards the model security of model providers and the information security of model users, accelerating the adoption of large-model capabilities across industries and helping both providers and users achieve new business growth."}
    ],
    "thinking": {
        "type": "disabled"
    }
}'
\`\`\`

* Replace the Model ID as needed; to look up Model IDs, see the [Model List](/docs/82379/1330310).
`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="Python" key="RyneFpLl5G"><RenderMd content={`\`\`\`Python
import os
# Install SDK: pip install 'volcengine-python-sdk[ark]'
from volcenginesdkarkruntime import Ark

# Initialize the Ark client
client = Ark(
    # The base URL for model invocation
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    # Get API Key: https://console.volcengine.com/ark/region:ark+cn-beijing/apikey
    api_key=os.getenv('ARK_API_KEY'),
)

completion = client.chat.completions.create(
    # Replace with Model ID
    model="doubao-seed-1-6-251015",
    messages=[
        {"role": "user", "content": "Please structure the following content: Ark is a large-model service platform launched by Volcengine. It provides end-to-end capabilities such as model training, inference, evaluation, and fine-tuning, with a focus on supporting the large-model ecosystem. Through a stable and reliable mutual-trust security scheme, Ark safeguards the model security of model providers and the information security of model users, accelerating the adoption of large-model capabilities across industries and helping both providers and users achieve new business growth."},
    ],
    # thinking={"type": "disabled"},  # Manually disable deep thinking
)
print(completion.choices[0].message.content)
\`\`\`
`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="Go" key="vzIAhWeZi9"><RenderMd content={`\`\`\`Go
package main

import (
    "context"
    "fmt"
    "os"

    "github.com/volcengine/volcengine-go-sdk/service/arkruntime"
    "github.com/volcengine/volcengine-go-sdk/service/arkruntime/model"
    "github.com/volcengine/volcengine-go-sdk/volcengine"
)

func main() {
    client := arkruntime.NewClientWithApiKey(
        os.Getenv("ARK_API_KEY"),
        // The base URL for model invocation
        arkruntime.WithBaseUrl("https://ark.cn-beijing.volces.com/api/v3"),
    )

    ctx := context.Background()
    req := model.CreateChatCompletionRequest{
        // Replace with Model ID
        Model: "doubao-seed-1-6-251015",
        Messages: []*model.ChatCompletionMessage{
            {
                Role: model.ChatMessageRoleUser,
                Content: &model.ChatCompletionMessageContent{
                    StringValue: volcengine.String("Please structure the following content: Ark is a large-model service platform launched by Volcengine. It provides end-to-end capabilities such as model training, inference, evaluation, and fine-tuning, with a focus on supporting the large-model ecosystem. Through a stable and reliable mutual-trust security scheme, Ark safeguards the model security of model providers and the information security of model users, accelerating the adoption of large-model capabilities across industries and helping both providers and users achieve new business growth."),
                },
            },
        },
        Thinking: &model.Thinking{
            Type: model.ThinkingTypeDisabled, // Manually disable deep thinking
            // Type: model.ThinkingTypeEnabled, // Manually enable deep thinking
        },
    }

    resp, err := client.CreateChatCompletion(ctx, req)
    if err != nil {
        fmt.Printf("standard chat error: %v\\n", err)
        return
    }
    fmt.Println(*resp.Choices[0].Message.Content.StringValue)
}
\`\`\`
`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="Java" key="Hijm4ptRjM"><RenderMd content={`\`\`\`java
package com.ark.sample;

import com.volcengine.ark.runtime.model.completion.chat.*;
import com.volcengine.ark.runtime.service.ArkService;
import java.util.ArrayList;
import java.util.List;

public class ChatCompletionsExample {
    public static void main(String[] args) {
        String apiKey = System.getenv("ARK_API_KEY");
        // The base URL for model invocation
        ArkService service = ArkService.builder().apiKey(apiKey).baseUrl("https://ark.cn-beijing.volces.com/api/v3").build();
        final List<ChatMessage> messages = new ArrayList<>();
        final ChatMessage userMessage = ChatMessage.builder().role(ChatMessageRole.USER).content("Please structure the following content: Ark is a large-model service platform launched by Volcengine. It provides end-to-end capabilities such as model training, inference, evaluation, and fine-tuning, with a focus on supporting the large-model ecosystem. Through a stable and reliable mutual-trust security scheme, Ark safeguards the model security of model providers and the information security of model users, accelerating the adoption of large-model capabilities across industries and helping both providers and users achieve new business growth.").build();
        messages.add(userMessage);

        ChatCompletionRequest chatCompletionRequest = ChatCompletionRequest.builder()
                .model("doubao-seed-1-6-251015") // Replace with Model ID
                .messages(messages)
                // .thinking(new ChatCompletionRequest.ChatCompletionRequestThinking("disabled")) // Manually disable deep thinking
                .build();
        service.createChatCompletion(chatCompletionRequest).getChoices().forEach(choice -> System.out.println(choice.getMessage().getContent()));
        // shutdown service
        service.shutdownExecutor();
    }
}
\`\`\`
`}></RenderMd></Tabs.TabPane></Tabs>);
```
:::tip
For an example of a single-turn conversation with the Responses API, see [Quick Start](/docs/82379/1958520#17377051).
:::

<span id="3e5edc90"></span>
# Models and APIs
Supported models: [text generation capability](/docs/82379/1330310#b318deb2)
Supported APIs:

* [Responses API](https://www.volcengine.com/docs/82379/1569618): the newer API, with simpler context management, stronger tool calling, and caching that lowers cost; recommended for new business and new users.
* [Chat API](https://www.volcengine.com/docs/82379/1494384): the widely used API, with low migration cost for existing business.

<span id="1d866118"></span>
# Usage Examples
<span id="f6222fec"></span>
## Multi-turn conversation
To hold a multi-turn conversation, combine the conversation history (system, assistant, and user messages) into a single list so the model understands the context and can continue the earlier topic.

<span aceTableMode="list" aceTableWidth="1,5,5"></span>
|Input method |Manage context manually |Manage context by ID |
|---|---|---|
|Example |```JSON|```JSON|\
| |...|...|\
| | "model": "doubao-seed-1-6-251015",| "model": "doubao-seed-1-6-251015",|\
| | "messages":[| "previous_response_id":"<id>",|\
| | {"role": "user", "content": "Hi, tell a joke."},| "input": "What is the punchline of this joke?"|\
| | {"role": "assistant", "content": "Why did the math book look sad? Because it had too many problems! 😄"},|...|\
| | {"role": "user", "content": "What's the punchline of this joke?"}|```|\
| | ]| |\
| |...| |\
| |```| |\
| | | |
|API |[Chat API](https://www.volcengine.com/docs/82379/1494384) |[Responses API](https://www.volcengine.com/docs/82379/1569618) |

> For more details and complete examples, see [Context Management](/docs/82379/2123288).
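The manual approach in the left column is plain list bookkeeping: append each user turn and each assistant reply before sending the next request. A minimal sketch of that bookkeeping (the `next_request` helper is illustrative, not part of any SDK; the payload shape follows the Chat API examples above):

```python
def next_request(history, user_text, model="doubao-seed-1-6-251015"):
    """Build the next Chat API payload by appending the new user turn."""
    messages = history + [{"role": "user", "content": user_text}]
    return {"model": model, "messages": messages}

# After a first exchange, the history already holds both turns:
history = [
    {"role": "user", "content": "Hi, tell a joke."},
    {"role": "assistant", "content": "Why did the math book look sad? Because it had too many problems!"},
]
req = next_request(history, "What's the punchline of this joke?")
print(len(req["messages"]))  # → 3
```

After the model replies, append its answer to `history` as an `assistant` message before building the next request, so every turn sees the full context.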
<span id="78d5cc11"></span>
## Streaming output

<span aceTableMode="list" aceTableWidth="2,1"></span>
|Preview |Benefits |
|---|---|
|<video src="https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/0b0ed47ec1b94b20a4f4966aa80130e6~tplv-goo7wpa0wc-image.image" controls></video>|* **Better waiting experience**: process partial content immediately instead of waiting for the full response.|\
| |* **Real-time progress feedback**: in multi-turn scenarios, see which stage the task is in as it runs.|\
| |* **Higher fault tolerance**: if an error occurs midway, the content generated so far is still available, unlike a failed non-streaming call that returns nothing.|\
| |* **Simpler timeout management**: the client/server connection stays active, avoiding timeouts on long-running tasks. |

Enable streaming by setting **stream** to `true`.
```JSON
...
"model": "doubao-seed-1-6-251015",
"messages": [
    {"role": "user", "content": "What is the difference between deep-thinking and non-deep-thinking models?"}
],
"stream": true
...
```

> For a complete example and more details, see [Streaming Output](/docs/82379/2123275).
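On the wire, the stream is a sequence of `data: {...}` server-sent-event lines, each carrying the next text fragment in `choices[0].delta.content` and terminated by `data: [DONE]` (this assumes the OpenAI-compatible event format the SDK examples above rely on). A minimal sketch of reassembling the full answer from raw SSE lines:

```python
import json

def join_stream(sse_lines):
    """Concatenate content deltas from 'data: ...' SSE lines into one string."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {}).get("content")
            if delta:
                parts.append(delta)
    return "".join(parts)

lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: [DONE]',
]
print(join_stream(lines))  # → Hello
```

In practice the SDKs do this parsing for you; the sketch only shows why empty-`choices` chunks must be tolerated, as the error-handling example later does.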
<span id="3821b26a"></span>
## Limiting answer length
To control cost or response time, limit the length of the model's answer. For long answers, such as translating a long text, set `max_tokens` to a larger value to avoid mid-answer truncation.
```JSON
...
"model": "doubao-seed-1-6-251015",
"messages": [
    {"role": "user","content": "What are some common cruciferous plants?"}
],
"max_tokens": 300
...
```

> For complete sample code, see [Controlling Answer Length](/docs/82379/2123288#c7fbdbe3).
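When the limit is hit, OpenAI-compatible responses mark the choice with `finish_reason` set to `"length"` (an assumption based on the Chat API convention the examples above follow); checking it tells you whether to raise `max_tokens` or continue the answer with Prefill. A sketch:

```python
def was_truncated(choice):
    """True if the answer stopped because the max_tokens limit was reached."""
    return choice.get("finish_reason") == "length"

print(was_truncated({"finish_reason": "length"}))  # → True
print(was_truncated({"finish_reason": "stop"}))    # → False
```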
<span id="8783d86f"></span>
## Asynchronous calls
For complex tasks, or when many tasks run concurrently, use the Asyncio interface to make concurrent calls, improving program efficiency and user experience.

* Chat API example:
```mixin-react
return (<Tabs>
<Tabs.TabPane title="Python" key="nQ7vQXOOFE"><RenderMd content={`\`\`\`Python
import asyncio
import os
# Install SDK: pip install 'volcengine-python-sdk[ark]'
from volcenginesdkarkruntime import AsyncArk

# Initialize the Ark client
client = AsyncArk(
    # The base URL for model invocation
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    # Get API Key: https://console.volcengine.com/ark/region:ark+cn-beijing/apikey
    api_key=os.getenv('ARK_API_KEY'),
)

async def main() -> None:
    stream = await client.chat.completions.create(
        # Replace with Model ID
        model="doubao-seed-1-6-251015",
        messages=[
            {"role": "system", "content": "You are an AI assistant"},
            {"role": "user", "content": "What are some common cruciferous plants?"},
        ],
        stream=True
    )
    async for completion in stream:
        # Some chunks (e.g. the final usage chunk) may carry no choices
        if completion.choices:
            print(completion.choices[0].delta.content, end="")
    print()

if __name__ == "__main__":
    asyncio.run(main())
\`\`\`
`}></RenderMd></Tabs.TabPane></Tabs>);
```
* Responses API example:
```mixin-react
return (<Tabs>
<Tabs.TabPane title="Python" key="ileVGr66Xy"><RenderMd content={`\`\`\`Python
import asyncio
import os
from volcenginesdkarkruntime import AsyncArk
from volcenginesdkarkruntime.types.responses.response_completed_event import ResponseCompletedEvent
from volcenginesdkarkruntime.types.responses.response_reasoning_summary_text_delta_event import ResponseReasoningSummaryTextDeltaEvent
from volcenginesdkarkruntime.types.responses.response_output_item_added_event import ResponseOutputItemAddedEvent
from volcenginesdkarkruntime.types.responses.response_text_delta_event import ResponseTextDeltaEvent
from volcenginesdkarkruntime.types.responses.response_text_done_event import ResponseTextDoneEvent

client = AsyncArk(
    base_url='https://ark.cn-beijing.volces.com/api/v3',
    api_key=os.getenv('ARK_API_KEY')
)

async def main():
    stream = await client.responses.create(
        model="doubao-seed-1-6-251015",
        input=[
            {"role": "system", "content": "You are an AI assistant"},
            {"role": "user", "content": "What are some common cruciferous plants?"},
        ],
        stream=True
    )
    async for event in stream:
        if isinstance(event, ResponseReasoningSummaryTextDeltaEvent):
            print(event.delta, end="")
        if isinstance(event, ResponseOutputItemAddedEvent):
            print("\\noutPutItem " + event.type + " start:")
        if isinstance(event, ResponseTextDeltaEvent):
            print(event.delta, end="")
        if isinstance(event, ResponseTextDoneEvent):
            print("\\noutPutTextDone.")
        if isinstance(event, ResponseCompletedEvent):
            print("Response Completed. Usage = " + event.response.usage.model_dump_json())

if __name__ == "__main__":
    asyncio.run(main())
\`\`\`
`}></RenderMd></Tabs.TabPane></Tabs>);
```
<span id="10b8a01c"></span>
# More Usage
<span id="a1d6b42a"></span>
## Deep thinking
Before producing an answer, the model systematically analyzes and logically decomposes the input question, then generates the answer from that decomposition.
This can significantly improve answer quality, but it increases token consumption. For details, see [Deep Thinking](/docs/82379/1449737).
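Deep thinking is switched per request via the `thinking` field shown in the Quick Start examples above; `enabled` turns it on and `disabled` turns it off:

```JSON
...
"model": "doubao-seed-1-6-251015",
"thinking": {"type": "enabled"}
...
```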
<span id="19b5e705"></span>
## Prompt engineering
Well-designed prompts, for example clear instructions, examples, and good conventions, improve the quality and accuracy of model output. This optimization work is known as prompt engineering. For details, see [Prompt Engineering](/docs/82379/1221660).
<span id="39a7195c"></span>
## Tool calling
By integrating built-in tools or connecting to remote MCP servers, you can extend the model's capabilities to better answer questions or perform tasks. Currently supported:

* Built-in tools: web search, data retrieval, image processing, and more.
* Calling custom functions.
* Accessing third-party MCP services.

For details, see the [Tools Overview](/docs/82379/1827538).
<span id="8d0362b6"></span>
## Continuation mode
By prefilling part of the **assistant** message, you can guide the model to continue from an existing text fragment, or keep the model consistent in role-play scenarios.

* [Continuation Mode (Prefill Response)](/docs/82379/1359497): continuation with the [Chat API](https://www.volcengine.com/docs/82379/1494384).
* [Continuation Mode](/docs/82379/1958520#a1384090): continuation with the [Responses API](https://www.volcengine.com/docs/82379/1569618).
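With the Chat API, prefilling amounts to ending the message list with a partial **assistant** message that the model then continues (a sketch of the request shape; see the Prefill Response doc linked above for the exact parameters):

```JSON
...
"messages": [
    {"role": "user", "content": "Write a four-line poem about the sea."},
    {"role": "assistant", "content": "The tide rolls in,"}
]
...
```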
<span id="c22bed1a"></span>
## Structured output (beta)
Have the model output a standard, machine-processable format (primarily JSON) instead of natural language, making the result easier to process or display.

* [Structured Output (beta)](/docs/82379/1568221): structured output with the [Chat API](https://www.volcengine.com/docs/82379/1494384).
* [Structured Output (beta)](/docs/82379/1568221): structured output with the [Responses API](https://www.volcengine.com/docs/82379/1569618).
<span id="4f8038b1"></span>
## Batch inference
Ark provides batch inference. For large batch data-processing tasks, batch inference offers higher throughput at lower cost. For details, see [Batch Inference](/docs/82379/1399517).
<span id="3b458a44"></span>
## Error handling
Add error handling to help locate problems.
```mixin-react
return (<Tabs>
<Tabs.TabPane title="Python" key="ylMJa5FOjw"><RenderMd content={`\`\`\`Python
import os
# Install SDK: pip install 'volcengine-python-sdk[ark]'
from volcenginesdkarkruntime import Ark
from volcenginesdkarkruntime._exceptions import ArkAPIError

# Initialize the Ark client
client = Ark(
    # The base URL for model invocation
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    api_key=os.getenv('ARK_API_KEY'),
)

# Streaming
try:
    stream = client.chat.completions.create(
        # Replace with Model ID
        model="doubao-seed-1-6-251015",
        messages=[
            {"role": "system", "content": "You are an AI assistant"},
            {"role": "user", "content": "What are some common cruciferous plants?"},
        ],
        stream=True
    )
    for chunk in stream:
        if not chunk.choices:
            continue
        print(chunk.choices[0].delta.content, end="")
    print()
except ArkAPIError as e:
    print(e)
\`\`\`
`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="Go" key="eZMbcdcous"><RenderMd content={`\`\`\`Go
package main

import (
    "context"
    "errors"
    "fmt"
    "io"
    "os"

    "github.com/volcengine/volcengine-go-sdk/service/arkruntime"
    "github.com/volcengine/volcengine-go-sdk/service/arkruntime/model"
    "github.com/volcengine/volcengine-go-sdk/volcengine"
)

func main() {
    client := arkruntime.NewClientWithApiKey(
        os.Getenv("ARK_API_KEY"),
        // The base URL for model invocation
        arkruntime.WithBaseUrl("https://ark.cn-beijing.volces.com/api/v3"),
    )
    ctx := context.Background()

    fmt.Println("----- streaming request -----")
    req := model.CreateChatCompletionRequest{
        // Replace with Model ID
        Model: "doubao-seed-1-6-251015",
        Messages: []*model.ChatCompletionMessage{
            {
                Role: model.ChatMessageRoleSystem,
                Content: &model.ChatCompletionMessageContent{
                    StringValue: volcengine.String("You are an AI assistant"),
                },
            },
            {
                Role: model.ChatMessageRoleUser,
                Content: &model.ChatCompletionMessageContent{
                    StringValue: volcengine.String("What are some common cruciferous plants?"),
                },
            },
        },
    }
    stream, err := client.CreateChatCompletionStream(ctx, req)
    if err != nil {
        apiErr := &model.APIError{}
        if errors.As(err, &apiErr) {
            fmt.Printf("stream chat error: %v\\n", apiErr)
        }
        return
    }
    defer stream.Close()

    for {
        recv, err := stream.Recv()
        if err == io.EOF {
            return
        }
        if err != nil {
            apiErr := &model.APIError{}
            if errors.As(err, &apiErr) {
                fmt.Printf("stream chat error: %v\\n", apiErr)
            }
            return
        }

        if len(recv.Choices) > 0 {
            fmt.Print(recv.Choices[0].Delta.Content)
        }
    }
}
\`\`\`
`}></RenderMd></Tabs.TabPane>
<Tabs.TabPane title="Java" key="CZAaXNryKC"><RenderMd content={`\`\`\`java
package com.volcengine.ark.runtime;

import com.volcengine.ark.runtime.exception.ArkHttpException;
import com.volcengine.ark.runtime.model.completion.chat.ChatCompletionRequest;
import com.volcengine.ark.runtime.model.completion.chat.ChatMessage;
import com.volcengine.ark.runtime.model.completion.chat.ChatMessageRole;
import com.volcengine.ark.runtime.service.ArkService;
import java.util.ArrayList;
import java.util.List;

public class ChatCompletionsExample {
    public static void main(String[] args) {
        String apiKey = System.getenv("ARK_API_KEY");
        // The base URL for model invocation
        ArkService service = ArkService.builder().apiKey(apiKey).baseUrl("https://ark.cn-beijing.volces.com/api/v3").build();

        System.out.println("----- streaming request -----");
        final List<ChatMessage> streamMessages = new ArrayList<>();
        final ChatMessage streamSystemMessage = ChatMessage.builder().role(ChatMessageRole.SYSTEM).content("You are an AI assistant").build();
        final ChatMessage streamUserMessage = ChatMessage.builder().role(ChatMessageRole.USER).content("What are some common cruciferous plants?").build();
        streamMessages.add(streamSystemMessage);
        streamMessages.add(streamUserMessage);

        ChatCompletionRequest streamChatCompletionRequest = ChatCompletionRequest.builder()
                .model("doubao-seed-1-6-251015") // Replace with Model ID
                .messages(streamMessages)
                .build();

        try {
            service.streamChatCompletion(streamChatCompletionRequest)
                    .doOnError(Throwable::printStackTrace)
                    .blockingForEach(
                            chunk -> {
                                if (chunk.getChoices().size() > 0) {
                                    System.out.print(chunk.getChoices().get(0).getMessage().getContent());
                                }
                            }
                    );
        } catch (ArkHttpException e) {
            System.out.print(e.toString());
        }

        // shutdown service
        service.shutdownExecutor();
    }
}
\`\`\`
`}></RenderMd></Tabs.TabPane></Tabs>);
```
<span id="b411f06e"></span>
## Conversation encryption
Beyond the default network-layer encryption, Ark also offers free application-layer encryption for stronger protection of your inference session data, enabled with a single extra line of code. For complete sample code, see [Encrypting Data](/docs/82379/1544136#23274b89); for more on how it works, see [Application-Layer Encryption for Inference Session Data](/docs/82379/1389905).
<span id="ca2551d7"></span>
# Usage Notes

* Key model limits:
    * Maximum context window: the amount of content, in tokens, a single request can handle, including both user input and model output. Content beyond the context window is truncated and output stops. If truncation from this limit is a problem, switch to a model with a larger context window.
    * Maximum output length (max tokens): the maximum length of a single model response. If you hit it, see [Continuation Mode (Prefill Response)](/docs/82379/1359497) to stitch together the full content across several continued replies.
    * Tokens per minute (TPM): the per-minute token-processing limit, per account, for a given model (all versions combined). If the default TPM cannot support your business, request a quota increase via a [ticket](https://console.volcengine.com/workorder/create?step=2&SubProductID=P00001166). Example: if a model's TPM is 5,000,000, all endpoints for every version of that model created under one primary account share that quota.
    * Requests per minute (RPM): the per-minute request limit, per account, for a given model (all versions combined), analogous to TPM. If the default RPM cannot support your business, request a quota increase via a [ticket](https://console.volcengine.com/workorder/create?step=2&SubProductID=P00001166).
    * For detailed per-model specifications, see the [Model List](/docs/82379/1330310).
* Usage queries:
    * Token usage of a single request: check the **usage** object in the response.
    * Token usage of input/output content: estimate with the [Tokenization API](https://www.volcengine.com/docs/82379/1528728) or the [Token Calculator](https://console.volcengine.com/ark/region:ark+cn-beijing/tokenCalculator).
    * Token usage per account/project/endpoint: see the [Usage Statistics](https://console.volcengine.com/ark/region:ark+cn-beijing/usageTracking) page.
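The TPM and RPM quotas above jointly cap sustained throughput: whichever quota is exhausted first is the binding limit. A back-of-envelope sketch (the numbers are illustrative, not actual quotas):

```python
def max_requests_per_minute(tpm, rpm, avg_tokens_per_request):
    """Sustained request rate under both quotas; the tighter one binds."""
    return min(rpm, tpm // avg_tokens_per_request)

# e.g. TPM 5,000,000, RPM 30,000, ~2,000 tokens per request:
print(max_requests_per_minute(5_000_000, 30_000, 2_000))  # → 2500
```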
<span id="901dd971"></span>
# FAQ
[FAQ](/docs/82379/1359411) - [Online Inference](/docs/82379/1359411#aa45e6c0): common questions about online inference; if you hit an error, look here first for a solution.