diff --git a/API相关/离在线语音合成SDK概览.md b/API相关/离在线语音合成SDK概览.md new file mode 100644 index 0000000..feecf52 --- /dev/null +++ b/API相关/离在线语音合成SDK概览.md @@ -0,0 +1,45 @@ +本文档对语音合成SDK支持的能力进行说明。 + +* **SDK名称**:语音合成SDK +* **SDK开发者**:北京火山引擎科技有限公司 +* **主要功能**:语音合成SDK支持将文字实时合成语音,适用于实时语音播报的场景,如有声阅读、导航、语音助手等等。 + + +## SDK接入 + +| | | | \ +|平台/语言 |集成指南 |调用流程 | +|---|---|---| +| | | | \ +|Android |[集成指南](/docs/6561/79832) |[调用流程](/docs/6561/79834) | +| | | | \ +|iOS |[集成指南](/docs/6561/79835) |[调用流程](/docs/6561/79837) | + +**其他相关信息**: + +* [SDK版本信息](/docs/6561/79830) +* [SDK隐私政策](/docs/6561/116696) +* [开发者使用合规规范](/docs/6561/116711) + + +# 合成能力 +**在线合成**:云端合成,发起网络请求,边合成边播放(支持TTS的websocket接口,能够使用声音复刻音色以及TTS大小模型音色) +**离线合成**:本地离线引擎合成,需要相关资源文件,边合成边播放; + +# 合成策略 +离在线语音合成SDK,除了可以单独使用的在线合成及离线合成外,提供了在线合成发生网络超时后,切换离线合成的两种策略,用户可以通过配置建连超时和接收超时两个参数来控制切换的敏感程度。 + +* **在线优先**:优先发起在线合成,失败后(网络超时),启动离线合成引擎开始合成; +* **并发合成**:同时发起在线合成与离线合成,在线请求失败的情况下,使用离线合成数据,该模式下,可以配置更短的超时时间以提升效果,但会消耗更多系统性能; + + +# 合成场景 +语音合成SDK提供了两种合成场景,以满足不同的需求: + +* **普通场景**:又称单句场景,引擎每次启动,只合成、播放一句音频的模式。 +* **小说场景**:适用于听书业务,每次启动引擎后可以根据需求合成多句音频。 + + +# 合成效果 +通过对发音人、音调、音量和语速等参数的调整,可以获得不同的发声效果,更好满足您业务场景中的播报需求。 + diff --git a/API相关/豆包大模型-故事生成.md b/API相关/豆包大模型-故事生成.md new file mode 100644 index 0000000..d3f92ea --- /dev/null +++ b/API相关/豆包大模型-故事生成.md @@ -0,0 +1,1039 @@ +数分钟内完成你的首次 API 调用。 + + + + + + + +**体验中心** +“0”代码,交互式体验模型能力 + + + + + + + + + + + + +**业务迁移** +兼容OpenAI API,快速迁移业务至方舟 + + + + + + + + + + +# 1 获取并配置 API Key + +1. 获取 API Key:访问[API Key 管理](https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey) ,创建你的 API Key。 +2. 
配置环境变量:在终端中运行下面命令,配置 API Key 到环境变量。 +> 配置持久化环境变量方法参见 [环境变量配置指南](/docs/82379/1820161)。 + + +```mixin-react +return ( + + + +); +``` + + + +# 2 开通模型服务 +访问 [开通管理页面](https://console.volcengine.com/ark/region:ark+cn-beijing/openManagement) 开通模型服务。 + +# 3 安装 SDK +安装官方或三方 SDK。 + +```mixin-react +return ( + 运行环境中需安装 [Python](https://www.python.org/downloads/) 版本 3.7 或以上。 + +* 安装方舟 SDK: + \`\`\`Bash + pip install 'volcengine-python-sdk[ark]' + \`\`\` + +* 安装 OpenAI SDK: + \`\`\`Bash + pip install openai + \`\`\` + +`}> + 环境中安装 [Go](https://golang.google.cn/doc/install) 版本 1.18 或以上。 + +在代码中通过下方方法引入 Go SDK +\`\`\`Go +import ( + "github.com/volcengine/volcengine-go-sdk" +) +\`\`\` + +`}> + 环境中安装 [Java](https://www.java.com/en/download/help/index_installing.html) 版本 1.8 或以上。 + +在项目的\`pom.xml\`文件中添加以下依赖配置。 +\`\`\`XML + + com.volcengine + volcengine-java-sdk-ark-runtime + LATEST + +\`\`\` + +`}>); +``` + + +# 4 发起 API 请求 + +## 文本生成 +传入文本类信息给模型,进行问答、分析、改写、摘要、编程、翻译等任务,并返回文本结果。 + +```mixin-react +return ( + + + + +); +``` + + +* [文本生成](/docs/82379/1399009):文本生成使用指南。 +* [深度思考](/docs/82379/1956279):深度思考能力使用指南。 +* [迁移至 Responses API](/docs/82379/1585128):新用户推荐,更简洁的上下文管理能力、强大的工具调用能力。 +* [Chat API](https://www.volcengine.com/docs/82379/1494384):存量业务迭代推荐,广泛使用的 API。 + + +## 多模态理解 +传入图片、视频、PDF文件给模型,进行分析、内容审核、问答、视觉定位等基于多模态理解相关任务,并返回文本结果。 + + +|输入 |输出预览 | +|---|---| +|![图片](https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/a31c2edfbe844461a43f5e8f74fbcce4~tplv-goo7wpa0wc-image.image =275x) |* 思考:用户现在需要找支持输入图片的模型系列,看表格里的输入列中的图像列,哪个模型对应的图像输入是√。看表格,Doubao\-1.5\-vision那一行的输入图像列是√,其他两个Doubao\-1.5\-pro和lite的输入图像都是×,所以答案是Doubao\-1.5\-vision。|\ +|> 支持输入图片的模型系列是哪个? 
|* 回答:支持输入图片的模型系列是Doubao\-1.5\-vision | + + +```mixin-react +return ( + + + + +); +``` + + +* [多模态理解](/docs/82379/1958521):多模态理解详细使用指南。 +* [视觉定位 Grounding](/docs/82379/1616136):图片中找到对应目标并返回坐标任务。 +* [GUI 任务处理](/docs/82379/1584296):在计算机/移动设备中完成自动化任务。 +* [文件输入(File API)](/docs/82379/1885708):传入图片、视频、文档接口。 + + +## 图片生成 +传入图片、文字给模型,进行:广告、海报、组图等图片生成;增改元素、颜色更换等图片编辑;油墨、水墨等风格切换。 + + +|提示词 |输出预览 | +|---|---| +|充满活力的特写编辑肖像,模特眼神犀利,头戴雕塑感帽子,色彩拼接丰富,眼部焦点锐利,景深较浅,具有Vogue杂志封面的美学风格,采用中画幅拍摄,工作室灯光效果强烈。 |![图片](https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/00fb66006eb84b16965b620b6e1f2d78~tplv-goo7wpa0wc-image.image =275x) | + + +```mixin-react +return ( + + + + +); +``` + + +* [Seedream 4.0-4.5 教程](/docs/82379/1824121):主流生图模型能力以及如何通过 API 调用。 +* [Seedream 4.0-4.5 提示词指南](/docs/82379/1829186):使用生图模型时,如何编写提示词。 + + +## 视频生成 +通过文本描述、图像素材,快速生成高质量、风格多样的视频内容。 + + +|提示词 |输出画面预览 | +|---|---| +|一位身穿绿色亮片礼服的女性站在粉红色背景前,周围飘落着五彩斑斓的彩纸 |![图片](https://p9-arcosite.byteimg.com/tos-cn-i-goo7wpa0wc/aae3d0c636954bdd9e66e7a23e98c480~tplv-goo7wpa0wc-image.image =275x) | + + +```mixin-react +return ( + + contents = new ArrayList<>(); + contents.add(Content.builder() + .type("text") + .text("一位身穿绿色亮片礼服的女性站在粉红色背景前,周围飘落着五彩斑斓的彩纸 --wm true --dur 5") + .build()); + + // Create a video generation task + CreateContentGenerationTaskRequest createRequest = CreateContentGenerationTaskRequest.builder() + .model("doubao-seedance-1-0-pro-250528") // Replace with Model ID + .content(contents) + .build(); + + CreateContentGenerationTaskResult createResult = service.createContentGenerationTask(createRequest); + System.out.println(createResult); + + // Get the details of the task + String taskId = createResult.getId(); + GetContentGenerationTaskRequest getRequest = GetContentGenerationTaskRequest.builder() + .taskId(taskId) + .build(); + + System.out.println("----- polling task status -----"); + while (true) { + try { + GetContentGenerationTaskResponse getResponse = service.getContentGenerationTask(getRequest); + String status 
 = getResponse.getStatus(); + if ("succeeded".equalsIgnoreCase(status)) { + System.out.println("----- task succeeded -----"); + System.out.println(getResponse); + service.shutdownExecutor(); + break; + } else if ("failed".equalsIgnoreCase(status)) { + System.out.println("----- task failed -----"); + System.out.println("Error: " + getResponse.getStatus()); + service.shutdownExecutor(); + break; + } else { + System.out.printf("Current status: %s, Retrying in 3 seconds...\\n", status); + TimeUnit.SECONDS.sleep(3); + } + } catch (InterruptedException ie) { + Thread.currentThread().interrupt(); + System.err.println("Polling interrupted"); + service.shutdownExecutor(); + break; + } + } + } +} +\`\`\` + +`}> +); +``` + + +* [视频生成](/docs/82379/1366799):学习如何使用模型的视频生成能力,包括文本生成视频、首尾帧生视频、首帧生成视频等。 +* [Seedance-1.0-pro&pro-fast 提示词指南](/docs/82379/1631633):使用生视频模型时,如何编写提示词。 + + +## 工具使用 +通过工具/插件让模型具备读取外部数据及调用函数的能力,包括: + +* 内置工具:联网搜索、图片处理、知识库检索等已集成至方舟平台的工具。 +* 三方工具:兼容MCP 的三方工具。 +* 自定义工具:您自行定义及开发的工具。 + + +```mixin-react +return ( + + + buildTools() { + ToolWebSearch t = ToolWebSearch.builder().build(); + System.out.println(Arrays.asList(t)); + return Arrays.asList(t); + } + + public static void main(String[] args) throws JsonProcessingException { + String apiKey = System.getenv("ARK_API_KEY"); + + ArkService arkService = ArkService.builder().apiKey(apiKey).baseUrl("https://ark.cn-beijing.volces.com/api/v3").build(); + CreateResponsesRequest req = CreateResponsesRequest.builder() + .model("doubao-seed-1-6-251015") + .input(ResponsesInput.builder().addListItem( + ItemEasyMessage.builder().role(ResponsesConstants.MESSAGE_ROLE_USER).content( + MessageContent.builder() + .addListItem(InputContentItemText.builder().text("What's the weather like in Beijing?").build()) + .build() + ).build() + ).build()) + .tools(buildTools()) + .build(); + ResponseObject resp = arkService.createResponse(req); + System.out.println(resp); + + arkService.shutdownExecutor(); + } +} +\`\`\` + +`}> + +); +``` + + 
+* [工具调用](/docs/82379/1958524):学习如何让模型使用内置工具,如网页搜索、知识库检索、豆包助手等能力。 +* [函数调用 Function Calling](/docs/82379/1262342):学习如何让模型调用自定义的工具。 +* [云部署 MCP / Remote MCP](/docs/82379/1827534):学习如何让模型使用 MCP 服务。 + + +# 5 下一步 +现在你已经完成了首次方舟模型服务的 API 调用,你可以探索模型的更多能力,包括: + +* [平台能力速览](/docs/82379/1108216):探索方舟平台提供的提示词优化、权限管理、模型管理等高阶能力。 +* [模型列表](/docs/82379/1330310):快速浏览方舟提供的模型全集以及各个模型所具备的能力,快速根据你的实际场景匹配到合适的模型。 + + + diff --git a/API相关/豆包大模型-文本生成.md b/API相关/豆包大模型-文本生成.md new file mode 100644 index 0000000..15eb5d4 --- /dev/null +++ b/API相关/豆包大模型-文本生成.md @@ -0,0 +1,551 @@ +根据输入的提示词生成连贯清晰的文本内容。 +:::tip +方舟平台的新用户?获取 API Key 及 开通模型等准备工作,请参见 [快速入门](/docs/82379/1399008)。 +::: + +# 快速开始 + + +|输入 |输出预览 | +|---|---| +|```Plain|```Plain|\ +|请将下面内容进行结构化处理:火山方舟是火山引擎推出的大模型服务平台,提供模型训练、推理、评测、精调等全方位功能与服务,并重点支撑大模型生态。 火山方舟通过稳定可靠的安全互信方案,保障模型提供方的模型安全与模型使用者的信息安全,加速大模型能力渗透到千行百业,助力模型提供方和使用者实现商业新增长。|# 火山方舟大模型服务平台结构化信息|\ +|```||\ +| |## 一、基本属性|\ +| |1. **推出主体**:火山引擎|\ +| |2. **平台定位**:大模型服务平台|\ +| ||\ +| |## 二、核心功能与服务|\ +| |提供模型训练、推理、评测、精调等全方位大模型相关功能与服务,核心定位包括:|\ +| |1. 支撑大模型生态建设|\ +| |2. 推动大模型能力向千行百业渗透|\ +| ||\ +| |## 三、安全保障方案|\ +| |通过稳定可靠的安全互信方案,双向保障信息安全:|\ +| |1. 模型提供方:保障其模型安全|\ +| |2. 
模型使用者:保障其信息安全|\ +| ||\ +| |## 四、价值目标|\ +| |助力模型提供方和使用者实现商业新增长,构建大模型生态下的商业共赢模式。|\ +| |```|\ +| | | + + +```mixin-react +return ( + + + + messages = new ArrayList<>(); + final ChatMessage userMessage = ChatMessage.builder().role(ChatMessageRole.USER).content("请将下面内容进行结构化处理:火山方舟是火山引擎推出的大模型服务平台,提供模型训练、推理、评测、精调等全方位功能与服务,并重点支撑大模型生态。 火山方舟通过稳定可靠的安全互信方案,保障模型提供方的模型安全与模型使用者的信息安全,加速大模型能力渗透到千行百业,助力模型提供方和使用者实现商业新增长。").build(); + messages.add(userMessage); + + ChatCompletionRequest chatCompletionRequest = ChatCompletionRequest.builder() + .model("doubao-seed-1-6-251015")//Replace with Model ID + .messages(messages) + // .thinking(new ChatCompletionRequest.ChatCompletionRequestThinking("disabled")) // Manually disable deep thinking + .build(); + service.createChatCompletion(chatCompletionRequest).getChoices().forEach(choice -> System.out.println(choice.getMessage().getContent())); + // shutdown service + service.shutdownExecutor(); + } +} +\`\`\` + +`}>); +``` + +:::tip +使用 Responses API 实现单轮对话的示例,请参见[快速开始](/docs/82379/1958520#17377051)。 +::: + +# 模型与API +支持的模型:[文本生成能力](/docs/82379/1330310#b318deb2) +支持的API : + +* [Responses API](https://www.volcengine.com/docs/82379/1569618):新推出的 API,简洁上下文管理,增强工具调用能力,缓存能力降低成本,新业务及用户推荐。 +* [Chat API](https://www.volcengine.com/docs/82379/1494384):使用广泛的 API,存量业务迁移成本低。 + + +# 使用示例 + +## 多轮对话 +实现多轮对话,需将包含系统消息、模型消息和用户消息的对话历史组合成一个列表,以便模型理解上下文,并延续之前的话题进行问答。 + + +|传入方式 |手动管理上下文 |通过ID管理上下文 | +|---|---|---| +|使用示例 |```JSON|```JSON|\ +| |...|...|\ +| | "model": "doubao-seed-1-6-251015",| "model": "doubao-seed-1-6-251015",|\ +| | "messages":[| "previous_response_id":"",|\ +| | {"role": "user", "content": "Hi, tell a joke."},| "input": "What is the punchline of this joke?"|\ +| | {"role": "assistant", "content": "Why did the math book look sad? Because it had too many problems! 
😄"},|...|\ +| | {"role": "user", "content": "What's the punchline of this joke?"}|```|\ +| | ]| |\ +| |...| |\ +| |```| |\ +| | | | +|API |[Chat API](https://www.volcengine.com/docs/82379/1494384) |[Responses API](https://www.volcengine.com/docs/82379/1569618) | + +> 更多说明及完整示例请参见 [上下文管理](/docs/82379/2123288)。 + + +## 流式输出 + + +|预览 |优势 | +|---|---| +||* **改善等待体验**:无需等待完整内容生成完毕,可立即处理过程内容。|\ +| |* **实时过程反馈**:多轮交互场景,实时了解任务当前的处理阶段。|\ +| |* **更高的容错性**:中途出错,也能获取到已生成内容,避免非流式输出失败无返回的情况。|\ +| |* **简化超时管理**:保持客户端与服务端的连接状态,避免复杂任务耗时过长而连接超时。 | + +通过配置 **stream** 为 `true`,来启用流式输出。 +```JSON +... + "model": "doubao-seed-1-6-251015", + "messages": [ + {"role": "user", "content": "深度思考模型与非深度思考模型区别"} + ], + "stream": true + ... +``` + +> 完整示例及更多说明请参见 [流式输出](/docs/82379/2123275)。 + + +## 设置最大回答 +需要控制成本或缩短响应时间时,可通过限制模型回答长度实现;当回答篇幅较长(如翻译长文本)时,为避免中途截断,可将`max_tokens`设置为更大的值。 +```JSON +... + "model": "doubao-seed-1-6-251015", + "messages": [ + {"role": "user","content": "What are some common cruciferous plants?"} + ], + "max_tokens": 300 +... 
+``` + +> 完整示例代码,请参见 [控制回答长度](/docs/82379/2123288#c7fbdbe3)。 + + +## 异步输出 +当任务较为复杂或者多个任务并发等场景下,可使用 Asyncio 接口实现并发调用,提高程序的效率,优化体验。 + +* Chat API 代码示例: + + +```mixin-react +return ( + None: + stream = await client.chat.completions.create( + # Replace with Model ID + model = "doubao-seed-1-6-251015", + messages=[ + {"role": "system", "content": "你是 AI 人工智能助手"}, + {"role": "user", "content": "常见的十字花科植物有哪些?"}, + ], + stream=True + ) + async for completion in stream: + print(completion.choices[0].delta.content, end="") + print() + +if __name__ == "__main__": + asyncio.run(main()) +\`\`\` + +`}>); +``` + + +* Responses API 代码示例: + + +```mixin-react +return ( +); +``` + + +# 更多使用 + +## 深度思考 +模型在输出回答前,先对输入问题进行系统性分析与逻辑拆解,再基于拆解结果生成回答。 +可以显著提升回复质量,但会增加 token 消耗,详细信息请参见[深度思考](/docs/82379/1449737)。 + +## 提示词工程 +正确设计和编写提示词,如提供说明、示例、好的规范等方法可提高模型输出的质量和准确性。进行提示词优化的工作也被称为提示词工程(Prompt Engineering)。详细信息请参见[提示词工程](/docs/82379/1221660)。 + +## 工具调用 +通过集成内置工具或连接远程 MCP 服务器,您可以扩展模型的功能,以便更好回答问题或执行任务。当前支持: + +* 内置工具:搜索网络、检索数据、图片处理等。 +* 调用自定义函数。 +* 访问三方MCP服务。 + +详细信息请参见[工具概述](/docs/82379/1827538)。 + +## 续写模式 +通过预填(Prefill)部分 **assistant** 角色的内容,引导和控制模型从已有的文本片段继续输出,以及控制模型在角色扮演场景中保持一致性。 + +* [续写模式 Prefill Response](/docs/82379/1359497):使用[Chat API](https://www.volcengine.com/docs/82379/1494384)实现续写模式。 +* [续写模式](/docs/82379/1958520#a1384090):使用[Responses API](https://www.volcengine.com/docs/82379/1569618)实现续写模式。 + + +## 结构化输出(beta) +控制模型输出程序可处理的标准格式(主要是 JSON)而非自然语言,方便标准化处理或展示。 + +* [结构化输出(beta)](/docs/82379/1568221):使用[Chat API](https://www.volcengine.com/docs/82379/1494384)实现结构化输出。 +* [结构化输出(beta)](/docs/82379/1568221):使用[Responses API](https://www.volcengine.com/docs/82379/1569618)实现结构化输出。 + + +## 批量推理 +方舟为您提供批量推理的能力,当您有大批量数据处理任务,可使用批量推理能力,以获得更大吞吐量和更低的成本。详细介绍和使用,请参见 [批量推理](/docs/82379/1399517)。 + +## 异常处理 +增加异常处理,帮助定位问题。 + +```mixin-react +return ( + + 0 { + fmt.Print(recv.Choices[0].Delta.Content) + } + } +} +\`\`\` + +`}> + streamMessages = new ArrayList<>(); + final ChatMessage 
 streamSystemMessage = ChatMessage.builder().role(ChatMessageRole.SYSTEM).content("你是 AI 人工智能助手").build(); + final ChatMessage streamUserMessage = ChatMessage.builder().role(ChatMessageRole.USER).content("常见的十字花科植物有哪些?").build(); + streamMessages.add(streamSystemMessage); + streamMessages.add(streamUserMessage); + + ChatCompletionRequest streamChatCompletionRequest = ChatCompletionRequest.builder() + .model("doubao-seed-1-6-251015")//Replace with Model ID + .messages(streamMessages) + .build(); + + try { + service.streamChatCompletion(streamChatCompletionRequest) + .doOnError(Throwable::printStackTrace) + .blockingForEach( + choice -> { + if (choice.getChoices().size() > 0) { + System.out.print(choice.getChoices().get(0).getMessage().getContent()); + } + } + ); + } catch (ArkHttpException e) { + System.out.print(e.toString()); + } + + // shutdown service + service.shutdownExecutor(); + } + +} +\`\`\` + +`}>); +``` + + +## 对话加密 +除了默认的网络层加密,火山方舟还提供免费的应用层加密功能,为您的推理会话数据提供更强的安全保护。您只需增加一行代码即可启用。完整示例代码请参见 [加密数据](/docs/82379/1544136#23274b89);更多原理信息,请参见[推理会话数据应用层加密方案](/docs/82379/1389905)。 + +# 使用说明 + +* 模型关键限制: + * 最大上下文长度(Context Window):即单次请求模型能处理的内容长度,包括用户输入和模型输出,单位 token 。内容超出最大上下文长度时,会被截断并停止输出。如碰到上下文限制导致的内容截断,可选择支持更大上下文长度规格的模型。 + * 最大输出长度(Max Tokens):即单次模型输出内容的最大长度。如输出因达到该上限而被截断,可参考[续写模式 Prefill Response](/docs/82379/1359497),通过多次续写回复,拼接出完整内容。 + * 每分钟处理内容量(TPM):即账号下同模型(不区分版本)每分钟能处理的内容量限制,单位 token。如默认 TPM 限制无法满足您的业务,可通过[工单](https://console.volcengine.com/workorder/create?step=2&SubProductID=P00001166)联系售后提升配额。举例:某模型的 TPM 为 500 万,一个主账号下创建的该模型的所有版本接入点共享此配额。 + * 每分钟处理请求数(RPM):即账号下同模型(不区分版本)每分钟能处理的请求数上限,与上面 TPM 类似。如默认 RPM 限制无法满足您的业务,可通过[工单](https://console.volcengine.com/workorder/create?step=2&SubProductID=P00001166)联系售后提升配额。 + * 各模型详细的规格信息,请参见 [模型列表](/docs/82379/1330310)。 +* 用量查询: + * 对于某次请求 token 用量:可在返回的 **usage** 结构体中查看。 + * 输入/输出内容的 token 用量:可使用 [Tokenization API](https://www.volcengine.com/docs/82379/1528728) 或 [Token 
计算器](https://console.volcengine.com/ark/region:ark+cn-beijing/tokenCalculator)来估算。 + * 账号/项目/接入点维度 token 用量:可在 [用量统计](https://console.volcengine.com/ark/region:ark+cn-beijing/usageTracking) 页面查看。 + + +# 常见问题 +[常见问题](/docs/82379/1359411)\-[在线推理](/docs/82379/1359411#aa45e6c0):在线推理的常见问题,如遇到错误,可尝试在这里找解决方案。 + + diff --git a/API相关/豆包精品长文本语音合成-API接口文档.md b/API相关/豆包精品长文本语音合成-API接口文档.md new file mode 100644 index 0000000..5f2d9fd --- /dev/null +++ b/API相关/豆包精品长文本语音合成-API接口文档.md @@ -0,0 +1,315 @@ + +# 接口说明 +精品长文本语音合成为异步合成服务,提供“创建合成任务”和“查询合成结果”两个接口,也可通过http回调获取合成结果。 +请确认是否可满足业务需求再进行接入,本产品适用于需要批量合成较长文本,且对返回时效性无强需求的场景,单次可支持10万字符以内文本,异步返回音频。对于输入的文本请求,会进入集群排队处理,返回时长会受集群负载影响波动,通常返回时间会在数十分钟,最长返回时延3小时以内。如出现长时间未返回情况,如无报错,请耐心等待。 +长文本合成分为“普通版”和“情感预测版”,两者需要开通不同的服务,接口地址不同,支持的音色列表也不相同,请仔细阅读文档。 +:::warning +创建合成任务的频率限制为10 QPS,请勿一次性提交过多任务。 +本产品不适合对于时效性有强需求的场景,如有需求建议接入语音合成(短文本)接口。 +::: + +# 鉴权 +请求接口时,需要携带`Resource-Id`和`Authorization`两个header,缺一不可。 +> 参考文档:[鉴权方法](/docs/6561/1105162) + + +# 创建合成任务 + +## 请求参数 + +| | | \ +|服务类型 |接口地址 | +|---|---| +| | | \ +|普通版 |https://openspeech.bytedance.com/api/v1/tts_async/submit | +| | | \ +|情感预测版 |https://openspeech.bytedance.com/api/v1/tts_async_with_emotion/submit | + +**请求方式:`POST`** +**Content-Type:** `application/json` +**请求参数说明:** + +| | | | | \ +|参数名称 |参数类型 |是否必需 |描述 | +|---|---|---|---| +| | | | | \ +|appid |string |Y |Appid从控制台获取 | +| | | | | \ +|reqid |string |Y |Request ID,不可重复,长度20~64,建议使用uuid | +| | | | | \ +|text |string |Y |合成文本,长度小于10万字符,支持SSML。SSML需要以开头和结束,且全文只出现一组标签,支持的SSML标签可参考[SSML标记语言](/docs/6561/104897) | +| | | | | \ +|format |string |Y |输出音频格式,支持pcm/wav/mp3/ogg_opus | +| | | | | \ +|voice_type |string |Y |音色voice_type,见[音色列表](/docs/6561/1108211) | +| | | | | \ +|voice |string |N |音色voice,情感预测版voice为空时,使用预测结果;voice不为空时,使用指定的voice;其余情况使用默认voice | +| | | | | \ +|language |string |N |语种,与音色有关,具体值参考[音色列表](/docs/6561/1108211),默认为中文 | +| | | | | \ +|sample_rate |int |N |采样率,默认为24000 | +| | | | | \ +|volume |float |N |音量,范围0.1~3,默认为1 | +| 
| | | | \ +|speed |float |N |语速,范围0.2~3,默认为1 | +| | | | | \ +|pitch |float |N |语调,范围0.1~3,默认为1 | +| | | | | \ +|enable_subtitle |int |N |是否开启字幕时间戳,0表示不开启,1表示开启**句级别**字幕时间戳,2表示开启**字词级别**时间戳,3表示开启**音素级别**时间戳 | +| | | | | \ +|sentence_interval |int |N |句间停顿,单位毫秒,范围0~3000,默认为预测值 | +| | | | | \ +|style |string |N |指定情感,“情感预测版”默认为预测值,“普通版”默认为音色默认值,音色支持的情感见[音色列表](/docs/6561/1108211) | +| | | | | \ +|callback_url |string |N |回调返回地址建议使用域名方式; | + +:::warning +在 “情感预测版”接口中使用不支持多情感的音色,将会合成失败。是否支持多情感见[音色列表](/docs/6561/1108211) +::: +**请求参数示例:** +```json +{ + "appid": "123456", + "text": "火山引擎异步长文本合成。", + "format": "mp3", + "voice_type": "BV701_streaming", + "sample_rate": 24000, + "volume": 1.2, + "speed": 0.9, + "pitch": 1.1, + "enable_subtitle": 1, + "callback_url": "http://x.y.z/callback" +} +``` + + +## 返回结果 +**返回结果示例:** +请求成功: +```json +{ + "task_id": "bd0c2171-4b38-4c05-b685-11f3d240ee8d", + "task_status": 0, + "text_length": 12 +} +``` + +请求失败: +```json +{ + "reqid": "e8f41275-72a3-45b5-af3c-61047f406cac", + "code": 40000, + "message": "请求参数错误:text不能为空" +} +``` + +**返回参数说明:** + +| | | | \ +|参数名称 |类型 |描述 | +|---|---|---| +| | | | \ +|task_id |string |任务ID,**注意保存,用于查询合成结果** | +| | | | \ +|task_status |int |任务状态,0-合成中,1-合成成功,2-合成失败 | +| | | | \ +|text_length |int |合成需要消耗的字符数,含标点符号 | +| | | | \ +|code |int |错误码,参考[错误码说明](/docs/6561/1096680#错误码说明) | +| | | | \ +|message |string |错误信息 | + + +# 查询合成结果 + +## 请求参数 + +| | | \ +|服务类型 |接口地址 | +|---|---| +| | | \ +|普通版 |https://openspeech.bytedance.com/api/v1/tts_async/query | +| | | \ +|情感预测版 |https://openspeech.bytedance.com/api/v1/tts_async_with_emotion/query | + +**请求方式:`GET`** +**请求参数说明:** + +| | | | | \ +|参数名称 |参数类型 |是否必需 |描述 | +|---|---|---|---| +| | | | | \ +|appid |string |Y |Appid从控制台获取 | +| | | | | \ +|task_id |string |Y |创建合成任务时返回的task_id | + +**请求参数示例:** +```GET +https://openspeech.bytedance.com/api/v1/tts_async/query?appid=123456&task_id=bd0c2171-4b38-4c05-b685-11f3d240ee8d +``` + + +## 返回结果 +**返回结果示例:** +请求成功: +```json 
+{ + "task_id": "bd0c2171-4b38-4c05-b685-11f3d240ee8d", + "task_status": 1, + "text_length": 12, + "audio_url": "https://lf9-lab-speech-tt-sign.bytetos.com/tos-cn-o-14155/aef41ebf89124edba16d4e97e455e007?x-expires=1687778318&x-signature=SJub692wmwsxboJTgl2VX55tIzY%3D", + "url_expire_time": 1687777943, + "sentences": [ + { + "text": "火山引擎异步长文本合成。", + "origin_text": "火山引擎异步长文本合成。", + "paragraph_no": 1, + "begin_time": 0, + "end_time": 4211, + "emotion": "neutral", + "words": [ + { + "text": "火", + "begin": 25, + "end": 235, + "phonemes": [ + { "ph": "C0h", "begin": 25, "end": 130 }, + { "ph": "C0uo", "begin": 130, "end": 235 } + ] + }, + { + "text": "山", + "begin": 235, + "end": 495, + "phonemes": [ + { "ph": "C0sh", "begin": 235, "end": 345 }, + { "ph": "C0an", "begin": 345, "end": 495 } + ] + }, + ... + ] + } + ] +} +``` + +请求失败: +```json +{ + "reqid": "bd0c2171-4b38-4c05-b685-11f3d240ee8d", + "code": 40001, + "message": "没有可以合成的有效字符" +} +``` + +**返回参数说明:** + +| | | | \ +|参数名称 |类型 |描述 | +|---|---|---| +| | | | \ +|task_id |string |任务ID | +| | | | \ +|task_status |int |任务状态,0-合成中,1-合成成功,2-合成失败 | +| | | | \ +|text_length |int |合成消耗的字符数,含标点符号 | +| | | | \ +|audio_url |string |音频URL,**有效期为1个小时,请及时下载** | +| | | | \ +|url_expire_time |int |音频URL过期时间(UNIX时间戳) | +| | | | \ +|sentences |List |分句信息,enable_subtitle≥1才会返回 | +| | | | \ +|sentences.text |string |实际合成的文本,会过滤掉一些符号、表情和无法合成的字符 | +| | | | \ +|sentences.origin_text |string |原文分句,所有句子拼起来与输入文本完全一致 | +| | | | \ +|sentences.paragraph_no |int |分句所属段落,以换行符\n或

 划分段落 | +| | | | \ +|sentences.begin_time |int |分句开始时间,单位:毫秒 | +| | | | \ +|sentences.end_time |int |分句结束时间,单位:毫秒 | +| | | | \ +|sentences.emotion |string |分句情感,“情感预测版”才会返回 | +| | | | \ +|sentences.words |List |字词信息,enable_subtitle≥2才会返回 | +| | | | \ +|sentences.words.text |string |字词文本 | +| | | | \ +|sentences.words.begin |int |字词开始时间,单位:毫秒 | +| | | | \ +|sentences.words.end |int |字词结束时间,单位:毫秒 | +| | | | \ +|sentences.words.phonemes |List |音素信息,enable_subtitle=3才会返回 | +| | | | \ +|sentences.words.phonemes.ph |string |音素 | +| | | | \ +|sentences.words.phonemes.begin |int |音素开始时间,单位:毫秒 | +| | | | \ +|sentences.words.phonemes.end |int |音素结束时间,单位:毫秒 | + +:::warning +1. 合成结果保留7天,7天内都可以通过该接口查询合成结果,过期后自动删除。 +2. 下载URL有效期为1小时,请勿直接保存audio_url,应及时下载音频或转存至你的云存储中。 +3. audio_url过期后(状态码401或403),可重新请求查询接口获取新的URL。 +::: + +# 错误码说明 + +| | | | \ +|错误码 |错误码描述 |解决办法 | +|---|---|---| +| | | | \ +|40000 |请求参数错误 |根据返回的message检查请求参数 | +| | | | \ +|40001 |没有可以合成的有效字符 |检查请求参数中的text | +| | | | \ +|40002 |该音色不支持多情感 |可用音色见[音色列表](/docs/6561/1108211) ,或使用“普通版”合成 | +| | | | \ +|40300 |试用额度不足 |开通正式版服务 | +| | | | \ +|40400 |任务不存在或已过期 |检查task_id是否正确 | +| | | | \ +|50000 |服务器错误 |建议先重试,重试无效请联系客服 | +| | | | \ +|50001 |合成失败 |建议先重试,重试无效请联系客服 | +| | | | \ +|50002 |生成下载URL失败 |建议先重试,重试无效请联系客服 | + + +# 结果回调 +如果“创建合成任务”时传入了**callback_url**,服务器将会在合成成功/失败时,以接口回调的方式通知用户。 +**请求方式:`POST`** +**Content-Type:** `application/json` +**回调参数示例:** +合成成功: +```json +{ + "code": 0, + "message": "Success", + "task_id": "bd0c2171-4b38-4c05-b685-11f3d240ee8d", + "task_status": 1, + "text_length": 12, + "audio_url": "https://lf9-lab-speech-tt-sign.bytetos.com/tos-cn-o-14155/aef41ebf89124edba16d4e97e455e007?x-expires=1687778318&x-signature=SJub692wmwsxboJTgl2VX55tIzY%3D", + "url_expire_time": 1687777943, + "sentences": [ + ... 
+ ] +} +``` + +合成失败: +```json +{ + "code": 40001, + "message": "没有可以合成的有效字符", + "task_id": "bd0c2171-4b38-4c05-b685-11f3d240ee8d", + "task_status": 2, + "text_length": 12 +} +``` + +:::warning +不保证回调成功,建议在提交任务一定时间后(如3个小时)仍未收到回调,则主动请求“查询合成结果”接口。 +::: + diff --git a/API相关/豆包语音模型-音色列表.md b/API相关/豆包语音模型-音色列表.md new file mode 100644 index 0000000..14f8333 --- /dev/null +++ b/API相关/豆包语音模型-音色列表.md @@ -0,0 +1,47 @@ +:::warning +精品长文本合成包含两种方案,分别为“**普通版(不支持情感预测)**”和“**情感预测版**” +::: + +# **情感预测版**-音色列表 + +* 多情感配置信息请详见:[音色列表--豆包语音-火山引擎](https://www.volcengine.com/docs/6561/97465) + + +| | | \ +|推荐音色 |voice_type | +|---|---| +| | | \ +|擎苍 |BV701_streaming | +| | | \ +|阳光青年 |BV123_streaming | +| | | \ +|反卷青年 |BV120_streaming | +| | | \ +|通用赘婿 |BV119_streaming | +| | | \ +|古风少御 |BV115_streaming | +| | | \ +|霸气青叔 |BV107_streaming | +| | | \ +|质朴青年 |BV100_streaming | +| | | \ +|温柔淑女 |BV104_streaming | +| | | \ +|开朗青年 |BV004_streaming | +| | | \ +|甜宠少御 |BV113_streaming | +| | | \ +|儒雅青年 |BV102_streaming | + + +# **普通版(不支持情感预测)**-音色列表 + +* 普通版音色与语音合成中的**音色一致**,音色信息请详见:[音色列表--豆包语音-火山引擎](https://www.volcengine.com/docs/6561/97465) + + +# FAQ +**Q1:精品长文本语音合成产品支持哪些情感预测** +可以自动区分旁白和对话。其中,对话可以支持七大情感:开心、悲伤、愤怒、害怕、厌恶、惊讶、平和 +**Q2:精品长文本语音合成产品是否可以支持ssml标签** +精品长文本语音支持ssml标签 + diff --git a/FLUTTER_WEB_DEV_GUIDE.md b/FLUTTER_WEB_DEV_GUIDE.md new file mode 100644 index 0000000..a24754e --- /dev/null +++ b/FLUTTER_WEB_DEV_GUIDE.md @@ -0,0 +1,130 @@ +# Flutter Web 本地调试启动指南 + +> 本文档供 AI 编码助手阅读,用于在本项目中正确启动 Flutter Web 调试环境。 + +## 项目结构 + +- Flutter 应用目录:`airhub_app/` +- 后端服务入口:`server.py`(根目录,FastAPI + Uvicorn,端口 3000) +- 前端端口:`8080` + +## 环境要求 + +- Flutter SDK(3.x) +- Python 3.x(后端服务) +- PowerShell(Windows 环境) + +## 操作系统 + +Windows(所有命令均为 PowerShell 语法) + +--- + +## 启动流程(严格按顺序执行) + +### 1. 
杀掉旧进程并确认端口空闲 + +```powershell +# 杀掉占用 8080 和 3000 的旧进程 +Get-NetTCPConnection -LocalPort 8080 -ErrorAction SilentlyContinue | ForEach-Object { taskkill /F /PID $_.OwningProcess 2>$null } +Get-NetTCPConnection -LocalPort 3000 -ErrorAction SilentlyContinue | ForEach-Object { taskkill /F /PID $_.OwningProcess 2>$null } + +# 等待端口释放 +Start-Sleep -Seconds 3 + +# 确认端口已空闲(无输出 = 空闲) +Get-NetTCPConnection -LocalPort 8080 -ErrorAction SilentlyContinue +Get-NetTCPConnection -LocalPort 3000 -ErrorAction SilentlyContinue +``` + +### 2. 启动后端服务器(音乐生成功能依赖此服务) + +```powershell +# 工作目录:项目根目录 +cd d:\Airhub +python server.py +``` + +成功标志: +``` +INFO: Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit) +[Server] Music Server running on http://localhost:3000 +``` + +### 3. 设置国内镜像源 + 启动 Flutter Web Server + +```powershell +# 工作目录:airhub_app 子目录 +cd d:\Airhub\airhub_app + +# 设置镜像源(必须,否则网络超时) +$env:PUB_HOSTED_URL = "https://pub.flutter-io.cn" +$env:FLUTTER_STORAGE_BASE_URL = "https://storage.flutter-io.cn" + +# 启动 web-server 模式 +flutter run -d web-server --web-port=8080 --no-pub +``` + +成功标志: +``` +lib\main.dart is being served at http://localhost:8080 +``` + +### 4. 
访问应用 + +浏览器打开:`http://localhost:8080` + +--- + +## 关键规则 + +### 必须使用 `web-server` 模式 +- **禁止**使用 `flutter run -d chrome`(会弹出系统 Chrome 窗口,不可控) +- **必须**使用 `flutter run -d web-server`(只启动 HTTP 服务,手动用浏览器访问) + +### `--no-pub` 的使用条件 +- 仅修改 Dart 代码(无新依赖、无新 asset)→ 加 `--no-pub`,编译更快 +- 新增了 `pubspec.yaml` 依赖或 `assets/` 资源文件 → **不能**加 `--no-pub` + +### 端口管理 +- 固定使用 8080(Flutter)和 3000(后端),不要换端口绕过占用 +- 每次启动前必须先确认端口空闲 +- 停止服务后等 3 秒再重新启动 + +### 热重载 +- 在 Flutter 终端按 `r` = 热重载(保留页面状态) +- 按 `R` = 热重启(重置页面状态) +- 浏览器 `Ctrl+Shift+R` = 强制刷新 + +--- + +## 停止服务 + +```powershell +# 方法1:在 Flutter 终端按 q 退出 + +# 方法2:强制杀进程 +Get-NetTCPConnection -LocalPort 8080 | ForEach-Object { taskkill /F /PID $_.OwningProcess } +Get-NetTCPConnection -LocalPort 3000 | ForEach-Object { taskkill /F /PID $_.OwningProcess } +``` + +--- + +## 常见问题排查 + +| 问题 | 原因 | 解决方案 | +|------|------|---------| +| 端口被占用 | 旧进程未退出 | 执行第1步杀进程,等3秒 | +| 编译报错找不到包 | 使用了 `--no-pub` 但有新依赖 | 去掉 `--no-pub` 重新编译 | +| 网络超时 | 未设置镜像源 | 设置 `PUB_HOSTED_URL` 和 `FLUTTER_STORAGE_BASE_URL` | +| 页面白屏 | 缓存问题 | 浏览器 `Ctrl+Shift+R` 强刷 | +| 音乐功能不工作 | 后端未启动 | 先启动 `python server.py` | + +--- + +## 编译耗时参考 + +- 首次完整编译(含 pub get):90-120 秒 +- 增量编译(`--no-pub`):60-90 秒 +- 热重载(按 r):3-5 秒 +- 热重启(按 R):10-20 秒 diff --git a/airhub_app/lib/pages/device_control_page.dart b/airhub_app/lib/pages/device_control_page.dart index a0ca392..4ec693a 100644 --- a/airhub_app/lib/pages/device_control_page.dart +++ b/airhub_app/lib/pages/device_control_page.dart @@ -1,8 +1,10 @@ +import 'dart:convert'; import 'dart:math'; import 'dart:ui'; import 'package:flutter/material.dart'; import 'package:google_fonts/google_fonts.dart'; import 'package:flutter_svg/flutter_svg.dart'; +import 'package:http/http.dart' as http; import 'story_detail_page.dart'; import 'product_selection_page.dart'; import 'settings_page.dart'; @@ -89,6 +91,45 @@ class _DeviceControlPageState extends State _bookshelfScrollOffset = _bookshelfController.page ?? 
0.0; }); }); + + // Load historical stories from backend + _loadHistoricalStories(); + } + + /// Fetch saved stories from backend and prepend to bookshelf + Future _loadHistoricalStories() async { + try { + final resp = await http.get(Uri.parse('http://localhost:3000/api/stories')); + if (resp.statusCode == 200) { + final data = jsonDecode(resp.body); + final List stories = data['stories'] ?? []; + if (stories.isEmpty) return; + + // Collect titles already in the mock list to avoid duplicates + final existingTitles = _mockStories.map((s) => s['title'] as String).toSet(); + + final newStories = >[]; + for (final s in stories) { + final title = s['title'] as String? ?? ''; + if (title.isNotEmpty && !existingTitles.contains(title)) { + newStories.add({ + 'title': title, + 'cover': null, // No cover yet for generated stories + 'locked': false, + 'content': s['content'] as String? ?? '', + }); + } + } + + if (newStories.isNotEmpty && mounted) { + setState(() { + _mockStories.addAll(newStories); + }); + } + } + } catch (e) { + debugPrint('Failed to load historical stories: $e'); + } } @override @@ -120,7 +161,7 @@ class _DeviceControlPageState extends State children: [ SafeArea(bottom: false, child: _buildHomeView()), SafeArea(bottom: false, child: _buildStoryView()), - const MusicCreationPage(isTab: true), + MusicCreationPage(isTab: true, isVisible: _currentIndex == 2), const ProfilePage(), // No SafeArea here to allow full background ], ), @@ -411,19 +452,28 @@ class _DeviceControlPageState extends State colors: AppColors.btnCapybaraGradient, ), onPressed: () async { - final result = await showModalBottomSheet( + final result = await showModalBottomSheet>( context: context, isScrollControlled: true, backgroundColor: Colors.transparent, builder: (context) => const StoryGeneratorModal(), ); - if (result == 'start_generation') { + if (result != null && result['action'] == 'start_generation') { final saveResult = await Navigator.of(context).push( - 
MaterialPageRoute(builder: (context) => const StoryLoadingPage()), + MaterialPageRoute( + builder: (context) => StoryLoadingPage( + characters: List.from(result['characters'] ?? []), + scenes: List.from(result['scenes'] ?? []), + props: List.from(result['props'] ?? []), + ), + ), ); - if (saveResult == 'saved') { - _addNewBookWithAnimation(); + if (saveResult is Map && saveResult['action'] == 'saved') { + _addNewBookWithAnimation( + title: saveResult['title'] as String? ?? '新故事', + content: saveResult['content'] as String? ?? '', + ); } } }, @@ -520,26 +570,35 @@ class _DeviceControlPageState extends State } Widget _buildStorySlot(Map story, {bool isNew = false}) { - bool isFilled = story.containsKey('cover') && story['cover'] != null; + final bool hasCover = story['cover'] != null && (story['cover'] as String).isNotEmpty; + final bool hasContent = story['content'] != null && (story['content'] as String).isNotEmpty; - // Empty/Clickable Slot (.story-slot.clickable) - // PRD: border: 1px dashed rgba(0, 0, 0, 0.05) - if (!isFilled) { + // Empty/Clickable Slot — no content, just a "+" to create new story + if (!hasContent && !hasCover) { return GestureDetector( onTap: () async { - final result = await showModalBottomSheet( + final result = await showModalBottomSheet>( context: context, isScrollControlled: true, backgroundColor: Colors.transparent, builder: (context) => const StoryGeneratorModal(), ); - if (result == 'start_generation') { + if (result != null && result['action'] == 'start_generation') { final saveResult = await Navigator.of(context).push( - MaterialPageRoute(builder: (context) => const StoryLoadingPage()), + MaterialPageRoute( + builder: (context) => StoryLoadingPage( + characters: List.from(result['characters'] ?? []), + scenes: List.from(result['scenes'] ?? []), + props: List.from(result['props'] ?? 
[]), + ), + ), ); - if (saveResult == 'saved') { - _addNewBookWithAnimation(); + if (saveResult is Map && saveResult['action'] == 'saved') { + _addNewBookWithAnimation( + title: saveResult['title'] as String? ?? '新故事', + content: saveResult['content'] as String? ?? '', + ); } } }, @@ -560,6 +619,41 @@ class _DeviceControlPageState extends State ); } + // Cover widget: real image or "未生成封面" placeholder + Widget coverWidget; + if (hasCover) { + coverWidget = Image.asset( + story['cover'], + fit: BoxFit.cover, + errorBuilder: (_, __, ___) => Container(color: Colors.grey.shade200), + ); + } else { + // No cover — show soft placeholder + coverWidget = Container( + decoration: BoxDecoration( + gradient: LinearGradient( + begin: Alignment.topCenter, + end: Alignment.bottomCenter, + colors: [ + const Color(0xFFE8E0F0), + const Color(0xFFD5CBE8), + ], + ), + ), + alignment: Alignment.center, + padding: const EdgeInsets.symmetric(horizontal: 12), + child: const Text( + '暂无封面', + style: TextStyle( + fontSize: 11, + color: Color(0xFF9B8DB8), + fontWeight: FontWeight.w500, + ), + textAlign: TextAlign.center, + ), + ); + } + // Filled Slot (.story-slot.filled) Widget slotContent = GestureDetector( onTap: () { @@ -578,15 +672,8 @@ class _DeviceControlPageState extends State clipBehavior: Clip.antiAlias, child: Stack( children: [ - // Cover Image (.story-cover-img) - Positioned.fill( - child: Image.asset( - story['cover'], - fit: BoxFit.cover, - errorBuilder: (_, __, ___) => - Container(color: Colors.grey.shade200), - ), - ), + // Cover Image or Placeholder + Positioned.fill(child: coverWidget), // Title Bar (.story-title-bar) Positioned( bottom: 0, @@ -808,14 +895,14 @@ class _DeviceControlPageState extends State ); } - void _addNewBookWithAnimation() { + void _addNewBookWithAnimation({String title = '新故事', String content = ''}) { setState(() { _mockStories.add({ - 'title': '星际忍者的茶话会', - 'cover': - 'assets/www/story_covers/brave_tailor.png', // Temporary mock cover + 'title': 
title, + 'cover': null, // No cover yet for generated stories 'type': 'new', 'locked': false, + 'content': content, }); _newBookIndex = _mockStories.length - 1; }); diff --git a/airhub_app/lib/pages/music_creation_page.dart b/airhub_app/lib/pages/music_creation_page.dart index b03eb79..5145f3b 100644 --- a/airhub_app/lib/pages/music_creation_page.dart +++ b/airhub_app/lib/pages/music_creation_page.dart @@ -3,9 +3,11 @@ import 'dart:ui'; import 'package:flutter/material.dart'; import 'package:google_fonts/google_fonts.dart'; import 'package:just_audio/just_audio.dart'; +import '../services/music_generation_service.dart'; import '../widgets/animated_gradient_background.dart'; import '../widgets/ios_toast.dart'; import '../widgets/gradient_button.dart'; +import '../widgets/glass_dialog.dart'; import '../theme/app_colors.dart' as appclr; // ============================================================ @@ -19,20 +21,27 @@ class _Track { final String title; final String lyrics; String audioAsset; + final bool isRemote; // true = URL from server, false = local asset _Track({ required this.id, required this.title, required this.lyrics, required this.audioAsset, + this.isRemote = false, }); } +/// Server base URL — change this when deploying + class MusicCreationPage extends StatefulWidget { /// Whether this page is embedded as a tab (hides back button) final bool isTab; - const MusicCreationPage({super.key, this.isTab = true}); + /// Whether this page is currently visible (for tab-based navigation) + final bool isVisible; + + const MusicCreationPage({super.key, this.isTab = true, this.isVisible = true}); @override State createState() => _MusicCreationPageState(); @@ -43,6 +52,7 @@ class _MusicCreationPageState extends State // ── State ── bool _isPlaying = false; bool _isGenerating = false; + double _genProgress = 0.0; // 0~100, generation progress ring bool _isFlipped = false; int? 
_selectedMoodIndex; double _progress = 0.0; @@ -103,14 +113,44 @@ class _MusicCreationPageState extends State ), ]; - // ── Mood cards ── + // ── Mood cards: prompts describe broad scenes so the same card yields a different song each time ── static const List> _moods = [ - {'icon': Icons.spa_outlined, 'color': 0xFFB8D4E3, 'title': 'Chill Lofi', 'desc': '慵懒 · 治愈 · 水声'}, - {'icon': Icons.directions_run, 'color': 0xFFF5C6A5, 'title': 'Happy Funk', 'desc': '活力 · 奔跑 · 阳光'}, - {'icon': Icons.nights_stay_outlined, 'color': 0xFFCBB8E0, 'title': 'Deep Sleep', 'desc': '白噪音 · 助眠 · 梦境'}, - {'icon': Icons.psychology_outlined, 'color': 0xFFA8D8C8, 'title': 'Focus Flow', 'desc': '心流 · 专注 · 效率'}, - {'icon': Icons.redeem_outlined, 'color': 0xFFD4A0E8, 'title': '盲盒惊喜', 'desc': 'AI 随机生成神曲'}, - {'icon': Icons.auto_awesome, 'color': 0xFFECCFA8, 'title': '自由创作', 'desc': '输入灵感 · 生成音乐'}, + { + 'icon': Icons.spa_outlined, 'color': 0xFFB8D4E3, + 'title': 'Chill Lofi', 'desc': '慵懒 · 治愈 · 水声', + 'prompt': '慵懒的午后,泡在温泉里听水声发呆,什么都不想做', + 'mood': 'chill', + }, + { + 'icon': Icons.directions_run, 'color': 0xFFF5C6A5, + 'title': 'Happy Funk', 'desc': '活力 · 奔跑 · 阳光', + 'prompt': '阳光灿烂的日子,在草地上奔跑撒欢,心情超级好', + 'mood': 'happy', + }, + { + 'icon': Icons.nights_stay_outlined, 'color': 0xFFCBB8E0, + 'title': 'Deep Sleep', 'desc': '白噪音 · 助眠 · 梦境', + 'prompt': '夜深了,窗外下着小雨,盖着被子准备入睡', + 'mood': 'sleepy', + }, + { + 'icon': Icons.psychology_outlined, 'color': 0xFFA8D8C8, + 'title': 'Focus Flow', 'desc': '心流 · 专注 · 效率', + 'prompt': '安静的书房里,沏一杯茶,沉浸在自己的世界', + 'mood': 'chill', + }, + { + 'icon': Icons.redeem_outlined, 'color': 0xFFD4A0E8, + 'title': '盲盒惊喜', 'desc': 'AI 随机生成神曲', + 'prompt': '', // Empty prompt: let the LLM improvise + 'mood': 'random', + }, + { + 'icon': Icons.auto_awesome, 'color': 0xFFECCFA8, + 'title': '自由创作', 'desc': '输入灵感 · 生成音乐', + 'prompt': '', // User-supplied input + 'mood': 'custom', + }, ]; @override @@ -194,6 +234,75 @@ class _MusicCreationPageState extends State // Pre-load the first track (don't auto-play) _loadTrack(_currentTrackIndex); + + // ── Bind to generation 
service & check for pending results ── + _bindGenServiceCallbacks(); + + // If generation was running while we were away, restore UI state + if (_genService.isGenerating) { + _isGenerating = true; + _genProgress = _genService.progress; + _showSpeech(_genService.statusMessage, duration: 0); + } + + // If a song was generated while we were away, show dialog (don't auto-play) + final pending = _genService.consumePendingResult(); + if (pending != null) { + WidgetsBinding.instance.addPostFrameCallback((_) { + if (mounted) _handlePendingResult(pending); + }); + } + + // If generation failed while we were away, show error bubble + final pendingError = _genService.consumePendingError(); + if (pendingError != null) { + WidgetsBinding.instance.addPostFrameCallback((_) { + if (mounted) { + setState(() { + _isGenerating = false; + _genProgress = 0; + _genStickyText = null; + _selectedMoodIndex = null; + }); + _showSpeech(pendingError); + } + }); + } + + // ── Load historical songs from server ── + _loadHistoricalSongs(); + } + + // ── Load historical songs from server into playlist ── + Future _loadHistoricalSongs() async { + final songs = await _genService.fetchPlaylist(); + if (!mounted || songs.isEmpty) return; + + // Collect titles already in playlist to avoid duplicates + final existingTitles = _playlist.map((t) => t.title).toSet(); + + final newTracks = <_Track>[]; + for (final song in songs) { + if (existingTitles.contains(song.title)) continue; + newTracks.add(_Track( + id: DateTime.now().millisecondsSinceEpoch + newTracks.length, + title: song.title, + lyrics: song.lyrics, + audioAsset: song.audioUrl, + isRemote: true, + )); + } + + if (newTracks.isEmpty) return; + + setState(() { + // Insert server songs at the beginning (before hardcoded tracks) + _playlist.insertAll(0, newTracks); + // Shift current track index so it still points to the same track + _currentTrackIndex += newTracks.length; + }); + + debugPrint('Loaded ${newTracks.length} historical songs from 
server'); } // ── Duration formatter ── @@ -207,7 +316,13 @@ class _MusicCreationPageState extends State Future _loadTrack(int index) async { try { final track = _playlist[index]; - await _audioPlayer.setAsset(track.audioAsset); + if (track.isRemote) { + // Server-generated track — load from URL + await _audioPlayer.setUrl(track.audioAsset); + } else { + // Local preset track — load from assets + await _audioPlayer.setAsset(track.audioAsset); + } } catch (e) { debugPrint('Error loading track: $e'); if (mounted) { @@ -222,8 +337,61 @@ class _MusicCreationPageState extends State _playTrack(nextIndex); } + @override + void didUpdateWidget(covariant MusicCreationPage oldWidget) { + super.didUpdateWidget(oldWidget); + // When page becomes visible again (tab switch back) + if (widget.isVisible && !oldWidget.isVisible) { + // Re-bind callbacks + _bindGenServiceCallbacks(); + + // If generation is still running, restore progress UI + crawl animation + if (_genService.isGenerating) { + final currentProgress = _genService.progress; + final currentStage = _genService.currentStage; + setState(() { + _isGenerating = true; + _genProgress = currentProgress; + }); + _showSpeech(_genService.statusMessage, duration: 0); + + // Restart crawl animation based on current stage + if (currentStage == 'lyrics') { + _crawlProgress(currentProgress, 25, 8000); + } else if (currentStage == 'music') { + _crawlProgress(currentProgress, 85, 60000); + } + } + + // If a song finished while we were away, show the dialog after build + final pending = _genService.consumePendingResult(); + if (pending != null) { + WidgetsBinding.instance.addPostFrameCallback((_) { + if (mounted) _handlePendingResult(pending); + }); + } + + // If generation failed while we were away, show error bubble + final pendingError = _genService.consumePendingError(); + if (pendingError != null) { + setState(() { + _isGenerating = false; + _genProgress = 0; + _genStickyText = null; + _selectedMoodIndex = null; + }); + 
_showSpeech(pendingError); + } + } + // When page becomes hidden (tab switch away) + if (!widget.isVisible && oldWidget.isVisible) { + _unbindGenServiceCallbacks(); + } + } + @override void dispose() { + _unbindGenServiceCallbacks(); _audioPlayer.dispose(); _vinylSpinController.dispose(); _tonearmController.dispose(); @@ -323,48 +491,185 @@ class _MusicCreationPageState extends State } setState(() => _selectedMoodIndex = index); - _mockGenerate(_moods[index]['title'] ?? ''); + final mood = _moods[index]; + _generateMusic( + text: (mood['prompt'] as String).isNotEmpty + ? mood['prompt'] as String + : '咔咔今天想来点惊喜', + mood: mood['mood'] as String, + ); } - // ── Mock Generation (matches HTML network-error fallback) ── - void _mockGenerate(String title) async { - setState(() => _isGenerating = true); - _showSpeech('🎼 正在连接 AI...', duration: 0); + // ── Generation via singleton service (survives page navigation) ── + final _genService = MusicGenerationService.instance; - await Future.delayed(const Duration(milliseconds: 800)); - if (!mounted) return; - _showSpeech('🎵 正在生成音乐...', duration: 0); + void _bindGenServiceCallbacks() { + _genService.onProgress = (progress, stage, message) { + if (!mounted) return; + setState(() { + _genProgress = progress; + _isGenerating = true; + }); + _showSpeech(message, duration: 0); - await Future.delayed(const Duration(milliseconds: 1200)); - if (!mounted) return; - _showSpeech('✨ (演示模式) 新歌出炉!'); + // Start crawl animations for long stages + if (stage == 'lyrics') _crawlProgress(10, 25, 8000); + if (stage == 'music') _crawlProgress(30, 85, 120000); + }; - await Future.delayed(const Duration(milliseconds: 500)); - if (!mounted) return; + _genService.onComplete = (result) { + if (!mounted || !widget.isVisible) return; + // Page is visible — consume the pending result and handle it + _genService.consumePendingResult(); + _handleGenResult(result); + }; + _genService.onError = (error) { + if (!mounted) return; + _showSpeech(error); + 
setState(() { + _isGenerating = false; + _genProgress = 0; + _genStickyText = null; + _selectedMoodIndex = null; + }); + }; + } + + void _unbindGenServiceCallbacks() { + _genService.onProgress = null; + _genService.onComplete = null; + _genService.onError = null; + } + + void _generateMusic({required String text, required String mood}) { + setState(() { + _isGenerating = true; + _genProgress = 5; + }); + _showSpeech('正在连接 AI...', duration: 0); + _genService.generate(text: text, mood: mood); + } + + /// Handle a pending result when user returns to the page — always ask, never auto-play. + void _handlePendingResult(MusicGenResult result) { setState(() { _isGenerating = false; - _selectedMoodIndex = null; // 生成完成,取消选中态 + _genProgress = 0; + _genStickyText = null; + _selectedMoodIndex = null; + }); + + final newTrack = _Track( + id: DateTime.now().millisecondsSinceEpoch, + title: result.title, + lyrics: result.lyrics, + audioAsset: result.audioUrl, + isRemote: true, + ); + + setState(() { + _playlist.insert(0, newTrack); + }); + + // Always show dialog, never auto-play + _showConfirmDialog(newTrack.title); + } + + /// Handle a completed generation result (live — user is on the page). 
+ void _handleGenResult(MusicGenResult result) { + setState(() { + _isGenerating = false; + _genProgress = 0; + _genStickyText = null; + _selectedMoodIndex = null; + }); + + final newTrack = _Track( + id: DateTime.now().millisecondsSinceEpoch, + title: result.title, + lyrics: result.lyrics, + audioAsset: result.audioUrl, + isRemote: true, + ); + + setState(() { + _playlist.insert(0, newTrack); }); - // If already playing, show confirm dialog; otherwise auto-play if (_isPlaying) { - _showConfirmDialog(title); + _showConfirmDialog(newTrack.title); } else { - if (!_isPlaying) _togglePlay(); + _playTrack(0); } } + // ── Crawl progress: slowly animate from→to over durationMs ── + int _crawlId = 0; // Cancel token — only the latest crawl runs + + void _crawlProgress(double from, double to, int durationMs) { + _crawlId++; // Invalidate any previous crawl + final myId = _crawlId; + final steps = durationMs ~/ 300; + final increment = (to - from) / steps; + int step = 0; + Future.doWhile(() async { + await Future.delayed(const Duration(milliseconds: 300)); + if (myId != _crawlId) return false; // Cancelled by a newer crawl + if (!mounted || !_isGenerating || _genProgress >= to) return false; + step++; + setState(() => _genProgress = (from + increment * step).clamp(from, to)); + return step < steps && _isGenerating; + }); + } + + // ── Clean lyrics: strip structure tags, JSON artifacts & normalize ── + String _cleanLyrics(String raw) { + String s = raw; + // Replace literal \n with real newlines + s = s.replaceAll(r'\n', '\n'); + // Remove JSON string quote artifacts (" ") + s = s.replaceAll(RegExp(r'"\s*"'), ''); + s = s.replaceAll('"', ''); + // Remove structure tags: [verse 1], [chorus], [outro], [bridge], etc. 
+ s = s.replaceAll( + RegExp(r'\[(verse|chorus|bridge|outro|intro|hook|pre-chorus|interlude|inst)\s*\d*\]\s*', + caseSensitive: false), + '', + ); + // Strip leading/trailing whitespace from each line + s = s.split('\n').map((line) => line.trim()).join('\n'); + // Collapse 3+ newlines into one blank line + s = s.replaceAll(RegExp(r'\n{3,}'), '\n\n'); + return s.trim(); + } + // ── Speech Bubble ── + String? _genStickyText; // Persistent text during generation + void _showSpeech(String text, {int duration = 3000}) { + // If this is a generation-related message (duration == 0), save it as sticky + if (duration == 0 && _isGenerating) { + _genStickyText = text; + } + setState(() { _speechText = text; _speechVisible = true; }); if (duration > 0) { Future.delayed(Duration(milliseconds: duration), () { - if (mounted && _speechText == text) { - setState(() => _speechVisible = false); + if (!mounted) return; + if (_speechText == text) { + // If still generating, restore the sticky generation message + if (_isGenerating && _genStickyText != null) { + setState(() { + _speechText = _genStickyText; + _speechVisible = true; + }); + } else { + setState(() => _speechVisible = false); + } } }); } @@ -510,42 +815,73 @@ class _MusicCreationPageState extends State Widget _buildVinylWrapper() { // HTML: .player-visual-wrapper { perspective: 800px; width: 210px; height: 210px; // filter: drop-shadow(0 20px 40px rgba(0,0,0,0.2)); } - return GestureDetector( - onTap: _flipVinyl, - child: Container( - width: 210, - height: 210, - decoration: BoxDecoration( - shape: BoxShape.circle, - boxShadow: [ - BoxShadow( - color: Colors.black.withOpacity(0.2), - offset: const Offset(0, 20), - blurRadius: 40, - ), - ], - ), - child: AnimatedBuilder( - animation: _flipAnimation, - builder: (context, child) { - final angle = _flipAnimation.value; - final showBack = angle > pi / 2; + return SizedBox( + width: 210, + height: 210, + child: Stack( + clipBehavior: Clip.none, + alignment: Alignment.center, + 
children: [ + // Vinyl disc (flippable) + GestureDetector( + onTap: _flipVinyl, + child: Container( + width: 210, + height: 210, + decoration: BoxDecoration( + shape: BoxShape.circle, + boxShadow: [ + BoxShadow( + color: Colors.black.withOpacity(0.2), + offset: const Offset(0, 20), + blurRadius: 40, + ), + ], + ), + child: AnimatedBuilder( + animation: _flipAnimation, + builder: (context, child) { + final angle = _flipAnimation.value; + final showBack = angle > pi / 2; - return Transform( - alignment: Alignment.center, - transform: Matrix4.identity() - ..setEntry(3, 2, 0.00125) // perspective ≈ 1/800 - ..rotateY(angle), - child: showBack - ? Transform( - alignment: Alignment.center, - transform: Matrix4.identity()..rotateY(pi), - child: _buildVinylBack(), - ) - : _buildVinylFront(), - ); - }, - ), + return Transform( + alignment: Alignment.center, + transform: Matrix4.identity() + ..setEntry(3, 2, 0.00125) + ..rotateY(angle), + child: showBack + ? Transform( + alignment: Alignment.center, + transform: Matrix4.identity()..rotateY(pi), + child: _buildVinylBack(), + ) + : _buildVinylFront(), + ); + }, + ), + ), + ), + + // Generation progress ring — always on top, regardless of flip + if (_isGenerating || _genProgress > 0) + Positioned( + left: -7, + top: -7, + width: 224, + height: 224, + child: IgnorePointer( + child: AnimatedOpacity( + opacity: _isGenerating ? 1.0 : 0.0, + duration: const Duration(milliseconds: 400), + child: CustomPaint( + painter: _GenProgressRingPainter( + progress: _genProgress / 100.0, + ), + ), + ), + ), + ), + ], ), ); } @@ -705,7 +1041,7 @@ class _MusicCreationPageState extends State padding: const EdgeInsets.all(10), child: Text( track.lyrics.isNotEmpty - ? track.lyrics + ? 
_cleanLyrics(track.lyrics) : '生成音乐后\n点我看歌词', style: GoogleFonts.dmSans( fontSize: 12, @@ -729,41 +1065,58 @@ class _MusicCreationPageState extends State // ── Speech Bubble ── Widget _buildSpeechBubble() { - // HTML: .capy-speech-bubble { background: rgba(253,247,237,0.93); - // font-size: 12.5px; font-weight: 500; color: #6B4423; } + // HTML: .capy-speech-bubble with clip-path iMessage-style tail at bottom-left + const tailH = 8.0; return AnimatedOpacity( duration: const Duration(milliseconds: 200), opacity: _speechVisible ? 1.0 : 0.0, child: AnimatedScale( duration: const Duration(milliseconds: 350), scale: _speechVisible ? 1.0 : 0.7, - curve: const Cubic(0.34, 1.56, 0.64, 1.0), // HTML bouncy curve + curve: const Cubic(0.34, 1.56, 0.64, 1.0), alignment: Alignment.bottomLeft, - child: Container( - padding: const EdgeInsets.fromLTRB(16, 8, 16, 16), - decoration: BoxDecoration( - color: const Color(0xFFFDF7ED).withOpacity(0.93), - borderRadius: BorderRadius.circular(16), - boxShadow: [ - BoxShadow( - color: const Color(0xFFECCFA8).withOpacity(0.45), - blurRadius: 0.5, + child: Column( + mainAxisSize: MainAxisSize.min, + crossAxisAlignment: CrossAxisAlignment.start, + children: [ + // Bubble body + Container( + padding: const EdgeInsets.symmetric(horizontal: 14, vertical: 8), + decoration: BoxDecoration( + color: const Color(0xFFFDF7ED).withOpacity(0.93), + borderRadius: BorderRadius.circular(14), + boxShadow: [ + BoxShadow( + color: const Color(0xFFECCFA8).withOpacity(0.45), + blurRadius: 0.5, + ), + BoxShadow( + color: const Color(0xFF8B5E3C).withOpacity(0.10), + offset: const Offset(0, 3), + blurRadius: 12, + ), + ], ), - BoxShadow( - color: const Color(0xFF8B5E3C).withOpacity(0.10), - offset: const Offset(0, 3), - blurRadius: 12, + child: Text( + _speechText ?? '', + style: GoogleFonts.dmSans( + fontSize: 12.5, + fontWeight: FontWeight.w500, + color: const Color(0xFF6B4423), + ), ), - ], - ), - child: Text( - _speechText ?? 
'', - style: GoogleFonts.dmSans( - fontSize: 12.5, - fontWeight: FontWeight.w500, - color: const Color(0xFF6B4423), ), - ), + // Tail (小角角) — bottom-left, matching HTML clip-path tail + Padding( + padding: const EdgeInsets.only(left: 14), + child: CustomPaint( + size: const Size(12, tailH), + painter: _BubbleTailPainter( + color: const Color(0xFFFDF7ED).withOpacity(0.93), + ), + ), + ), + ], ), ), ); @@ -1117,7 +1470,7 @@ class _MusicCreationPageState extends State onSubmit: (text) { Navigator.pop(ctx); setState(() => _selectedMoodIndex = 5); - _mockGenerate(text); + _generateMusic(text: text, mood: 'custom'); }, ), ); @@ -1141,21 +1494,40 @@ class _MusicCreationPageState extends State } // ── Confirm Dialog (new song ready) ── - void _showConfirmDialog(String title) { - showDialog( + void _showConfirmDialog(String songTitle) { + showGeneralDialog( context: context, + barrierDismissible: true, + barrierLabel: 'Dismiss', barrierColor: Colors.black.withOpacity(0.4), - builder: (ctx) => _ConfirmDialogContent( - title: title, - onListen: () { - Navigator.pop(ctx); - _showSpeech('正在播放: $title'); - }, - onLater: () { - Navigator.pop(ctx); - _showSpeech('已加入唱片架,随时可以听'); - }, - ), + transitionDuration: const Duration(milliseconds: 300), + pageBuilder: (ctx, anim1, anim2) { + return GlassDialog( + title: '新歌已生成', + description: '是否立即试听?', + cancelText: '稍后再听', + confirmText: '立即试听', + onCancel: () { + Navigator.of(ctx).pop(); + _showSpeech('已加入唱片架,随时可以听'); + }, + onConfirm: () { + Navigator.of(ctx).pop(); + _playTrack(0); + }, + ); + }, + transitionBuilder: (ctx, anim1, anim2, child) { + return ScaleTransition( + scale: Tween(begin: 0.9, end: 1.0).animate( + CurvedAnimation( + parent: anim1, + curve: const Cubic(0.175, 0.885, 0.32, 1.275), + ), + ), + child: FadeTransition(opacity: anim1, child: child), + ); + }, ); } } @@ -1167,6 +1539,97 @@ class _MusicCreationPageState extends State /// Vinyl disc grooves + conic shine /// HTML: repeating-radial-gradient(#18181B 0, 
#18181B 3px, #27272A 4px) /// + conic-gradient shine overlay +// ── Bubble Tail Painter (iMessage-style small triangle) ── +class _BubbleTailPainter extends CustomPainter { + final Color color; + _BubbleTailPainter({required this.color}); + + @override + void paint(Canvas canvas, Size size) { + final path = Path() + ..moveTo(0, 0) // top-left (connects to bubble) + ..lineTo(size.width, 0) // top-right + ..lineTo(2, size.height) // bottom point (tail tip) + ..close(); + canvas.drawPath(path, Paint()..color = color); + } + + @override + bool shouldRepaint(_BubbleTailPainter old) => old.color != color; +} + +// ── Circular Generation Progress Ring (matches HTML .gen-ring) ── +class _GenProgressRingPainter extends CustomPainter { + final double progress; // 0.0 ~ 1.0 + + _GenProgressRingPainter({required this.progress}); + + @override + void paint(Canvas canvas, Size size) { + final center = Offset(size.width / 2, size.height / 2); + final radius = 108.0; // HTML: SVG viewBox 224, circle r=108 + final rect = Rect.fromCircle(center: center, radius: radius); + final sweepAngle = 2 * pi * progress; + + // Track (background ring) + final trackPaint = Paint() + ..color = Colors.white.withOpacity(0.12) + ..style = PaintingStyle.stroke + ..strokeWidth = 3; + canvas.drawCircle(center, radius, trackPaint); + + if (progress < 0.001) return; + + // Layer 1: Wide soft outer glow (blurred) — creates the warm halo + final outerGlow = Paint() + ..color = const Color(0xFFECCFA8).withOpacity(0.12) + ..style = PaintingStyle.stroke + ..strokeWidth = 16 + ..strokeCap = StrokeCap.round + ..maskFilter = const MaskFilter.blur(BlurStyle.normal, 8); + canvas.drawArc(rect, -pi / 2, sweepAngle, false, outerGlow); + + // Layer 2: Medium glow — HTML: stroke-width 8, rgba(236,207,168,0.15) + final midGlow = Paint() + ..color = const Color(0xFFECCFA8).withOpacity(0.20) + ..style = PaintingStyle.stroke + ..strokeWidth = 8 + ..strokeCap = StrokeCap.round + ..maskFilter = const 
MaskFilter.blur(BlurStyle.normal, 3); + canvas.drawArc(rect, -pi / 2, sweepAngle, false, midGlow); + + // Layer 3: Core bar — HTML: stroke-width 3, drop-shadow(0 0 4px) + // Draw shadow pass first + final barShadow = Paint() + ..color = const Color(0xFFECCFA8).withOpacity(0.50) + ..style = PaintingStyle.stroke + ..strokeWidth = 4 + ..strokeCap = StrokeCap.round + ..maskFilter = const MaskFilter.blur(BlurStyle.normal, 4); + canvas.drawArc(rect, -pi / 2, sweepAngle, false, barShadow); + + // Core bar with gradient + final barPaint = Paint() + ..style = PaintingStyle.stroke + ..strokeWidth = 3 + ..strokeCap = StrokeCap.round + ..shader = SweepGradient( + startAngle: -pi / 2, + endAngle: -pi / 2 + sweepAngle, + colors: const [ + Color(0xFFECCFA8), + Color(0xFFD4A76A), + Color(0xFFECCFA8), + ], + stops: const [0.0, 0.5, 1.0], + ).createShader(rect); + canvas.drawArc(rect, -pi / 2, sweepAngle, false, barPaint); + } + + @override + bool shouldRepaint(_GenProgressRingPainter old) => old.progress != progress; +} + class _VinylDiscPainter extends CustomPainter { @override void paint(Canvas canvas, Size size) { @@ -1471,12 +1934,31 @@ class _PlaylistModalContent extends StatelessWidget { @override Widget build(BuildContext context) { + final screenWidth = MediaQuery.of(context).size.width; + final bottomPadding = MediaQuery.of(context).padding.bottom; + + // ── Calculate grid height for 3.5 visible rows ── + // Grid area width = screen - left(20) - right(20) + const double hPad = 20; + const double gap = 8; + const double aspectRatio = 0.75; // childAspectRatio + const double visibleRows = 3.5; + final gridWidth = screenWidth - hPad * 2; + final colWidth = (gridWidth - gap * 2) / 3; // 3 columns, 2 gaps + final cellHeight = colWidth / aspectRatio; + final rowHeight = cellHeight + gap; // cell + mainAxisSpacing + final gridMaxHeight = rowHeight * visibleRows; + + // Header: ~28px row + 16px spacing = 44px + const headerHeight = 44.0; + final totalMaxHeight = headerHeight + 
gridMaxHeight + 24 + bottomPadding; + return Container( constraints: BoxConstraints( - maxHeight: MediaQuery.of(context).size.height * 0.88, + maxHeight: totalMaxHeight, ), padding: EdgeInsets.fromLTRB( - 20, 16, 20, 24 + MediaQuery.of(context).padding.bottom, + hPad, 16, hPad, 24 + bottomPadding, ), decoration: BoxDecoration( color: Colors.white.withOpacity(0.95), @@ -1522,7 +2004,7 @@ class _PlaylistModalContent extends StatelessWidget { ), const SizedBox(height: 16), - // Record grid — HTML: .record-grid { grid-template-columns: repeat(3, 1fr); gap: 8px; } + // Record grid — shows 3.5 rows, scroll to see more Flexible( child: GridView.builder( shrinkWrap: true, @@ -1645,93 +2127,3 @@ class _PlaylistModalContent extends StatelessWidget { } } -/// Confirm Dialog — HTML: .confirm-container -class _ConfirmDialogContent extends StatelessWidget { - final String title; - final VoidCallback onListen; - final VoidCallback onLater; - - const _ConfirmDialogContent({ - required this.title, - required this.onListen, - required this.onLater, - }); - - @override - Widget build(BuildContext context) { - return Center( - child: Container( - width: MediaQuery.of(context).size.width - 48, - constraints: const BoxConstraints(maxWidth: 320), - padding: const EdgeInsets.symmetric(horizontal: 24, vertical: 20), - decoration: BoxDecoration( - // HTML: background: rgba(255,255,255,0.95); backdrop-filter: blur(20px); - // border-radius: 20px; box-shadow: 0 8px 32px rgba(0,0,0,0.12); - color: Colors.white.withOpacity(0.95), - borderRadius: BorderRadius.circular(20), - boxShadow: [ - BoxShadow( - color: Colors.black.withOpacity(0.12), - offset: const Offset(0, 8), - blurRadius: 32, - ), - ], - ), - child: Column( - mainAxisSize: MainAxisSize.min, - children: [ - // HTML: .confirm-text { font-size: 15px; font-weight: 600; line-height: 1.5; } - Text( - '新歌已生成,是否立即试听?', - style: GoogleFonts.outfit( - fontSize: 15, - fontWeight: FontWeight.w600, - color: const Color(0xFF374151), - height: 1.5, 
- ), - textAlign: TextAlign.center, - ), - const SizedBox(height: 18), - // Buttons - Row( - children: [ - // "稍后再听" — HTML: .confirm-btn.secondary - Expanded( - child: GestureDetector( - onTap: onLater, - child: Container( - height: 40, - decoration: BoxDecoration( - color: Colors.black.withOpacity(0.06), - borderRadius: BorderRadius.circular(20), - ), - alignment: Alignment.center, - child: Text( - '稍后再听', - style: GoogleFonts.dmSans( - fontSize: 14, - fontWeight: FontWeight.w600, - color: const Color(0xFF4B5563), - ), - ), - ), - ), - ), - const SizedBox(width: 10), - // "立即试听" — HTML: .confirm-btn.primary - Expanded( - child: GradientButton( - text: '立即试听', - height: 40, - gradient: appclr.AppColors.btnPlushGradient, - onPressed: onListen, - ), - ), - ], - ), - ], - ), - ), - ); - } -} diff --git a/airhub_app/lib/pages/profile/notification_page.dart b/airhub_app/lib/pages/profile/notification_page.dart index 1e52f3f..b141ed3 100644 --- a/airhub_app/lib/pages/profile/notification_page.dart +++ b/airhub_app/lib/pages/profile/notification_page.dart @@ -307,34 +307,34 @@ class _NotificationPageState extends State { ), ), - // ── 展开详情区域 ── - AnimatedCrossFade( - firstChild: const SizedBox.shrink(), - secondChild: Container( - width: double.infinity, - decoration: const BoxDecoration( - color: Color(0x80F9FAFB), // rgba(249, 250, 251, 0.5) - border: Border( - top: BorderSide( - color: Color(0x0D000000), // rgba(0,0,0,0.05) - ), - ), - ), - padding: const EdgeInsets.all(20), - child: Text( - notif.detail, - style: const TextStyle( - fontSize: 14, - color: Color(0xFF374151), - height: 1.7, - ), - ), + // ── Expandable detail area (animate height only; width stays full so the text is never squeezed into a vertical column) ── + ClipRect( + child: AnimatedSize( + duration: const Duration(milliseconds: 300), + curve: Curves.easeInOut, + child: isExpanded + ? 
Container( + width: double.infinity, + decoration: const BoxDecoration( + color: Color(0x80F9FAFB), + border: Border( + top: BorderSide( + color: Color(0x0D000000), + ), + ), + ), + padding: const EdgeInsets.all(20), + child: Text( + notif.detail, + style: const TextStyle( + fontSize: 14, + color: Color(0xFF374151), + height: 1.7, + ), + ), + ) + : const SizedBox(width: double.infinity, height: 0), ), - crossFadeState: isExpanded - ? CrossFadeState.showSecond - : CrossFadeState.showFirst, - duration: const Duration(milliseconds: 300), - sizeCurve: Curves.easeInOut, ), ], ), diff --git a/airhub_app/lib/pages/story_loading_page.dart b/airhub_app/lib/pages/story_loading_page.dart index 969ee71..061b9a7 100644 --- a/airhub_app/lib/pages/story_loading_page.dart +++ b/airhub_app/lib/pages/story_loading_page.dart @@ -1,74 +1,173 @@ import 'dart:async'; +import 'dart:convert'; import 'package:flutter/material.dart'; +import 'package:http/http.dart' as http; import 'story_detail_page.dart'; class StoryLoadingPage extends StatefulWidget { - const StoryLoadingPage({super.key}); + /// Selected story elements from the generator modal + final List characters; + final List scenes; + final List props; + + const StoryLoadingPage({ + super.key, + this.characters = const [], + this.scenes = const [], + this.props = const [], + }); @override State createState() => _StoryLoadingPageState(); } -class _StoryLoadingPageState extends State - with SingleTickerProviderStateMixin { +class _StoryLoadingPageState extends State { + static const String _kServerBase = 'http://localhost:3000'; + double _progress = 0.0; - String _loadingText = "构思故事中..."; - final List> _milestones = [ - {'pct': 0.2, 'text': "正在收集灵感碎片..."}, - {'pct': 0.5, 'text': "正在往故事里撒魔法粉..."}, - {'pct': 0.8, 'text': "正在编制最后的魔法..."}, - {'pct': 0.98, 'text': "大功告成!"}, - ]; + String _loadingText = '正在收集灵感碎片...'; + bool _hasError = false; @override void initState() { super.initState(); - _startLoading(); + _generateStory(); } - void 
_startLoading() { - // Total duration approx 3.5s (match Web 35ms * 100 steps) - Timer.periodic(const Duration(milliseconds: 35), (timer) { - if (!mounted) { - timer.cancel(); + Future _generateStory() async { + try { + // ── Start SSE request ── + final request = http.Request( + 'POST', + Uri.parse('$_kServerBase/api/create_story'), + ); + request.headers['Content-Type'] = 'application/json'; + request.body = jsonEncode({ + 'characters': widget.characters, + 'scenes': widget.scenes, + 'props': widget.props, + }); + + final client = http.Client(); + final response = await client.send(request).timeout( + const Duration(seconds: 180), + ); + + if (response.statusCode != 200) { + _showError('服务器响应异常 (${response.statusCode})'); + client.close(); return; } - setState(() { - _progress += 0.01; - // Check text updates - for (var m in _milestones) { - if ((_progress - m['pct'] as double).abs() < 0.01) { - _loadingText = m['text'] as String; + // ── Parse SSE stream ── + String buffer = ''; + String? storyTitle; + String? storyContent; + + await for (final chunk in response.stream.transform(utf8.decoder)) { + buffer += chunk; + + while (buffer.contains('\n\n')) { + final idx = buffer.indexOf('\n\n'); + final line = buffer.substring(0, idx).trim(); + buffer = buffer.substring(idx + 2); + + if (!line.startsWith('data: ')) continue; + final jsonStr = line.substring(6); + + try { + final event = jsonDecode(jsonStr) as Map; + final stage = event['stage'] as String? ?? ''; + final progress = (event['progress'] as num?)?.toDouble() ?? 0; + final message = event['message'] as String? ?? ''; + + if (!mounted) return; + + switch (stage) { + case 'connecting': + _updateProgress(progress / 100, '正在收集灵感碎片...'); + break; + case 'generating': + _updateProgress(progress / 100, '故事正在诞生...'); + break; + case 'parsing': + _updateProgress(progress / 100, '正在编制最后的魔法...'); + break; + case 'done': + storyTitle = event['title'] as String? ?? '卡皮巴拉的故事'; + storyContent = event['content'] as String? 
?? ''; + _updateProgress(1.0, '大功告成!'); + break; + case 'error': + _showError(message.isNotEmpty ? message : '故事生成失败,请重试'); + client.close(); + return; + } + } catch (e) { + debugPrint('SSE parse error: $e'); } } - }); - - if (_progress >= 1.0) { - timer.cancel(); - _navigateToDetail(); } + + client.close(); + + // ── Navigate to story detail ── + if (!mounted) return; + + if (storyTitle != null && storyContent != null && storyContent.isNotEmpty) { + // Brief pause to show "大功告成!" + await Future.delayed(const Duration(milliseconds: 600)); + if (!mounted) return; + + final result = await Navigator.of(context).push( + MaterialPageRoute( + builder: (context) => StoryDetailPage( + mode: StoryMode.generated, + story: { + 'title': storyTitle, + 'content': storyContent, + }, + ), + ), + ); + + // Pass the story data back to DeviceControlPage + if (mounted) { + if (result == 'saved') { + Navigator.of(context).pop({ + 'action': 'saved', + 'title': storyTitle, + 'content': storyContent, + }); + } else { + Navigator.of(context).pop(result); + } + } + } else { + _showError('AI 返回了空故事,请重试'); + } + } catch (e) { + debugPrint('Story generation error: $e'); + if (mounted) { + _showError('网络开小差了,再试一次~'); + } + } + } + + void _updateProgress(double progress, String text) { + if (!mounted) return; + setState(() { + _progress = progress.clamp(0.0, 1.0); + _loadingText = text; }); } - void _navigateToDetail() async { - // Use push instead of pushReplacement to properly return the result - final result = await Navigator.of(context).push( - MaterialPageRoute( - builder: (context) => StoryDetailPage( - mode: StoryMode.generated, - story: const { - 'title': '星际忍者的茶话会', - 'content': '在遥远的银河系边缘,有一个被星云包裹的神秘茶馆。今天,这里迎来了两位特殊的客人:刚执行完火星探测任务的宇航员波波,和正在追捕暗影怪兽的忍者小次郎。\n\n"这儿的重力好像有点不对劲?"波波飘在半空中,试图抓住飞来飞去的茶杯。小次郎则冷静地倒挂在天花板上,手里紧握着一枚手里剑——其实那是用来切月饼的。\n\n突然,桌上的魔法茶壶"噗"地一声喷出了七彩烟雾,一只会说话的卡皮巴拉钻了出来:"别打架,别打架,喝了这杯银河气泡茶,我们都是好朋友!"\n\n于是,宇宙中最奇怪的组合诞生了。他们决定,下一站,去黑洞边缘钓星星。', - }, - ), - ), - ); - - // Pass the result back to 
DeviceControlPage - if (mounted) { - Navigator.of(context).pop(result); - } + void _showError(String message) { + if (!mounted) return; + setState(() { + _hasError = true; + _loadingText = message; + }); } @override @@ -83,7 +182,7 @@ class _StoryLoadingPageState extends State Image.asset( 'assets/www/kapi_writing.png', width: 200, - height: 200, // Approximate + height: 200, errorBuilder: (c, e, s) => const Icon( Icons.edit_note, size: 100, @@ -92,26 +191,25 @@ class _StoryLoadingPageState extends State ), const SizedBox(height: 32), - // Text - HTML: font-size 18px, color #4B2404 (dark brown) + // Text Text( _loadingText, style: const TextStyle( - - fontSize: 18, // HTML: 18px - color: Color(0xFF4B2404), // HTML: dark chocolate brown + fontSize: 18, + color: Color(0xFF4B2404), fontWeight: FontWeight.w600, ), + textAlign: TextAlign.center, ), const SizedBox(height: 24), - // Progress Bar - HTML: height 12px, max-width 280px - // Track: rgba(201,150,114,0.2), Fill: gradient #ECCFA8 to #C99672 + // Progress Bar Container( - width: 280, // HTML: max-width 280px - height: 12, // HTML: height 12px + width: 280, + height: 12, decoration: BoxDecoration( - color: const Color(0xFFC99672).withOpacity(0.2), // Warm sand - borderRadius: BorderRadius.circular(6), // HTML: 6px + color: const Color(0xFFC99672).withOpacity(0.2), + borderRadius: BorderRadius.circular(6), ), child: ClipRRect( borderRadius: BorderRadius.circular(6), @@ -120,7 +218,6 @@ class _StoryLoadingPageState extends State widthFactor: _progress.clamp(0.0, 1.0), child: Container( decoration: const BoxDecoration( - // HTML: gradient #ECCFA8 to #C99672 gradient: LinearGradient( colors: [Color(0xFFECCFA8), Color(0xFFC99672)], ), @@ -129,6 +226,22 @@ class _StoryLoadingPageState extends State ), ), ), + + // Retry button (shown on error) + if (_hasError) ...[ + const SizedBox(height: 32), + TextButton( + onPressed: () => Navigator.of(context).pop(), + child: const Text( + '返回重试', + style: TextStyle( + fontSize: 16, 
+ color: Color(0xFFC99672), + fontWeight: FontWeight.w600, + ), + ), + ), + ], ], ), ), diff --git a/airhub_app/lib/services/music_generation_service.dart b/airhub_app/lib/services/music_generation_service.dart new file mode 100644 index 0000000..fceea1e --- /dev/null +++ b/airhub_app/lib/services/music_generation_service.dart @@ -0,0 +1,221 @@ +import 'dart:convert'; +import 'package:flutter/foundation.dart'; +import 'package:http/http.dart' as http; + +/// Lightweight singleton that runs music generation in the background. +/// Survives page navigation — results are held until the music page picks them up. +class MusicGenerationService { + MusicGenerationService._(); + static final MusicGenerationService instance = MusicGenerationService._(); + + static const String _kServerBase = 'http://localhost:3000'; + + // ── Current task state ── + bool _isGenerating = false; + double _progress = 0.0; // 0~100 + String _statusMessage = ''; + String _currentStage = ''; + + // ── Completed result (held until consumed) ── + MusicGenResult? _pendingResult; + + // ── Pending error (held until consumed) ── + String? _pendingError; + + // ── Callback for live UI updates (set by the music page when visible) ── + void Function(double progress, String stage, String message)? onProgress; + void Function(MusicGenResult result)? onComplete; + void Function(String error)? onError; + + // ── Getters ── + bool get isGenerating => _isGenerating; + double get progress => _progress; + String get statusMessage => _statusMessage; + String get currentStage => _currentStage; + + /// Check and consume any pending result (called when music page resumes). + MusicGenResult? consumePendingResult() { + final result = _pendingResult; + _pendingResult = null; + return result; + } + + /// Check and consume any pending error (called when music page resumes). + String? consumePendingError() { + final error = _pendingError; + _pendingError = null; + return error; + } + + /// Start a generation task. 
Safe to call even if page navigates away. + Future generate({required String text, required String mood}) async { + if (_isGenerating) return; // Only one task at a time + + _isGenerating = true; + _progress = 5; + _statusMessage = '正在连接 AI...'; + _currentStage = 'connecting'; + _pendingResult = null; + _pendingError = null; + onProgress?.call(_progress, _currentStage, _statusMessage); + + try { + final request = http.Request( + 'POST', + Uri.parse('$_kServerBase/api/create_music'), + ); + request.headers['Content-Type'] = 'application/json'; + request.body = jsonEncode({'text': text, 'mood': mood}); + + final client = http.Client(); + final response = await client.send(request).timeout( + const Duration(seconds: 360), + ); + + if (response.statusCode != 200) { + throw Exception('Server returned ${response.statusCode}'); + } + + // Parse SSE stream + String buffer = ''; + String? newTitle; + String? newLyrics; + String? newFilePath; + + await for (final chunk in response.stream.transform(utf8.decoder)) { + buffer += chunk; + + while (buffer.contains('\n\n')) { + final idx = buffer.indexOf('\n\n'); + final line = buffer.substring(0, idx).trim(); + buffer = buffer.substring(idx + 2); + + if (!line.startsWith('data: ')) continue; + final jsonStr = line.substring(6); + + try { + final event = jsonDecode(jsonStr) as Map; + final stage = event['stage'] as String? ?? ''; + final message = event['message'] as String? ?? ''; + + switch (stage) { + case 'lyrics': + _updateProgress(10, stage, 'AI 正在创作词曲...'); + break; + case 'lyrics_done': + case 'lyrics_fallback': + _updateProgress(25, stage, '词曲创作完成,准备生成音乐...'); + break; + case 'music': + _updateProgress(30, stage, '正在生成音乐,请耐心等待...'); + break; + case 'saving': + _updateProgress(90, stage, '音乐生成完成,正在保存...'); + break; + case 'done': + newFilePath = event['file_path'] as String?; + final metadata = event['metadata'] as Map?; + newLyrics = metadata?['lyrics'] as String? ?? 
''; + newTitle = metadata?['song_title'] as String?; + if ((newTitle == null || newTitle.isEmpty) && newFilePath != null) { + final fname = newFilePath.split('/').last; + newTitle = fname.replaceAll(RegExp(r'_\d{10,}\.mp3$'), ''); + } + _updateProgress(100, stage, '新歌出炉!'); + break; + case 'error': + _isGenerating = false; + _progress = 0; + final errMsg = message.isNotEmpty ? message : '网络开小差了,再试一次~'; + _statusMessage = errMsg; + if (onError != null) { + onError!(errMsg); + } else { + _pendingError = errMsg; + } + client.close(); + return; + } + } catch (e) { + debugPrint('SSE parse error: $e for line: $jsonStr'); + } + } + } + + client.close(); + + // Build result + _isGenerating = false; + _progress = 0; + + if (newFilePath != null) { + final result = MusicGenResult( + title: newTitle ?? '新歌', + lyrics: newLyrics ?? '', + audioUrl: '$_kServerBase/$newFilePath', + ); + + // Always store as pending first; callback decides whether to consume + _pendingResult = result; + onComplete?.call(result); + } + } catch (e) { + debugPrint('Generate music error: $e'); + _isGenerating = false; + _progress = 0; + const errMsg = '网络开小差了,再试一次~'; + _statusMessage = errMsg; + if (onError != null) { + onError!(errMsg); + } else { + _pendingError = errMsg; + } + } + } + + void _updateProgress(double progress, String stage, String message) { + _progress = progress; + _currentStage = stage; + _statusMessage = message; + onProgress?.call(progress, stage, message); + } + + /// Fetch saved songs from the server (scans Capybara music/ folder). + Future> fetchPlaylist() async { + try { + final response = await http.get( + Uri.parse('$_kServerBase/api/playlist'), + ).timeout(const Duration(seconds: 10)); + + if (response.statusCode != 200) return []; + + final data = jsonDecode(response.body) as Map; + final list = data['playlist'] as List? ?? []; + + return list.map((item) { + final m = item as Map; + return MusicGenResult( + title: m['title'] as String? ?? 
'', + lyrics: m['lyrics'] as String? ?? '', + audioUrl: '$_kServerBase/${m['audioUrl'] as String? ?? ''}', + ); + }).toList(); + } catch (e) { + debugPrint('Fetch playlist error: $e'); + return []; + } + } +} + +/// Result of a completed music generation. +class MusicGenResult { + final String title; + final String lyrics; + final String audioUrl; + + const MusicGenResult({ + required this.title, + required this.lyrics, + required this.audioUrl, + }); +} diff --git a/airhub_app/lib/widgets/story_generator_modal.dart b/airhub_app/lib/widgets/story_generator_modal.dart index 25df396..3896166 100644 --- a/airhub_app/lib/widgets/story_generator_modal.dart +++ b/airhub_app/lib/widgets/story_generator_modal.dart @@ -338,8 +338,30 @@ class _StoryGeneratorModalState extends State { _showSnack('请至少选择一个元素'); return; } - // Return 'start_generation' to trigger full-screen loading flow - Navigator.pop(context, 'start_generation'); + + // Categorize selected elements by type + final characters = []; + final scenes = []; + final props = []; + for (final el in _selectedElements) { + final id = el['id'] ?? ''; + final name = el['name'] ?? 
''; + if (id.startsWith('c')) { + characters.add(name); + } else if (id.startsWith('s')) { + scenes.add(name); + } else if (id.startsWith('p')) { + props.add(name); + } + } + + // Return selected elements as a Map + Navigator.pop(context, { + 'action': 'start_generation', + 'characters': characters, + 'scenes': scenes, + 'props': props, + }); }, ), ), diff --git a/airhub_app/pubspec.lock b/airhub_app/pubspec.lock index a8aeb50..6eef25d 100644 --- a/airhub_app/pubspec.lock +++ b/airhub_app/pubspec.lock @@ -518,7 +518,7 @@ packages: source: hosted version: "4.3.0" http: - dependency: transitive + dependency: "direct main" description: name: http sha256: "87721a4a50b19c7f1d49001e51409bddc46303966ce89a65af4f4e6004896412" diff --git a/airhub_app/pubspec.yaml b/airhub_app/pubspec.yaml index 1949647..f393c9c 100644 --- a/airhub_app/pubspec.yaml +++ b/airhub_app/pubspec.yaml @@ -58,6 +58,7 @@ dependencies: flutter_svg: ^2.0.9 image_picker: ^1.2.1 just_audio: ^0.9.42 + http: ^1.2.0 flutter: uses-material-design: true diff --git a/prompts/music_director.md b/prompts/music_director.md index c9eab20..d06828f 100644 --- a/prompts/music_director.md +++ b/prompts/music_director.md @@ -15,13 +15,19 @@ 请严格按照以下 JSON 格式输出: { + "song_title": "...", "style": "...", "lyrics": "..." } ### 字段说明: -1. **style** (风格描述) +1. **song_title** (歌曲名称) + - 使用**中文**,简短有趣,3-8个字。 + - 体现咔咔的可爱风格。 + - 示例:"温泉咔咔乐"、"草地蹦蹦跳"、"雨夜安眠曲" + +2. **style** (风格描述) - 使用**英文**描述音乐风格、乐器、节奏、情绪。 - 长度 50-100 词。 - 必须包含以下维度: @@ -31,7 +37,7 @@ - 特色乐器 (如 piano, ukulele, synth, brass) - 示例:"Chill Lofi hip-hop, mellow piano chords, vinyl crackle, slow tempo, relaxing, water sounds in background, perfect for spa and meditation" -2. **lyrics** (歌词) +3. 
**lyrics** (歌词) - 使用**中文**书写歌词。 - 必须包含结构标签:[verse], [chorus], [outro] 等。 - 内容应: @@ -53,7 +59,7 @@ ### 重要规则: - 如果用户输入太模糊(如"嗯"、"不知道"),请发挥想象力,赋予咔咔此刻最可能在做的事。 -- 歌词长度控制在 4-8 行即可,不要太长。 +- 歌词必须包含完整结构:至少 [verse 1] + [chorus] + [verse 2] + [chorus] + [outro],总共 16-24 行。这样才能生成完整的歌曲(60秒以上)。歌词太短会导致音乐只有20-30秒,绝对不可以! - 不要输出任何解释性文字,只输出 JSON。 ``` diff --git a/prompts/story_director.md b/prompts/story_director.md new file mode 100644 index 0000000..61dc9bd --- /dev/null +++ b/prompts/story_director.md @@ -0,0 +1,34 @@ +# 角色 + +你是「卡皮巴拉故事工坊」的首席故事大师。你为 3-8 岁的小朋友创作原创童话故事。 + +# 任务 + +根据用户提供的**角色、场景、道具**素材,创作一个完整的儿童故事。 + +# 输出格式 + +你 **必须** 只返回如下 JSON,不要返回任何其他内容(不要 markdown 代码块,不要解释): + +``` +{"title": "故事标题(6字以内)", "content": "故事正文"} +``` + +# 故事创作规范 + +1. **字数**:正文 400-600 字,不要太短也不要太长 +2. **段落**:用 `\n\n` 分段,每段 2-4 句话 +3. **语言**:简单易懂,适合给小朋友朗读;可以包含拟声词("哗啦啦"、"咕噜噜")和语气词("哇!"、"嘿嘿") +4. **结构**:开头引入角色和场景 → 中间遇到挑战或趣事 → 结尾温馨圆满 +5. **情感**:温暖、有趣、充满想象力,带一点小幽默 +6. **教育**:自然融入一个小道理(勇气、友谊、分享等),不要说教 +7. **创意**:即使收到相同的素材组合,每次也要创作全新的、不同的故事情节 +8. **角色融合**:所有用户选择的角色、场景、道具都必须在故事中出现并发挥作用 +9. 
**标题**:简短有趣,6 个字以内,能引起小朋友的好奇心 + +# 素材示例 + +用户输入:角色=[宇航员, 忍者],场景=[太空],道具=[魔法棒] + +你的输出: +{"title": "太空忍者大冒险", "content": "在遥远的银河边缘,住着一个叫小星的宇航员...(故事正文)"} diff --git a/server.py b/server.py index fcf48a6..0280680 100644 --- a/server.py +++ b/server.py @@ -1,5 +1,6 @@ import os import re +import sys import time import uvicorn import requests @@ -10,6 +11,11 @@ from fastapi.middleware.cors import CORSMiddleware from pydantic import BaseModel from dotenv import load_dotenv +# Force UTF-8 stdout/stderr on Windows (avoids GBK encoding errors) +if sys.platform == "win32": + sys.stdout.reconfigure(encoding="utf-8", errors="replace") + sys.stderr.reconfigure(encoding="utf-8", errors="replace") + # Load Environment Variables load_dotenv() MINIMAX_API_KEY = os.getenv("MINIMAX_API_KEY") @@ -17,6 +23,8 @@ VOLCENGINE_API_KEY = os.getenv("VOLCENGINE_API_KEY") if not MINIMAX_API_KEY: print("Warning: MINIMAX_API_KEY not found in .env") +if not VOLCENGINE_API_KEY: + print("Warning: VOLCENGINE_API_KEY not found in .env") # Initialize FastAPI app = FastAPI() @@ -35,12 +43,17 @@ class MusicRequest(BaseModel): text: str mood: str = "custom" # 'chill', 'happy', 'sleepy', 'random', 'custom' +class StoryRequest(BaseModel): + characters: list[str] = [] + scenes: list[str] = [] + props: list[str] = [] + # Minimax Constants MINIMAX_GROUP_ID = "YOUR_GROUP_ID" BASE_URL_CHAT = "https://api.minimax.chat/v1/text/chatcompletion_v2" BASE_URL_MUSIC = "https://api.minimaxi.com/v1/music_generation" -# Load System Prompt +# Load System Prompts try: with open("prompts/music_director.md", "r", encoding="utf-8") as f: SYSTEM_PROMPT = f.read() @@ -48,10 +61,46 @@ except FileNotFoundError: SYSTEM_PROMPT = "You are a music director AI. Convert user input into JSON with 'style' (English description) and 'lyrics' (Chinese, structured)." 
print("Warning: prompts/music_director.md not found, using default.") +try: + with open("prompts/story_director.md", "r", encoding="utf-8") as f: + STORY_SYSTEM_PROMPT = f.read() +except FileNotFoundError: + STORY_SYSTEM_PROMPT = "你是一个儿童故事大师。根据用户提供的角色、场景、道具素材创作一个300-600字的儿童故事。只返回JSON格式:{\"title\": \"标题\", \"content\": \"正文\"}" + print("Warning: prompts/story_director.md not found, using default.") + +# Volcengine / Doubao constants +DOUBAO_BASE_URL = "https://ark.cn-beijing.volces.com/api/v3/chat/completions" +DOUBAO_MODEL = "doubao-seed-1-6-lite-251015" # Doubao-Seed-1.6-lite + def sse_event(data): - """Format a dict as an SSE data line.""" - return f"data: {json.dumps(data, ensure_ascii=False)}\n\n" + """Format a dict as an SSE data line. + Use ensure_ascii=True so all non-ASCII chars become \\uXXXX escapes, + avoiding Windows GBK encoding issues in the SSE stream.""" + return f"data: {json.dumps(data, ensure_ascii=True)}\n\n" + + +def clean_lyrics(raw: str) -> str: + """Clean lyrics extracted from LLM JSON output. + Removes JSON artifacts, structure tags, and normalizes formatting.""" + if not raw: + return raw + s = raw + # Replace literal \n with real newlines + s = s.replace("\\n", "\n") + # Remove JSON string quotes and concatenation artifacts (" ") + s = re.sub(r'"\s*"', '', s) + s = s.replace('"', '') + # Remove structure tags like [verse 1], [chorus], [outro], [bridge], [intro], etc. 
+ s = re.sub(r'\[(?:verse|chorus|bridge|outro|intro|hook|pre-chorus|interlude|inst)\s*\d*\]\s*', '', s, flags=re.IGNORECASE) + # Strip leading/trailing whitespace from each line + lines = [line.strip() for line in s.split('\n')] + s = '\n'.join(lines) + # Collapse 3+ consecutive newlines into 2 (one blank line between paragraphs) + s = re.sub(r'\n{3,}', '\n\n', s) + # Remove leading/trailing blank lines + s = s.strip() + return s @app.post("/api/create_music") @@ -87,9 +136,10 @@ def create_music(req: MusicRequest): "messages": [ {"role": "system", "content": SYSTEM_PROMPT}, {"role": "user", "content": director_input} - ] + ], + "max_tokens": 2048 # Enough for long lyrics }, - timeout=30 + timeout=60 ) chat_data = chat_resp.json() @@ -103,16 +153,55 @@ def create_music(req: MusicRequest): content_str = content_str.strip() if content_str.startswith("```"): content_str = re.sub(r'^```\w*\n?', '', content_str) - content_str = re.sub(r'```$', '', content_str).strip() - # Try to extract JSON from response + content_str = re.sub(r'```\s*$', '', content_str).strip() + + # Try to extract JSON from response (robust parsing) json_match = re.search(r'\{[\s\S]*\}', content_str) if json_match: - metadata = json.loads(json_match.group()) + json_str = json_match.group() + try: + metadata = json.loads(json_str) + except json.JSONDecodeError: + # JSON might have unescaped newlines in string values — try fixing + log(f"[Warn] JSON parse failed, attempting repair...") + # Extract fields manually via regex + title_m = re.search(r'"song_title"\s*:\s*"([^"]*)"', json_str) + style_m = re.search(r'"style"\s*:\s*"([^"]*)"', json_str) + lyrics_m = re.search(r'"lyrics"\s*:\s*"([\s\S]*)', json_str) + lyrics_val = "" + if lyrics_m: + # Take everything after "lyrics": " and strip trailing quotes/braces + lyrics_val = lyrics_m.group(1) + lyrics_val = re.sub(r'"\s*\}\s*$', '', lyrics_val).strip() + metadata = { + "song_title": title_m.group(1) if title_m else "", + "style": style_m.group(1) if 
style_m else "Pop music, cheerful", + "lyrics": lyrics_val + } + log(f"[Repaired] title={metadata['song_title']}, style={metadata['style'][:60]}") + elif content_str.strip().startswith("{"): + # JSON is incomplete (missing closing brace) — try adding it + log(f"[Warn] Incomplete JSON, attempting to close...") + try: + metadata = json.loads(content_str + '"}\n}') + except json.JSONDecodeError: + # Manual extraction as last resort + title_m = re.search(r'"song_title"\s*:\s*"([^"]*)"', content_str) + style_m = re.search(r'"style"\s*:\s*"([^"]*)"', content_str) + lyrics_m = re.search(r'"lyrics"\s*:\s*"([\s\S]*)', content_str) + lyrics_val = lyrics_m.group(1).rstrip('"} \n') if lyrics_m else "[Inst]" + metadata = { + "song_title": title_m.group(1) if title_m else "", + "style": style_m.group(1) if style_m else "Pop music, cheerful", + "lyrics": lyrics_val + } + log(f"[Repaired] title={metadata.get('song_title')}") else: raise ValueError(f"No JSON in LLM response: {content_str[:100]}") style_val = metadata.get("style", "") - lyrics_val = metadata.get("lyrics", "") + lyrics_val = clean_lyrics(metadata.get("lyrics", "")) + metadata["lyrics"] = lyrics_val # Store cleaned version log(f"[Director] Style: {style_val[:80]}") log(f"[Director] Lyrics (first 60): {lyrics_val[:60]}") @@ -167,7 +256,7 @@ def create_music(req: MusicRequest): "Content-Type": "application/json" }, json=music_payload, - timeout=120 + timeout=300 # 5 min — music generation can be slow ) music_data = music_resp.json() @@ -188,7 +277,9 @@ def create_music(req: MusicRequest): save_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara music") os.makedirs(save_dir, exist_ok=True) - safe_name = re.sub(r'[^\w\u4e00-\u9fff]', '', req.text)[:20] or "ai_song" + # Prefer song_title from LLM; fallback to user input + raw_title = metadata.get("song_title") or req.text + safe_name = re.sub(r'[^\w\u4e00-\u9fff]', '', raw_title)[:20] or "ai_song" filename = f"{safe_name}_{int(time.time())}.mp3" filepath = 
os.path.join(save_dir, filename) @@ -253,6 +344,230 @@ def create_music(req: MusicRequest): ) +# ═══════════════════════════════════════════════════════════════════ +# ── Story Generation (Doubao / Volcengine) ── +# ═══════════════════════════════════════════════════════════════════ + +@app.post("/api/create_story") +def create_story(req: StoryRequest): + """SSE streaming endpoint – generates a children's story via Doubao LLM.""" + print(f"[Story] Received request: characters={req.characters}, scenes={req.scenes}, props={req.props}", flush=True) + + def event_stream(): + def log(msg): + print(msg, flush=True) + + # ── Stage 1: Connecting ── + yield sse_event({"stage": "connecting", "progress": 5, "message": "正在连接 AI..."}) + + # Build user prompt from selected elements + parts = [] + if req.characters: + parts.append(f"角色=[{', '.join(req.characters)}]") + if req.scenes: + parts.append(f"场景=[{', '.join(req.scenes)}]") + if req.props: + parts.append(f"道具=[{', '.join(req.props)}]") + user_prompt = "请用这些素材创作一个故事:" + ",".join(parts) if parts else "请随机创作一个有趣的儿童故事" + + log(f"[Story] User prompt: {user_prompt}") + + # ── Stage 2: Generating (streaming) ── + yield sse_event({"stage": "generating", "progress": 10, "message": "故事正在诞生..."}) + + try: + # Explicitly encode as UTF-8 to avoid Windows GBK encoding issues + payload = json.dumps({ + "model": DOUBAO_MODEL, + "messages": [ + {"role": "system", "content": STORY_SYSTEM_PROMPT}, + {"role": "user", "content": user_prompt}, + ], + "max_tokens": 2048, + "stream": True, + "thinking": {"type": "disabled"}, + }, ensure_ascii=False) + + resp = requests.post( + DOUBAO_BASE_URL, + headers={ + "Authorization": f"Bearer {VOLCENGINE_API_KEY}", + "Content-Type": "application/json; charset=utf-8", + }, + data=payload.encode("utf-8"), + stream=True, + timeout=120, + ) + + if resp.status_code != 200: + log(f"[Error] Doubao API returned {resp.status_code}: {resp.text[:300]}") + yield sse_event({"stage": "error", "progress": 0, "message": 
f"AI 服务返回异常 ({resp.status_code})"}) + return + + # Force UTF-8 decoding (requests defaults to ISO-8859-1 which garbles Chinese) + resp.encoding = "utf-8" + + # Parse SSE stream from Doubao + full_content = "" + chunk_count = 0 + + for line in resp.iter_lines(decode_unicode=True): + if not line or not line.startswith("data: "): + continue + data_str = line[6:] # strip "data: " + if data_str.strip() == "[DONE]": + break + + try: + chunk_data = json.loads(data_str) + choices = chunk_data.get("choices", []) + if choices: + delta = choices[0].get("delta", {}) + delta_content = delta.get("content", "") + if delta_content: + full_content += delta_content + chunk_count += 1 + # Send progress updates every 5 chunks + if chunk_count % 5 == 0: + progress = min(10 + int(chunk_count * 0.8), 85) + yield sse_event({ + "stage": "generating", + "progress": progress, + "message": "故事正在诞生...", + }) + except json.JSONDecodeError: + continue + + log(f"[Story] Stream done. Total chunks: {chunk_count}, content length: {len(full_content)}") + log(f"[Story] Raw output (first 200): {full_content[:200]}") + + if not full_content.strip(): + yield sse_event({"stage": "error", "progress": 0, "message": "AI 未返回故事内容"}) + return + + # ── Stage 3: Parse response ── + yield sse_event({"stage": "parsing", "progress": 90, "message": "正在整理故事..."}) + + # Clean up response — strip markdown fences if present + cleaned = full_content.strip() + if cleaned.startswith("```"): + cleaned = re.sub(r'^```\w*\n?', '', cleaned) + cleaned = re.sub(r'```\s*$', '', cleaned).strip() + + # Try to parse JSON + title = "" + content = "" + + json_match = re.search(r'\{[\s\S]*\}', cleaned) + if json_match: + try: + story_json = json.loads(json_match.group()) + title = story_json.get("title", "") + content = story_json.get("content", "") + except json.JSONDecodeError: + log("[Warn] JSON parse failed, extracting manually...") + title_m = re.search(r'"title"\s*:\s*"([^"]*)"', cleaned) + content_m = 
re.search(r'"content"\s*:\s*"([\s\S]*)', cleaned) + title = title_m.group(1) if title_m else "卡皮巴拉的故事" + if content_m: + content = content_m.group(1) + content = re.sub(r'"\s*\}\s*$', '', content).strip() + + if not title and not content: + # Not JSON at all — treat entire output as story content + title = "卡皮巴拉的故事" + content = cleaned + + # Clean content: replace literal \n with real newlines + content = content.replace("\\n", "\n").strip() + # Collapse 3+ newlines into 2 + content = re.sub(r'\n{3,}', '\n\n', content) + + log(f"[Story] Title: {title}") + log(f"[Story] Content (first 100): {content[:100]}") + + # ── Save story to disk ── + save_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara stories") + os.makedirs(save_dir, exist_ok=True) + safe_name = re.sub(r'[^\w\u4e00-\u9fff]', '', title)[:20] or "story" + filename = f"{safe_name}_{int(time.time())}.txt" + filepath = os.path.join(save_dir, filename) + with open(filepath, "w", encoding="utf-8") as f: + f.write(f"# {title}\n\n{content}") + log(f"[Saved] {filepath}") + + # ── Done ── + yield sse_event({ + "stage": "done", + "progress": 100, + "message": "故事创作完成!", + "title": title, + "content": content, + }) + + except requests.exceptions.Timeout: + log("[Error] Doubao API Timeout") + yield sse_event({"stage": "error", "progress": 0, "message": "AI 响应超时,请稍后再试"}) + except Exception as e: + log(f"[Error] Story generation exception: {e}") + yield sse_event({"stage": "error", "progress": 0, "message": f"故事生成失败: {str(e)}"}) + + return StreamingResponse( + event_stream(), + media_type="text/event-stream", + headers={ + "Cache-Control": "no-cache", + "X-Accel-Buffering": "no", + "Connection": "keep-alive", + }, + ) + + +@app.get("/api/stories") +def get_stories(): + """Scan Capybara stories/ directory and return all saved stories.""" + stories_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara stories") + + stories = [] + if not os.path.isdir(stories_dir): + return {"stories": []} + + for f in 
sorted(os.listdir(stories_dir), reverse=True): # newest first + if not f.lower().endswith(".txt"): + continue + + filepath = os.path.join(stories_dir, f) + try: + with open(filepath, "r", encoding="utf-8") as fh: + raw = fh.read() + + # Parse: first line is "# Title", rest is content + lines = raw.strip().split("\n", 2) + title = lines[0].lstrip("# ").strip() if lines else f[:-4] + content = lines[2].strip() if len(lines) > 2 else "" + + # Skip garbled files: if title or content has mojibake patterns, skip + # Normal Chinese chars are in range \u4e00-\u9fff; mojibake typically has + # lots of Latin Extended chars like \u00e0-\u00ff mixed with CJK + if title and not any('\u4e00' <= c <= '\u9fff' for c in title): + continue # title has no Chinese chars at all → likely garbled + + # Display title: strip timestamp suffix like _1770647563 + display_title = re.sub(r'_\d{10,}$', '', f[:-4]) + if title: + display_title = title + + stories.append({ + "title": display_title, + "content": content, + "filename": f, + }) + except Exception: + pass + + return {"stories": stories} + + @app.get("/api/playlist") def get_playlist(): """Scan Capybara music/ directory and return full playlist with lyrics.""" @@ -291,6 +606,15 @@ def get_playlist(): return {"playlist": playlist} +# ── Static file serving for generated music ── +from fastapi.staticfiles import StaticFiles + +# Create music directory if it doesn't exist +_music_dir = os.path.join(os.path.dirname(__file__) or ".", "Capybara music") +os.makedirs(_music_dir, exist_ok=True) +app.mount("/Capybara music", StaticFiles(directory=_music_dir), name="music_files") + + if __name__ == "__main__": print("[Server] Music Server running on http://localhost:3000") uvicorn.run(app, host="0.0.0.0", port=3000) diff --git a/阶段总结/session_progress.md b/阶段总结/session_progress.md index 1aa9fe6..852908d 100644 --- a/阶段总结/session_progress.md +++ b/阶段总结/session_progress.md @@ -3,7 +3,7 @@ > **用途**:每次对话结束前 / 做完一个阶段后更新此文件。 > 新对话开始时,AI 先读此文件恢复上下文。 > -> 
**最后更新**:2026-02-08 (第七次对话) +> **最后更新**:2026-02-09 (第八次对话) --- @@ -105,8 +105,59 @@ #### 选中态交互修复 - 生成完成后自动清除 `_selectedMoodIndex`,不再残留选中状态 +### 第八次对话完成的工作(2026-02-09) + +#### 音乐生成 API 全链路接入(第七次对话中完成,此处补记) +- **SSE 实时进度**:点击心情卡片 → 发 POST 到后端 → SSE 流式推送 lyrics/music/saving/done/error 各阶段 +- **MusicGenerationService 单例**:生成任务在后台运行,页面切走不中断 +- **切回页面恢复**:切回音乐页弹窗通知生成结果,不在其他页面自动播放 +- **进度环动画**:环形光晕进度条(匹配 HTML 版视觉)+ 翻面歌词时仍可见 +- **进度条防闪烁**:用 `_crawlId` 取消令牌确保同一时间只有一条爬升动画 +- **超时友好提示**:气泡显示"网络开小差了,再试一次~" +- **歌词清洗**:前后端双重清理(去 `\n`、去 `[verse]` 等结构标签、去 JSON 引号) +- **歌名修复**:后端优先取 LLM 返回的 `song_title`,歌词增长至 16-24 行 +- **对话气泡样式**:文字垂直居中 + 三角尾巴 +- **通知页展开 bug 修复**:`AnimatedCrossFade` → `ClipRect + AnimatedSize` 避免文字竖排 + +#### 启动时加载历史歌曲 +- **后端**:`/api/playlist` 接口扫描 `Capybara music/` 目录,返回所有 mp3 + 歌词 +- **Service 层**:`MusicGenerationService.fetchPlaylist()` 拉取列表 +- **前端**:`initState` 异步调用,将服务器歌曲插入唱片架最前面(去重,不重复加载硬编码的 4 首) +- **当前服务器上有 19 首 AI 生成歌曲 + 4 首原始歌曲,重编译后唱片架不再丢失** + +#### 故事生成接入豆包 API +- **后端**:`server.py` 新增 `/api/create_story` 接口,SSE 流式调用豆包 Chat API +- **模型**:Doubao-Seed-1.6-lite(`doubao-seed-1-6-lite-251015`),关闭深度思考加快响应 +- **Prompt**:`prompts/story_director.md`,儿童故事创作大师,400-600 字,JSON 输出 +- **前端串联**: + - `StoryGeneratorModal` 返回选中素材 Map(角色/场景/道具分类) + - `DeviceControlPage` 把素材传给 `StoryLoadingPage` + - `StoryLoadingPage` 调真实 API,SSE 实时进度(连接→生成→解析→完成) + - `StoryDetailPage` 无需改动,已支持接收动态故事数据 +- **故事保存**:生成的故事文本保存到 `Capybara stories/` 目录 +- **错误处理**:超时提示、API 异常、空内容兜底,错误时显示"返回重试"按钮 + +#### 唱片架高度优化 +- 限制唱片架弹窗高度为 3.5 行,超出部分滚动查看 +- 半行露出作为"还有更多"的视觉提示 + +#### Windows 编码问题全面修复 +- `sys.stdout` / `sys.stderr` 强制 UTF-8(Windows 默认 GBK 导致中文乱码) +- Doubao API 请求体手动 `json.dumps + encode("utf-8")`,避免 `requests` 用 GBK 编码 +- SSE 流使用 `ensure_ascii=True`,确保前端 `jsonDecode` 100% 正常 +- `resp.encoding = "utf-8"` 强制豆包返回流按 UTF-8 解码 +- 已清理 4 个编码出错时保存的乱码故事文件 + +#### 书架加载历史故事 +- **后端**:`/api/stories` 接口扫描 `Capybara stories/` 返回所有故事标题+内容 +- **前端**:`initState` 异步拉取,历史故事排在预设故事之后 +- 
**保存联动**:新生成的故事保存后,真实标题+内容即时加入书架(不再用 mock 数据) +- **封面区分**:预设故事显示封面图,AI 生成的故事显示淡紫渐变"暂无封面"占位 +- **乱码过滤**:API 层自动跳过无中文标题的异常文件 + ### 正在做的 -- 下一步待定 +- TTS 语音合成待后续接入(用户去开通火山语音服务后再做) +- 故事封面方案待定(付费生成 or 免费生成) ---
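
The progress log above says the SSE stream is emitted with `ensure_ascii=True` so that the client's JSON decoder never sees raw non-ASCII bytes. A minimal sketch of why that works is below. `sse_event` has the same shape as the server helper in this change, while `parse_sse` is a hypothetical Python stand-in for the Dart client's buffering loop, not code from the repository. It also shows that blank lines inside a story body cannot break the `\n\n` frame splitting, because `json.dumps` escapes them.

```python
import json


def sse_event(data: dict) -> str:
    """Same shape as the server helper: one `data:` line per event."""
    return f"data: {json.dumps(data, ensure_ascii=True)}\n\n"


def parse_sse(chunks):
    """Yield decoded events from an iterable of text chunks.

    Hypothetical helper mirroring the Dart client's loop: buffer the
    incoming text, split on blank lines, and skip frames that are not
    `data: ` lines or that fail to decode as JSON.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            frame = frame.strip()
            if not frame.startswith("data: "):
                continue
            try:
                yield json.loads(frame[len("data: "):])
            except json.JSONDecodeError:
                continue


# Two events; the second carries a paragraph break inside the story body.
wire = sse_event({"stage": "generating", "progress": 10, "message": "故事正在诞生..."})
wire += sse_event({"stage": "done", "progress": 100,
                   "title": "太空忍者", "content": "第一段\n\n第二段"})

# Feed the stream in 7-character pieces to simulate arbitrary network chunking.
pieces = [wire[i:i + 7] for i in range(0, len(wire), 7)]
events = list(parse_sse(pieces))

print(wire.isascii())        # the bytes on the wire are plain ASCII
print(events[1]["content"])  # the decoded text has its real newlines back
```

The trade-off is a larger payload: each CJK character becomes a six-character `\uXXXX` escape. In exchange, the stream no longer depends on how any intermediate layer guesses the encoding.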