Merge branch 'harry0703:main' into main

cpanel10x 2024-04-16 15:28:30 +07:00, committed by GitHub
commit 7e8c901fd4
24 changed files with 619 additions and 202 deletions


@@ -21,3 +21,4 @@ __pycache__/
.svn/
storage/
config.toml

.gitignore (vendored, 4 changes)

@@ -9,4 +9,6 @@
/app/utils/__pycache__/
/*/__pycache__/*
.vscode
/**/.streamlit
__pycache__
logs/


@@ -4,7 +4,7 @@ FROM python:3.10-slim
# Set the working directory in the container
WORKDIR /MoneyPrinterTurbo
ENV PYTHONPATH="/MoneyPrinterTurbo:$PYTHONPATH"
ENV PYTHONPATH="/MoneyPrinterTurbo"
# Install system dependencies
RUN apt-get update && apt-get install -y \
@@ -17,11 +17,7 @@ RUN apt-get update && apt-get install -y \
RUN sed -i '/<policy domain="path" rights="none" pattern="@\*"/d' /etc/ImageMagick-6/policy.xml
# Copy the current directory contents into the container at /MoneyPrinterTurbo
COPY ./app ./app
COPY ./webui ./webui
COPY ./resource ./resource
COPY ./requirements.txt ./requirements.txt
COPY ./main.py ./main.py
COPY . .
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
@@ -30,8 +26,13 @@ RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8501
# Command to run the application
CMD ["streamlit", "run", "./webui/Main.py","--browser.serverAddress=0.0.0.0","--server.enableCORS=True","--browser.gatherUsageStats=False"]
CMD ["streamlit", "run", "./webui/Main.py","--browser.serverAddress=127.0.0.1","--server.enableCORS=True","--browser.gatherUsageStats=False"]
# At runtime, mount the config.toml file from the host into the container
# using Docker volumes. Example usage:
# docker run -v ./config.toml:/MoneyPrinterTurbo/config.toml -v ./storage:/MoneyPrinterTurbo/storage -p 8501:8501 moneyprinterturbo
# 1. Build the Docker image using the following command
# docker build -t moneyprinterturbo .
# 2. Run the Docker container using the following command
## For Linux or macOS:
# docker run -v $(pwd)/config.toml:/MoneyPrinterTurbo/config.toml -v $(pwd)/storage:/MoneyPrinterTurbo/storage -p 8501:8501 moneyprinterturbo
## For Windows:
# docker run -v %cd%/config.toml:/MoneyPrinterTurbo/config.toml -v %cd%/storage:/MoneyPrinterTurbo/storage -p 8501:8501 moneyprinterturbo


@@ -10,9 +10,9 @@
<h3>English | <a href="README.md">简体中文</a></h3>
> Thanks to [RootFTW](https://github.com/Root-FTW) for the translation
<div align="center">
<a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FMoneyPrinterTurbo | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>
Simply provide a <b>topic</b> or <b>keyword</b> for a video, and it will automatically generate the video copy, video
materials, video subtitles, and video background music before synthesizing a high-definition short video.
@@ -59,8 +59,7 @@ https://reccloud.com
- [x] Supports integration with various models such as **OpenAI**, **moonshot**, **Azure**, **gpt4free**, **one-api**,
**qianwen**, **Google Gemini**, **Ollama** and more
❓[How to Use the Free OpenAI GPT-3.5 Model?](https://github.com/harry0703/MoneyPrinterTurbo/blob/main/README-en.md#common-questions-)
### Future Plans 📅
@@ -261,12 +260,16 @@ own fonts.
## Common Questions 🤔
### ❓How to Use the Free OpenAI GPT-3.5 Model?
[OpenAI has announced that ChatGPT with 3.5 is now free](https://openai.com/blog/start-using-chatgpt-instantly), and developers have wrapped it into an API for direct usage.
**Ensure you have Docker installed and running**. Execute the following command to start the Docker service:
```shell
docker run -p 3040:3040 missuo/freegpt35
```
Once successfully started, modify the `config.toml` configuration as follows:
- Set `llm_provider` to `openai`


@@ -9,6 +9,9 @@
</p>
<br>
<h3>简体中文 | <a href="README-en.md">English</a></h3>
<div align="center">
<a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FMoneyPrinterTurbo | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>
<br>
Simply provide a video <b>topic</b> or <b>keyword</b>, and it will fully automatically generate the video script, video materials, video subtitles, and video background music, then synthesize a high-definition short video.
<br>
@@ -26,7 +29,6 @@
## Special Thanks 🙏
Since the **deployment** and **usage** of this project still pose **a certain barrier** for beginner users, special thanks go to
**RecCloud (录咖, an AI-powered multimedia service platform)**, which provides a free `AI Video Generator` service based on this project. It requires no deployment and can be used online directly, which is very convenient.
- Chinese version: https://reccloud.cn
@@ -34,6 +36,14 @@
![](docs/reccloud.cn.jpg)
## Thanks to Our Sponsor 🙏
Thanks to PicWish (佐糖) https://picwish.cn for supporting and sponsoring this project, enabling its continuous updates and maintenance.
PicWish focuses on the **image-processing field**, providing a rich set of **image-processing tools** that radically simplify complex operations, truly making image processing easier.
![picwish.jpg](docs/picwish.jpg)
## Features 🎯
- [x] Complete **MVC architecture** with a **clear code structure**, easy to maintain; supports both `API` and `Web UI`
@@ -50,7 +60,8 @@
- [x] Video materials are **high-definition** and **royalty-free**
- [x] Supports integration with **OpenAI**, **moonshot**, **Azure**, **gpt4free**, **one-api**, **Qwen (通义千问)**, **Google Gemini**, **Ollama**, and more
❓[How to use the free **OpenAI GPT-3.5** model?](https://github.com/harry0703/MoneyPrinterTurbo?tab=readme-ov-file#%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98-)
### Future Plans 📅
@@ -59,13 +70,26 @@
- [ ] Add video transition effects to make playback smoother
- [ ] Add more video material sources and improve the match between materials and script
- [ ] Add video length options: short, medium, long
- [ ] Package a one-click launcher for Windows and macOS for easier use
- [ ] Add a free network proxy so that OpenAI access and material downloads are no longer restricted
- [ ] Support using your own materials
- [ ] Provide real-time preview of the voice-over and background music
- [ ] Support more speech-synthesis providers, such as OpenAI TTS and Azure TTS
- [ ] Support more speech-synthesis providers, such as OpenAI TTS
- [ ] Automatically upload videos to YouTube
## Discussion 💬
<img src="docs/wechat-03.jpg" width="150">
## Changelog
### 2024-04-16 v1.1.2
- Added support for Azure's 9 newly released speech-synthesis voices (API KEY required): [9 more realistic AI voices for conversations](https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/9-more-realistic-ai-voices-for-conversations-now-generally/ba-p/4099471)
- Improved subtitle rendering
- Fixed a memory-leak issue
- Other bug fixes and optimizations
## Video Demos 📺
### Portrait 9:16
@@ -74,12 +98,14 @@
<thead>
<tr>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> "How to Make Life More Fun"</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> "The Role of Money"<br>More realistic synthesized voice</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> "What Is the Meaning of Life"</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/a84d33d5-27a2-4aba-8fd0-9fb2bd91c6a6"></video></td>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/af2f3b0b-002e-49fe-b161-18ba91c055e8"></video></td>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/112c9564-d52b-4472-99ad-970b75f66476"></video></td>
</tr>
</tbody>
@@ -102,8 +128,28 @@
</tbody>
</table>
## System Requirements 📦
- Recommended minimum: a 4-core CPU or better and 8 GB of RAM or more; a GPU is not required
- Windows 10 or macOS 11.0 and above
## Quick Start 🚀
Download the one-click launcher package, unzip it, and use it directly
### Windows
- Baidu Netdisk: https://pan.baidu.com/s/1bpGjgQVE5sADZRn3A6F87w?pwd=xt16 (extraction code: xt16)
After downloading, it is recommended to first **double-click** `update.bat` to pull the **latest code**, then double-click `start.bat` to launch the web UI
### Other Systems
One-click launcher packages are not available yet; see the **Installation and Deployment** section below. Docker deployment is recommended as it is more convenient.
## Installation and Deployment 📥
### Prerequisites
- Avoid paths containing **Chinese characters** to prevent unpredictable problems
- Make sure your **network** connection is working; if you use a VPN, enable `global traffic` mode
@@ -134,6 +180,7 @@ git clone https://github.com/harry0703/MoneyPrinterTurbo.git
If Docker is not installed, install it first: https://www.docker.com/products/docker-desktop/
On Windows, refer to Microsoft's documentation:
1. https://learn.microsoft.com/zh-cn/windows/wsl/install
2. https://learn.microsoft.com/zh-cn/windows/wsl/tutorials/wsl-containers
@@ -212,6 +259,7 @@ webui.bat
conda activate MoneyPrinterTurbo
sh webui.sh
```
After launch, the browser opens automatically
#### ④ Start the API Service 🚀
@@ -226,21 +274,45 @@ python main.py
For the full list of supported voices, see: [Voice List](./docs/voice-list.txt)
2024-04-16 v1.1.2 added 9 new Azure speech-synthesis voices; they require an API KEY and sound more realistic.
## Subtitle Generation 📜
Two subtitle-generation methods are currently supported:
- edge: fast generation, better performance, no special hardware requirements, but the quality may be unstable
- whisper: slow generation, worse performance, some hardware requirements, but more reliable quality.
- **edge**: `fast` generation, better performance, no special hardware requirements, but the quality may be unstable
- **whisper**: `slow` generation, worse performance, some hardware requirements, but `more reliable quality`
Switch between them via `subtitle_provider` in the `config.toml` configuration file
`edge` mode is recommended; switch to `whisper` mode if the generated subtitle quality is poor
> Note:
1. whisper mode requires downloading a model file (about 3 GB) from HuggingFace; make sure your network connection is good
2. If left empty, no subtitles are generated.
> Since HuggingFace is not accessible from mainland China, the `whisper-large-v3` model files can be downloaded as follows
Download links:
- Baidu Netdisk: https://pan.baidu.com/s/11h3Q6tsDtjQKTjUu3sc5cA?pwd=xjs9
- Quark Netdisk: https://pan.quark.cn/s/3ee3d991d64b
After downloading, unzip the model and place the whole directory under `.\MoneyPrinterTurbo\models`,
so the final path looks like this: `.\MoneyPrinterTurbo\models\whisper-large-v3`
```
MoneyPrinterTurbo
├─models
│ └─whisper-large-v3
│ config.json
│ model.bin
│ preprocessor_config.json
│ tokenizer.json
│ vocabulary.json
```
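The `app/services/subtitle.py` hunk later in this commit implements this lookup: it uses the local directory when a complete copy is present and otherwise falls back to the plain model name, letting faster-whisper download it. A minimal sketch of that check (paths illustrative):

```python
# Minimal sketch of the local-model fallback from this commit's subtitle.py change;
# root_dir and model_size are illustrative values.
import os

root_dir = "."           # project root, e.g. the MoneyPrinterTurbo checkout
model_size = "large-v3"  # whisper model size configured in config.toml

model_path = f"{root_dir}/models/whisper-{model_size}"
model_bin_file = f"{model_path}/model.bin"
if not os.path.isdir(model_path) or not os.path.isfile(model_bin_file):
    # no usable local copy: fall back to the model name, which makes
    # faster-whisper download it from HuggingFace
    model_path = model_size
print(f"whisper model will be loaded from: {model_path}")
```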
## Background Music 🎵
Background music for videos is located in the project's `resource/songs` directory.
@@ -253,19 +325,22 @@ python main.py
## FAQ 🤔
### ❓How to use the free OpenAI GPT-3.5 model?
[OpenAI has announced that GPT-3.5 in ChatGPT is now free](https://openai.com/blog/start-using-chatgpt-instantly), and developers have wrapped it into an API that can be called directly
**Make sure Docker is installed and running**, then execute the following command to start the Docker service:
```shell
docker run -p 3040:3040 missuo/freegpt35
```
Once started successfully, modify the configuration in `config.toml`:
- Set `llm_provider` to `openai`
- `openai_api_key` can be any non-empty value, e.g. '123456'
- Set `openai_base_url` to `http://localhost:3040/v1/`
- Set `openai_model_name` to `gpt-3.5-turbo`
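As a quick sanity check of these settings, the `openai` client that this repo already uses can be pointed directly at the proxy (illustrative snippet; the prompt text is made up):

```python
# Quick check of the freegpt35 settings above; assumes the container started
# with `docker run -p 3040:3040 missuo/freegpt35` is running locally.
from openai import OpenAI

client = OpenAI(api_key="123456", base_url="http://localhost:3040/v1/")
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(resp.choices[0].message.content)
```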
### ❓AttributeError: 'str' object has no attribute 'choices'
This error is caused by OpenAI (or another LLM provider) not returning a valid response.
@@ -375,14 +450,6 @@ pip install Pillow==8.4.0
- You can submit an [issue](https://github.com/harry0703/MoneyPrinterTurbo/issues)
  or a [pull request](https://github.com/harry0703/MoneyPrinterTurbo/pulls).
- You can also follow my **Douyin** or **WeChat Channels** account: `网旭哈瑞.AI`
- I post **tutorials** and **pure technical** content there.
- Updates and improvements are **announced there promptly**
- You can also **leave a message** there, and I will **reply as soon as possible**
| Douyin | | WeChat Channels |
|:---------------------------------------:|:------------:|:-------------------------------------------:|
| <img src="docs/douyin.jpg" width="180"> | | <img src="docs/shipinghao.jpg" width="200"> |
## Reference Projects 📚
@@ -393,7 +460,6 @@ pip install Pillow==8.4.0
Click to view the [`LICENSE`](LICENSE) file
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=harry0703/MoneyPrinterTurbo&type=Date)](https://star-history.com/#harry0703/MoneyPrinterTurbo&Date)


@@ -1,31 +1,51 @@
import os
import socket
import toml
import shutil
from loguru import logger
root_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
config_file = f"{root_dir}/config.toml"
if not os.path.isfile(config_file):
example_file = f"{root_dir}/config.example.toml"
if os.path.isfile(example_file):
import shutil
shutil.copyfile(example_file, config_file)
logger.info(f"copy config.example.toml to config.toml")
logger.info(f"load config from file: {config_file}")
def load_config():
# fix: IsADirectoryError: [Errno 21] Is a directory: '/MoneyPrinterTurbo/config.toml'
if os.path.isdir(config_file):
shutil.rmtree(config_file)
try:
_cfg = toml.load(config_file)
except Exception as e:
logger.warning(f"load config failed: {str(e)}, try to load as utf-8-sig")
with open(config_file, mode="r", encoding='utf-8-sig') as fp:
_cfg_content = fp.read()
_cfg = toml.loads(_cfg_content)
if not os.path.isfile(config_file):
example_file = f"{root_dir}/config.example.toml"
if os.path.isfile(example_file):
shutil.copyfile(example_file, config_file)
logger.info(f"copy config.example.toml to config.toml")
logger.info(f"load config from file: {config_file}")
try:
_config_ = toml.load(config_file)
except Exception as e:
logger.warning(f"load config failed: {str(e)}, try to load as utf-8-sig")
with open(config_file, mode="r", encoding='utf-8-sig') as fp:
_cfg_content = fp.read()
_config_ = toml.loads(_cfg_content)
return _config_
def save_config():
with open(config_file, "w", encoding="utf-8") as f:
_cfg["app"] = app
_cfg["whisper"] = whisper
_cfg["pexels"] = pexels
_cfg["azure"] = azure
_cfg["ui"] = ui
f.write(toml.dumps(_cfg))
_cfg = load_config()
app = _cfg.get("app", {})
whisper = _cfg.get("whisper", {})
pexels = _cfg.get("pexels", {})
azure = _cfg.get("azure", {})
ui = _cfg.get("ui", {})
hostname = socket.gethostname()
@@ -36,7 +56,7 @@ listen_port = _cfg.get("listen_port", 8080)
project_name = _cfg.get("project_name", "MoneyPrinterTurbo")
project_description = _cfg.get("project_description",
"<a href='https://github.com/harry0703/MoneyPrinterTurbo'>https://github.com/harry0703/MoneyPrinterTurbo</a>")
project_version = _cfg.get("project_version", "1.0.1")
project_version = _cfg.get("project_version", "1.1.2")
reload_debug = False
imagemagick_path = app.get("imagemagick_path", "")
@@ -47,18 +67,4 @@ ffmpeg_path = app.get("ffmpeg_path", "")
if ffmpeg_path and os.path.isfile(ffmpeg_path):
os.environ["IMAGEIO_FFMPEG_EXE"] = ffmpeg_path
# __cfg = {
# "hostname": hostname,
# "listen_host": listen_host,
# "listen_port": listen_port,
# }
# logger.info(__cfg)
def save_config():
with open(config_file, "w", encoding="utf-8") as f:
_cfg["app"] = app
_cfg["whisper"] = whisper
_cfg["pexels"] = pexels
f.write(toml.dumps(_cfg))
logger.info(f"{project_name} v{project_version}")


@@ -1,8 +1,10 @@
import os
import glob
import pathlib
import shutil
from fastapi import Request, Depends, Path, BackgroundTasks, UploadFile
from fastapi.responses import FileResponse, StreamingResponse
from fastapi.params import File
from loguru import logger
@@ -78,7 +80,7 @@ def get_task(request: Request, task_id: str = Path(..., description="Task ID"),
@router.delete("/tasks/{task_id}", response_model=TaskDeletionResponse, summary="Delete a generated short video task")
def create_video(request: Request, task_id: str = Path(..., description="Task ID")):
def delete_video(request: Request, task_id: str = Path(..., description="Task ID")):
request_id = base.get_task_id(request)
task = sm.state.get_task(task_id)
if task:
@@ -89,7 +91,7 @@ def create_video(request: Request, task_id: str = Path(..., description="Task ID
sm.state.delete_task(task_id)
logger.success(f"video deleted: {utils.to_json(task)}")
return utils.get_response(200, task)
return utils.get_response(200)
raise HttpException(task_id=task_id, status_code=404, message=f"{request_id}: task not found")
@@ -130,3 +132,63 @@ def upload_bgm_file(request: Request, file: UploadFile = File(...)):
return utils.get_response(200, response)
raise HttpException('', status_code=400, message=f"{request_id}: Only *.mp3 files can be uploaded")
@router.get("/stream/{file_path:path}")
async def stream_video(request: Request, file_path: str):
tasks_dir = utils.task_dir()
video_path = os.path.join(tasks_dir, file_path)
range_header = request.headers.get('Range')
video_size = os.path.getsize(video_path)
start, end = 0, video_size - 1
length = video_size
if range_header:
range_ = range_header.split('bytes=')[1]
start, end = [int(part) if part else None for part in range_.split('-')]
if start is None:
start = video_size - end
end = video_size - 1
if end is None:
end = video_size - 1
length = end - start + 1
def file_iterator(file_path, offset=0, bytes_to_read=None):
with open(file_path, 'rb') as f:
f.seek(offset, os.SEEK_SET)
remaining = bytes_to_read or video_size
while remaining > 0:
bytes_to_read = min(4096, remaining)
data = f.read(bytes_to_read)
if not data:
break
remaining -= len(data)
yield data
response = StreamingResponse(file_iterator(video_path, start, length), media_type='video/mp4')
response.headers['Content-Range'] = f'bytes {start}-{end}/{video_size}'
response.headers['Accept-Ranges'] = 'bytes'
response.headers['Content-Length'] = str(length)
response.status_code = 206 # Partial Content
return response
@router.get("/download/{file_path:path}")
async def download_video(_: Request, file_path: str):
"""
download video
:param _: Request request
:param file_path: video file path, eg: /cd1727ed-3473-42a2-a7da-4faafafec72b/final-1.mp4
:return: video file
"""
tasks_dir = utils.task_dir()
video_path = os.path.join(tasks_dir, file_path)
file_path = pathlib.Path(video_path)
filename = file_path.stem
extension = file_path.suffix
headers = {
"Content-Disposition": f"attachment; filename={filename}{extension}"
}
return FileResponse(path=video_path, headers=headers, filename=f"{filename}{extension}",
media_type=f'video/{extension[1:]}')
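A small illustrative client for these two new endpoints; the host/port, any route prefix, and the task file path below are assumptions, so adjust them to your deployment:

```python
# Illustrative client for the /stream and /download endpoints added above.
# Base URL, route prefix, and the task file path are invented for this example.
import requests

base = "http://127.0.0.1:8080"     # default listen_port in config is 8080
path = "some-task-id/final-1.mp4"  # hypothetical generated video

# Partial content: request only the first 1024 bytes of the stream.
r = requests.get(f"{base}/stream/{path}", headers={"Range": "bytes=0-1023"})
print(r.status_code, r.headers.get("Content-Range"))  # expect 206 and a byte range

# Full download, served with a Content-Disposition attachment header.
r = requests.get(f"{base}/download/{path}")
print(r.status_code, r.headers.get("Content-Disposition"))
```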


@@ -5,6 +5,8 @@ from typing import List
from loguru import logger
from openai import OpenAI
from openai import AzureOpenAI
from openai.types.chat import ChatCompletion
from app.config import config
@@ -57,6 +59,11 @@ def _generate_response(prompt: str) -> str:
api_key = config.app.get("qwen_api_key")
model_name = config.app.get("qwen_model_name")
base_url = "***"
elif llm_provider == "cloudflare":
api_key = config.app.get("cloudflare_api_key")
model_name = config.app.get("cloudflare_model_name")
account_id = config.app.get("cloudflare_account_id")
base_url = "***"
else:
raise ValueError("llm_provider is not set, please set it in the config.toml file.")
@@ -69,17 +76,31 @@ def _generate_response(prompt: str) -> str:
if llm_provider == "qwen":
import dashscope
from dashscope.api_entities.dashscope_response import GenerationResponse
dashscope.api_key = api_key
response = dashscope.Generation.call(
model=model_name,
messages=[{"role": "user", "content": prompt}]
)
content = response["output"]["text"]
return content.replace("\n", "")
if response:
if isinstance(response, GenerationResponse):
status_code = response.status_code
if status_code != 200:
raise Exception(
f"[{llm_provider}] returned an error response: \"{response}\"")
content = response["output"]["text"]
return content.replace("\n", "")
else:
raise Exception(
f"[{llm_provider}] returned an invalid response: \"{response}\"")
else:
raise Exception(
f"[{llm_provider}] returned an empty response")
if llm_provider == "gemini":
import google.generativeai as genai
genai.configure(api_key=api_key)
genai.configure(api_key=api_key, transport='rest')
generation_config = {
"temperature": 0.5,
@@ -111,10 +132,30 @@ def _generate_response(prompt: str) -> str:
generation_config=generation_config,
safety_settings=safety_settings)
convo = model.start_chat(history=[])
try:
response = model.generate_content(prompt)
candidates = response.candidates
generated_text = candidates[0].content.parts[0].text
except (AttributeError, IndexError) as e:
print("Gemini Error:", e)
convo.send_message(prompt)
return convo.last.text
return generated_text
if llm_provider == "cloudflare":
import requests
response = requests.post(
f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model_name}",
headers={"Authorization": f"Bearer {api_key}"},
json={
"messages": [
{"role": "system", "content": "You are a friendly assistant"},
{"role": "user", "content": prompt}
]
}
)
result = response.json()
logger.info(result)
return result["result"]["response"]
if llm_provider == "azure":
client = AzureOpenAI(
@@ -133,7 +174,15 @@ def _generate_response(prompt: str) -> str:
messages=[{"role": "user", "content": prompt}]
)
if response:
content = response.choices[0].message.content
if isinstance(response, ChatCompletion):
content = response.choices[0].message.content
else:
raise Exception(
f"[{llm_provider}] returned an invalid response: \"{response}\", please check your network "
f"connection and try again.")
else:
raise Exception(
f"[{llm_provider}] returned an empty response, please check your network connection and try again.")
return content.replace("\n", "")
@@ -149,9 +198,9 @@
1. the script is to be returned as a string with the specified number of paragraphs.
2. do not under any circumstance reference this prompt in your response.
3. get straight to the point, don't start with unnecessary things like, "welcome to this video".
4. you must not include any type of markdown or formatting in the script, never use a title.
5. only return the raw content of the script.
6. do not include "voiceover", "narrator" or similar indicators of what should be spoken at the beginning of each paragraph or line.
7. you must not mention the prompt, or anything about the script itself. also, never talk about the amount of paragraphs or lines. just write the script.
8. respond in the same language as the video subject.


@@ -5,6 +5,7 @@ from urllib.parse import urlencode
import requests
from typing import List
from loguru import logger
from moviepy.video.io.VideoFileClip import VideoFileClip
from app.config import config
from app.models.schema import VideoAspect, VideoConcatMode, MaterialInfo
@@ -105,7 +106,19 @@ def save_video(video_url: str, save_dir: str = "") -> str:
f.write(requests.get(video_url, proxies=proxies, verify=False, timeout=(60, 240)).content)
if os.path.exists(video_path) and os.path.getsize(video_path) > 0:
return video_path
try:
clip = VideoFileClip(video_path)
duration = clip.duration
fps = clip.fps
clip.close()
if duration > 0 and fps > 0:
return video_path
except Exception as e:
try:
os.remove(video_path)
except Exception as e:
pass
logger.warning(f"invalid video file: {video_path} => {str(e)}")
return ""


@@ -1,7 +1,5 @@
import ast
import json
from abc import ABC, abstractmethod
import redis
from app.config import config
from app.models import const
@@ -46,8 +44,9 @@ class MemoryState(BaseState):
# Redis state management
class RedisState(BaseState):
def __init__(self, host='localhost', port=6379, db=0):
self._redis = redis.StrictRedis(host=host, port=port, db=db)
def __init__(self, host='localhost', port=6379, db=0, password=None):
import redis
self._redis = redis.StrictRedis(host=host, port=port, db=db, password=password)
def update_task(self, task_id: str, state: int = const.TASK_STATE_PROCESSING, progress: int = 0, **kwargs):
progress = int(progress)
@@ -99,5 +98,6 @@ _enable_redis = config.app.get("enable_redis", False)
_redis_host = config.app.get("redis_host", "localhost")
_redis_port = config.app.get("redis_port", 6379)
_redis_db = config.app.get("redis_db", 0)
_redis_password = config.app.get("redis_password", None)
state = RedisState(host=_redis_host, port=_redis_port, db=_redis_db) if _enable_redis else MemoryState()
state = RedisState(host=_redis_host, port=_redis_port, db=_redis_db, password=_redis_password) if _enable_redis else MemoryState()
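For illustration, a hypothetical use of the Redis-backed state with the new `password` option (assumes a local Redis server and that this module and `app.models.const` are importable from the project root):

```python
# Hypothetical usage of RedisState (defined above) with the new password option;
# the task ID and progress value are invented.
from app.models import const

state = RedisState(host="localhost", port=6379, db=0, password=None)
state.update_task("demo-task-id", state=const.TASK_STATE_PROCESSING, progress=50)
print(state.get_task("demo-task-id"))
state.delete_task("demo-task-id")
```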


@@ -1,4 +1,5 @@
import json
import os.path
import re
from faster_whisper import WhisperModel
@@ -17,8 +18,13 @@ model = None
def create(audio_file, subtitle_file: str = ""):
global model
if not model:
logger.info(f"loading model: {model_size}, device: {device}, compute_type: {compute_type}")
model = WhisperModel(model_size_or_path=model_size,
model_path = f"{utils.root_dir()}/models/whisper-{model_size}"
model_bin_file = f"{model_path}/model.bin"
if not os.path.isdir(model_path) or not os.path.isfile(model_bin_file):
model_path = model_size
logger.info(f"loading model: {model_path}, device: {device}, compute_type: {compute_type}")
model = WhisperModel(model_size_or_path=model_path,
device=device,
compute_type=compute_type)


@@ -4,7 +4,6 @@ from typing import List
from PIL import ImageFont
from loguru import logger
from moviepy.editor import *
from moviepy.video.fx.crop import crop
from moviepy.video.tools.subtitles import SubtitlesClip
from app.models.schema import VideoAspect, VideoParams, VideoConcatMode
@@ -14,15 +13,16 @@ from app.utils import utils
def get_bgm_file(bgm_type: str = "random", bgm_file: str = ""):
if not bgm_type:
return ""
if bgm_file and os.path.exists(bgm_file):
return bgm_file
if bgm_type == "random":
suffix = "*.mp3"
song_dir = utils.song_dir()
files = glob.glob(os.path.join(song_dir, suffix))
return random.choice(files)
if os.path.exists(bgm_file):
return bgm_file
return ""
@@ -41,6 +41,7 @@ def combine_videos(combined_video_path: str,
req_dur = audio_duration / len(video_paths)
req_dur = max_clip_duration
logger.info(f"each clip will be maximum {req_dur} seconds long")
output_dir = os.path.dirname(combined_video_path)
aspect = VideoAspect(video_aspect)
video_width, video_height = aspect.to_resolution()
@@ -99,11 +100,18 @@
clips.append(clip)
video_duration += clip.duration
final_clip = concatenate_videoclips(clips)
final_clip = final_clip.set_fps(30)
video_clip = concatenate_videoclips(clips)
video_clip = video_clip.set_fps(30)
logger.info(f"writing")
# https://github.com/harry0703/MoneyPrinterTurbo/issues/111#issuecomment-2032354030
final_clip.write_videofile(combined_video_path, threads=threads, logger=None)
video_clip.write_videofile(filename=combined_video_path,
threads=threads,
logger=None,
temp_audiofile_path=output_dir,
audio_codec="aac",
fps=30,
)
video_clip.close()
logger.success(f"completed")
return combined_video_path
@@ -119,9 +127,9 @@ def wrap_text(text, max_width, font='Arial', fontsize=60):
width, height = get_text_size(text)
if width <= max_width:
return text
return text, height
logger.warning(f"wrapping text, max_width: {max_width}, text_width: {width}, text: {text}")
# logger.warning(f"wrapping text, max_width: {max_width}, text_width: {width}, text: {text}")
processed = True
@@ -144,8 +152,9 @@ def wrap_text(text, max_width, font='Arial', fontsize=60):
if processed:
_wrapped_lines_ = [line.strip() for line in _wrapped_lines_]
result = '\n'.join(_wrapped_lines_).strip()
logger.warning(f"wrapped text: {result}")
return result
height = len(_wrapped_lines_) * height
# logger.warning(f"wrapped text: {result}")
return result, height
_wrapped_lines_ = []
chars = list(text)
@@ -160,8 +169,9 @@ def wrap_text(text, max_width, font='Arial', fontsize=60):
_txt_ = ''
_wrapped_lines_.append(_txt_)
result = '\n'.join(_wrapped_lines_).strip()
logger.warning(f"wrapped text: {result}")
return result
height = len(_wrapped_lines_) * height
# logger.warning(f"wrapped text: {result}")
return result, height
def generate_video(video_path: str,
@@ -179,6 +189,11 @@ def generate_video(video_path: str,
logger.info(f" ③ subtitle: {subtitle_path}")
logger.info(f" ④ output: {output_file}")
# https://github.com/harry0703/MoneyPrinterTurbo/issues/217
# PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'final-1.mp4.tempTEMP_MPY_wvf_snd.mp3'
# write into the same directory as the output file
output_dir = os.path.dirname(output_file)
font_path = ""
if params.subtitle_enabled:
if not params.font_name:
@@ -189,23 +204,15 @@ def generate_video(video_path: str,
logger.info(f"using font: {font_path}")
if params.subtitle_position == "top":
position_height = video_height * 0.1
elif params.subtitle_position == "bottom":
position_height = video_height * 0.9
else:
position_height = "center"
def generator(txt, **kwargs):
def create_text_clip(subtitle_item):
phrase = subtitle_item[1]
max_width = video_width * 0.9
# logger.debug(f"rendering text: {txt}")
wrapped_txt = wrap_text(txt,
max_width=max_width,
font=font_path,
fontsize=params.font_size
) # adjust max_width to fit your video
clip = TextClip(
wrapped_txt, txt_height = wrap_text(phrase,
max_width=max_width,
font=font_path,
fontsize=params.font_size
)
_clip = TextClip(
wrapped_txt,
font=font_path,
fontsize=params.font_size,
@@ -215,52 +222,49 @@ def generate_video(video_path: str,
stroke_width=params.stroke_width,
print_cmd=False,
)
return clip
duration = subtitle_item[0][1] - subtitle_item[0][0]
_clip = _clip.set_start(subtitle_item[0][0])
_clip = _clip.set_end(subtitle_item[0][1])
_clip = _clip.set_duration(duration)
if params.subtitle_position == "bottom":
_clip = _clip.set_position(('center', video_height * 0.95 - _clip.h))
elif params.subtitle_position == "top":
_clip = _clip.set_position(('center', video_height * 0.1))
else:
_clip = _clip.set_position(('center', 'center'))
return _clip
clips = [
VideoFileClip(video_path),
]
video_clip = VideoFileClip(video_path)
audio_clip = AudioFileClip(audio_path).volumex(params.voice_volume)
if subtitle_path and os.path.exists(subtitle_path):
sub = SubtitlesClip(subtitles=subtitle_path, make_textclip=generator, encoding='utf-8')
sub_clip = sub.set_position(lambda _t: ('center', position_height))
clips.append(sub_clip)
result = CompositeVideoClip(clips)
audio = AudioFileClip(audio_path)
try:
audio = audio.volumex(params.voice_volume)
except Exception as e:
logger.warning(f"failed to set audio volume: {e}")
result = result.set_audio(audio)
temp_output_file = f"{output_file}.temp.mp4"
logger.info(f"writing to temp file: {temp_output_file}")
result.write_videofile(temp_output_file, threads=params.n_threads or 2, logger=None)
video_clip = VideoFileClip(temp_output_file)
sub = SubtitlesClip(subtitles=subtitle_path, encoding='utf-8')
text_clips = []
for item in sub.subtitles:
clip = create_text_clip(subtitle_item=item)
text_clips.append(clip)
video_clip = CompositeVideoClip([video_clip, *text_clips])
bgm_file = get_bgm_file(bgm_type=params.bgm_type, bgm_file=params.bgm_file)
if bgm_file:
logger.info(f"adding background music: {bgm_file}")
# Add song to video at 30% volume using moviepy
original_duration = video_clip.duration
original_audio = video_clip.audio
song_clip = AudioFileClip(bgm_file).set_fps(44100)
# Set the volume of the song to 10% of the original volume
song_clip = song_clip.volumex(params.bgm_volume)
# Add the song to the video
comp_audio = CompositeAudioClip([original_audio, song_clip])
video_clip = video_clip.set_audio(comp_audio)
video_clip = video_clip.set_fps(30)
video_clip = video_clip.set_duration(original_duration)
try:
bgm_clip = (AudioFileClip(bgm_file)
.volumex(params.bgm_volume)
.audio_fadeout(3))
bgm_clip = afx.audio_loop(bgm_clip, duration=video_clip.duration)
audio_clip = CompositeAudioClip([audio_clip, bgm_clip])
except Exception as e:
logger.error(f"failed to add bgm: {str(e)}")
logger.info(f"encoding audio codec to aac")
video_clip.write_videofile(output_file, audio_codec="aac", threads=params.n_threads or 2, logger=None)
os.remove(temp_output_file)
video_clip = video_clip.set_audio(audio_clip)
video_clip.write_videofile(output_file,
audio_codec="aac",
temp_audiofile_path=output_dir,
threads=params.n_threads or 2,
logger=None,
fps=30,
)
video_clip.close()
logger.success(f"completed")
@@ -269,28 +273,28 @@ if __name__ == "__main__":
txt_zh = "测试长字段这是您的旅行技巧指南帮助您进行预算友好的冒险"
font = utils.resource_dir() + "/fonts/STHeitiMedium.ttc"
for txt in [txt_en, txt_zh]:
t = wrap_text(text=txt, max_width=1000, font=font, fontsize=60)
t, h = wrap_text(text=txt, max_width=1000, font=font, fontsize=60)
print(t)
task_id = "69232dfa-f6c5-4b5e-80ba-be3098d3f930"
task_id = "aa563149-a7ea-49c2-b39f-8c32cc225baf"
task_dir = utils.task_dir(task_id)
video_file = f"{task_dir}/combined-1.mp4"
audio_file = f"{task_dir}/audio.mp3"
subtitle_file = f"{task_dir}/subtitle.srt"
output_file = f"{task_dir}/final.mp4"
video_paths = []
for file in os.listdir(utils.storage_dir("test")):
if file.endswith(".mp4"):
video_paths.append(os.path.join(task_dir, file))
combine_videos(combined_video_path=video_file,
audio_file=audio_file,
video_paths=video_paths,
video_aspect=VideoAspect.portrait,
video_concat_mode=VideoConcatMode.random,
max_clip_duration=5,
threads=2)
# video_paths = []
# for file in os.listdir(utils.storage_dir("test")):
# if file.endswith(".mp4"):
# video_paths.append(os.path.join(utils.storage_dir("test"), file))
#
# combine_videos(combined_video_path=video_file,
# audio_file=audio_file,
# video_paths=video_paths,
# video_aspect=VideoAspect.portrait,
# video_concat_mode=VideoConcatMode.random,
# max_clip_duration=5,
# threads=2)
cfg = VideoParams()
cfg.video_aspect = VideoAspect.portrait
@@ -300,14 +304,15 @@ if __name__ == "__main__":
cfg.stroke_width = 1.5
cfg.text_fore_color = "#FFFFFF"
cfg.text_background_color = "transparent"
cfg.bgm_type = "random"
cfg.bgm_file = ""
cfg.bgm_volume = 0.2
cfg.bgm_volume = 1.0
cfg.subtitle_enabled = True
cfg.subtitle_position = "bottom"
cfg.n_threads = 2
cfg.paragraph_number = 1
cfg.voice_volume = 3.0
cfg.voice_volume = 1.0
generate_video(video_path=video_file,
audio_path=audio_file,


@@ -1,6 +1,7 @@
import asyncio
import os
import re
from datetime import datetime
from xml.sax.saxutils import unescape
from edge_tts.submaker import mktimestamp
from loguru import logger
@@ -8,10 +9,11 @@ from edge_tts import submaker, SubMaker
import edge_tts
from moviepy.video.tools import subtitles
from app.config import config
from app.utils import utils
def get_all_voices(filter_locals=None) -> list[str]:
def get_all_azure_voices(filter_locals=None) -> list[str]:
if filter_locals is None:
filter_locals = ["zh-CN", "en-US", "zh-HK", "zh-TW"]
voices_str = """
@@ -956,6 +958,34 @@ Gender: Female
Name: zu-ZA-ThembaNeural
Gender: Male
Name: en-US-AvaMultilingualNeural-V2
Gender: Female
Name: en-US-AndrewMultilingualNeural-V2
Gender: Male
Name: en-US-EmmaMultilingualNeural-V2
Gender: Female
Name: en-US-BrianMultilingualNeural-V2
Gender: Male
Name: de-DE-FlorianMultilingualNeural-V2
Gender: Male
Name: de-DE-SeraphinaMultilingualNeural-V2
Gender: Female
Name: fr-FR-RemyMultilingualNeural-V2
Gender: Male
Name: fr-FR-VivienneMultilingualNeural-V2
Gender: Female
Name: zh-CN-XiaoxiaoMultilingualNeural-V2
Gender: Female
""".strip()
voices = []
name = ''
@@ -986,11 +1016,26 @@ Gender: Male
def parse_voice_name(name: str):
# zh-CN-XiaoyiNeural-Female
# zh-CN-YunxiNeural-Male
# zh-CN-XiaoxiaoMultilingualNeural-V2-Female
name = name.replace("-Female", "").replace("-Male", "").strip()
return name
def is_azure_v2_voice(voice_name: str):
voice_name = parse_voice_name(voice_name)
print(voice_name)
if voice_name.endswith("-V2"):
return voice_name.replace("-V2", "").strip()
return ""
def tts(text: str, voice_name: str, voice_file: str) -> [SubMaker, None]:
if is_azure_v2_voice(voice_name):
return azure_tts_v2(text, voice_name, voice_file)
return azure_tts_v1(text, voice_name, voice_file)
def azure_tts_v1(text: str, voice_name: str, voice_file: str) -> [SubMaker, None]:
text = text.strip()
for i in range(3):
try:
@@ -1019,14 +1064,82 @@ def tts(text: str, voice_name: str, voice_file: str) -> [SubMaker, None]:
return None
def create_subtitle(sub_maker: submaker.SubMaker, text: str, subtitle_file: str):
"""
Optimize the subtitle file:
1. Split the subtitle text into multiple lines by punctuation
2. Match the text of the subtitle file line by line
3. Generate a new subtitle file
"""
text = text.replace("\n", " ")
def azure_tts_v2(text: str, voice_name: str, voice_file: str) -> [SubMaker, None]:
voice_name = is_azure_v2_voice(voice_name)
if not voice_name:
logger.error(f"invalid voice name: {voice_name}")
raise ValueError(f"invalid voice name: {voice_name}")
text = text.strip()
def _format_duration_to_offset(duration) -> int:
if isinstance(duration, str):
time_obj = datetime.strptime(duration, "%H:%M:%S.%f")
milliseconds = (time_obj.hour * 3600000) + (time_obj.minute * 60000) + (time_obj.second * 1000) + (
time_obj.microsecond // 1000)
return milliseconds * 10000
if isinstance(duration, int):
return duration
return 0
for i in range(3):
try:
logger.info(f"start, voice name: {voice_name}, try: {i + 1}")
import azure.cognitiveservices.speech as speechsdk
sub_maker = SubMaker()
def speech_synthesizer_word_boundary_cb(evt: speechsdk.SessionEventArgs):
# print('WordBoundary event:')
# print('\tBoundaryType: {}'.format(evt.boundary_type))
# print('\tAudioOffset: {}ms'.format((evt.audio_offset + 5000)))
# print('\tDuration: {}'.format(evt.duration))
# print('\tText: {}'.format(evt.text))
# print('\tTextOffset: {}'.format(evt.text_offset))
# print('\tWordLength: {}'.format(evt.word_length))
duration = _format_duration_to_offset(str(evt.duration))
offset = _format_duration_to_offset(evt.audio_offset)
sub_maker.subs.append(evt.text)
sub_maker.offset.append((offset, offset + duration))
# Creates an instance of a speech config with specified subscription key and service region.
speech_key = config.azure.get("speech_key", "")
service_region = config.azure.get("speech_region", "")
audio_config = speechsdk.audio.AudioOutputConfig(filename=voice_file, use_default_speaker=True)
speech_config = speechsdk.SpeechConfig(subscription=speech_key,
region=service_region)
speech_config.speech_synthesis_voice_name = voice_name
# speech_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceResponse_RequestSentenceBoundary,
# value='true')
speech_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceResponse_RequestWordBoundary,
value='true')
speech_config.set_speech_synthesis_output_format(
speechsdk.SpeechSynthesisOutputFormat.Audio48Khz192KBitRateMonoMp3)
speech_synthesizer = speechsdk.SpeechSynthesizer(audio_config=audio_config,
speech_config=speech_config)
speech_synthesizer.synthesis_word_boundary.connect(speech_synthesizer_word_boundary_cb)
result = speech_synthesizer.speak_text_async(text).get()
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
logger.success(f"azure v2 speech synthesis succeeded: {voice_file}")
return sub_maker
elif result.reason == speechsdk.ResultReason.Canceled:
cancellation_details = result.cancellation_details
logger.error(f"azure v2 speech synthesis canceled: {cancellation_details.reason}")
if cancellation_details.reason == speechsdk.CancellationReason.Error:
logger.error(f"azure v2 speech synthesis error: {cancellation_details.error_details}")
logger.info(f"completed, output file: {voice_file}")
except Exception as e:
logger.error(f"failed, error: {str(e)}")
return None
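The `_format_duration_to_offset` helper above converts the SDK's `HH:MM:SS.mmm` duration strings and integer audio offsets into the 100-nanosecond ticks that `SubMaker` expects; a worked check of that arithmetic (input value invented):

```python
# Worked check of the tick conversion above: "00:00:01.500" is 1.5 s,
# i.e. 1500 ms, and 1500 ms * 10000 = 15,000,000 ticks of 100 ns each.
from datetime import datetime

t = datetime.strptime("00:00:01.500", "%H:%M:%S.%f")
ms = t.hour * 3600000 + t.minute * 60000 + t.second * 1000 + t.microsecond // 1000
assert ms * 10000 == 15_000_000
```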
def _format_text(text: str) -> str:
# text = text.replace("\n", " ")
text = text.replace("[", " ")
text = text.replace("]", " ")
text = text.replace("(", " ")
@@ -1034,6 +1147,18 @@ def create_subtitle(sub_maker: submaker.SubMaker, text: str, subtitle_file: str)
text = text.replace("{", " ")
text = text.replace("}", " ")
text = text.strip()
return text
def create_subtitle(sub_maker: submaker.SubMaker, text: str, subtitle_file: str):
"""
Optimize the subtitle file:
1. Split the subtitle text into multiple lines by punctuation
2. Match the text of the subtitle file line by line
3. Generate a new subtitle file
"""
text = _format_text(text)
def formatter(idx: int, start_time: float, end_time: float, sub_text: str) -> str:
"""
@@ -1125,8 +1250,12 @@ def get_audio_duration(sub_maker: submaker.SubMaker):
if __name__ == "__main__":
voices = get_all_voices()
print(voices)
voice_name = "zh-CN-XiaoxiaoMultilingualNeural-V2-Female"
voice_name = parse_voice_name(voice_name)
voice_name = is_azure_v2_voice(voice_name)
print(voice_name)
voices = get_all_azure_voices()
print(len(voices))
@@ -1134,6 +1263,7 @@ if __name__ == "__main__":
temp_dir = utils.storage_dir("temp")
voice_names = [
"zh-CN-XiaoxiaoMultilingualNeural",
# female voices
"zh-CN-XiaoxiaoNeural",
"zh-CN-XiaoyiNeural",
@@ -1156,10 +1286,28 @@
"""
text = "[Opening scene: A sunny day in a suburban neighborhood. A young boy named Alex, around 8 years old, is playing in his front yard with his loyal dog, Buddy.]\n\n[Camera zooms in on Alex as he throws a ball for Buddy to fetch. Buddy excitedly runs after it and brings it back to Alex.]\n\nAlex: Good boy, Buddy! You're the best dog ever!\n\n[Buddy barks happily and wags his tail.]\n\n[As Alex and Buddy continue playing, a series of potential dangers loom nearby, such as a stray dog approaching, a ball rolling towards the street, and a suspicious-looking stranger walking by.]\n\nAlex: Uh oh, Buddy, look out!\n\n[Buddy senses the danger and immediately springs into action. He barks loudly at the stray dog, scaring it away. Then, he rushes to retrieve the ball before it reaches the street and gently nudges it back towards Alex. Finally, he stands protectively between Alex and the stranger, growling softly to warn them away.]\n\nAlex: Wow, Buddy, you're like my superhero!\n\n[Just as Alex and Buddy are about to head inside, they hear a loud crash from a nearby construction site. They rush over to investigate and find a pile of rubble blocking the path of a kitten trapped underneath.]\n\nAlex: Oh no, Buddy, we have to help!\n\n[Buddy barks in agreement and together they work to carefully move the rubble aside, allowing the kitten to escape unharmed. The kitten gratefully nuzzles against Buddy, who responds with a friendly lick.]\n\nAlex: We did it, Buddy! We saved the day again!\n\n[As Alex and Buddy walk home together, the sun begins to set, casting a warm glow over the neighborhood.]\n\nAlex: Thanks for always being there to watch over me, Buddy. You're not just my dog, you're my best friend.\n\n[Buddy barks happily and nuzzles against Alex as they disappear into the sunset, ready to face whatever adventures tomorrow may bring.]\n\n[End scene.]"
text = "大家好,我是乔哥,一个想帮你把信用卡全部还清的家伙!\n今天我们要聊的是信用卡的取现功能。\n你是不是也曾经因为一时的资金紧张而拿着信用卡到ATM机取现如果是那你得好好看看这个视频了。\n现在都2024年了我以为现在不会再有人用信用卡取现功能了。前几天一个粉丝发来一张图片取现1万。\n信用卡取现有三个弊端。\n信用卡取现功能代价可不小。会先收取一个取现手续费比如这个粉丝取现1万按2.5%收取手续费收取了250元。\n信用卡正常消费有最长56天的免息期但取现不享受免息期。从取现那一天开始每天按照万5收取利息这个粉丝用了11天收取了55元利息。\n三,频繁的取现行为,银行会认为你资金紧张,会被标记为高风险用户,影响你的综合评分和额度。\n那么,如果你资金紧张了,该怎么办呢?\n乔哥给你支一招用破思机摩擦信用卡只需要少量的手续费而且还可以享受最长56天的免息期。\n最后,如果你对玩卡感兴趣,可以找乔哥领取一本《卡神秘籍》,用卡过程中遇到任何疑惑,也欢迎找乔哥交流。\n别忘了关注乔哥回复用卡技巧免费领取《2024用卡技巧》让我们一起成为用卡高手"
text = """
2023全年业绩速览
公司全年累计实现营业收入1476.94亿元同比增长19.01%归母净利润747.34亿元同比增长19.16%EPS达到59.49第四季度单季营业收入444.25亿元同比增长20.26%环比增长31.86%归母净利润218.58亿元同比增长19.33%环比增长29.37%这一阶段
的业绩表现不仅突显了公司的增长动力和盈利能力也反映出公司在竞争激烈的市场环境中保持了良好的发展势头
2023年Q4业绩速览
第四季度营业收入贡献主要增长点销售费用高增致盈利能力承压税金同比上升27%扰动净利率表现
业绩解读
利润方面2023全年贵州茅台>归母净利润增速为19%其中营业收入正贡献18%营业成本正贡献百分之一管理费用正贡献百分之一点四(归母净利润增速值=营业收入增速+各科目贡献展示贡献/拖累的前四名科目且要求贡献值/净利润增速>15%)
"""
text = "静夜思是唐代诗人李白创作的一首五言古诗。这首诗描绘了诗人在寂静的夜晚,看到窗前的明月,不禁想起远方的家乡和亲人"
text = _format_text(text)
lines = utils.split_string_by_punctuations(text)
print(lines)
for voice_name in voice_names:
voice_file = f"{temp_dir}/tts-{voice_name}.mp3"
subtitle_file = f"{temp_dir}/tts.mp3.srt"
sub_maker = tts(text=text, voice_name=voice_name, voice_file=voice_file)
sub_maker = azure_tts_v2(text=text, voice_name=voice_name, voice_file=voice_file)
create_subtitle(sub_maker=sub_maker, text=text, subtitle_file=subtitle_file)
audio_duration = get_audio_duration(sub_maker)
print(f"voice: {voice_name}, audio duration: {audio_duration}s")


@@ -163,12 +163,34 @@ def str_contains_punctuation(word):
def split_string_by_punctuations(s):
result = []
txt = ""
for char in s:
previous_char = ""
next_char = ""
for i in range(len(s)):
char = s[i]
if char == "\n":
result.append(txt.strip())
txt = ""
continue
if i > 0:
previous_char = s[i - 1]
if i < len(s) - 1:
next_char = s[i + 1]
if char == "." and previous_char.isdigit() and next_char.isdigit():
# a "." inside a decimal number such as 2.5 (e.g. a 2.5% fee) must not be treated as a line-break marker
txt += char
continue
if char not in const.PUNCTUATIONS:
txt += char
else:
result.append(txt.strip())
txt = ""
result.append(txt.strip())
# filter empty string
result = list(filter(None, result))
return result
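A standalone sketch of the decimal guard introduced above, with a reduced punctuation set standing in for `const.PUNCTUATIONS` (illustrative, not the repo's exact function):

```python
# Standalone sketch of the decimal-aware split; the repo's version reads
# const.PUNCTUATIONS instead of this reduced set.
PUNCTUATIONS = [".", ",", "!", "?", "。", ",", "!", "?"]

def split_keep_decimals(s: str) -> list[str]:
    result, txt = [], ""
    for i, char in enumerate(s):
        prev_c = s[i - 1] if i > 0 else ""
        next_c = s[i + 1] if i < len(s) - 1 else ""
        if char == "." and prev_c.isdigit() and next_c.isdigit():
            txt += char  # keep "2.5" intact instead of splitting on "."
            continue
        if char not in PUNCTUATIONS:
            txt += char
        else:
            result.append(txt.strip())
            txt = ""
    result.append(txt.strip())
    return [x for x in result if x]

print(split_keep_decimals("Fee is 2.5% of the amount. Thanks."))
# -> ['Fee is 2.5% of the amount', 'Thanks']
```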


@@ -161,4 +161,10 @@
### Example: "http://user:pass@proxy:1234"
### Doc: https://requests.readthedocs.io/en/latest/user/advanced/#proxies
# http = "http://10.10.1.10:3128"
# https = "http://10.10.1.10:1080"
[azure]
# Azure Speech API Key
# Get your API key at https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices
speech_key=""
speech_region=""


@@ -1,8 +1,5 @@
version: "3"
x-common-volumes: &common-volumes
- ./config.toml:/MoneyPrinterTurbo/config.toml
- ./storage:/MoneyPrinterTurbo/storage
- ./:/MoneyPrinterTurbo
services:
webui:
@@ -12,7 +9,7 @@ services:
container_name: "webui"
ports:
- "8501:8501"
command: ["streamlit", "run", "./webui/Main.py","--browser.serverAddress=0.0.0.0","--server.enableCORS=True","--browser.gatherUsageStats=False"]
command: [ "streamlit", "run", "./webui/Main.py","--browser.serverAddress=127.0.0.1","--server.enableCORS=True","--browser.gatherUsageStats=False" ]
volumes: *common-volumes
restart: always
api:

docs/picwish.jpg (new binary file, 178 KiB)

docs/wechat-03.jpg (new binary file, 165 KiB)


@@ -16,4 +16,10 @@ g4f~=0.2.5.4
dashscope~=1.15.0
google.generativeai~=0.4.1
python-multipart~=0.0.9
redis==5.0.3
# if you use pillow~=10.3.0, you will get a "PIL.Image has no attribute 'ANTIALIAS'" error when resizing videos
# please install opencv-python to fix the "PIL.Image has no attribute 'ANTIALIAS'" error
opencv-python
# for azure speech
# https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/9-more-realistic-ai-voices-for-conversations-now-generally/ba-p/4099471
azure-cognitiveservices-speech~=1.37.0
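For context on the Pillow note above: `Image.ANTIALIAS` was removed in Pillow 10 and `Image.LANCZOS` is its drop-in replacement. A shim some projects use instead of the opencv-python route taken here (shown for reference only, not what this repo does):

```python
# Hypothetical compatibility shim, for reference only; this repo instead installs
# opencv-python so moviepy resizes via cv2 rather than the removed PIL constant.
import PIL.Image

if not hasattr(PIL.Image, "ANTIALIAS"):
    PIL.Image.ANTIALIAS = PIL.Image.LANCZOS  # LANCZOS replaced ANTIALIAS in Pillow 10
```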


@@ -1,2 +1,7 @@
@echo off
set CURRENT_DIR=%CD%
echo ***** Current directory: %CURRENT_DIR% *****
set PYTHONPATH=%CURRENT_DIR%
rem set HF_ENDPOINT=https://hf-mirror.com
streamlit run .\webui\Main.py --browser.gatherUsageStats=False --server.enableCORS=True


@@ -1,5 +1,6 @@
import sys
import os
import time
# Add the root directory of the project to the system path to allow importing modules from the project
root_dir = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
@@ -38,7 +39,7 @@ hide_streamlit_style = """
<style>#root > div:nth-child(1) > div > div > div > div > section > div {padding-top: 0rem;}</style>
"""
st.markdown(hide_streamlit_style, unsafe_allow_html=True)
st.title("MoneyPrinterTurbo")
st.title(f"MoneyPrinterTurbo v{config.project_version}")
font_dir = os.path.join(root_dir, "resource", "fonts")
song_dir = os.path.join(root_dir, "resource", "songs")
@@ -62,6 +63,7 @@ def get_all_fonts():
for file in files:
if file.endswith(".ttf") or file.endswith(".ttc"):
fonts.append(file)
fonts.sort()
return fonts
@@ -164,7 +166,6 @@ with st.expander(tr("Basic Settings"), expanded=False):
code = selected_language.split(" - ")[0].strip()
st.session_state['ui_language'] = code
config.ui['language'] = code
config.save_config()
with middle_config_panel:
# openai
@@ -175,7 +176,7 @@ with st.expander(tr("Basic Settings"), expanded=False):
# qwen (通义千问)
# gemini
# ollama
llm_providers = ['OpenAI', 'Moonshot', 'Azure', 'Qwen', 'Gemini', 'Ollama', 'G4f', 'OneAPI']
llm_providers = ['OpenAI', 'Moonshot', 'Azure', 'Qwen', 'Gemini', 'Ollama', 'G4f', 'OneAPI', "Cloudflare"]
saved_llm_provider = config.app.get("llm_provider", "OpenAI").lower()
saved_llm_provider_index = 0
for i, provider in enumerate(llm_providers):
@@ -190,6 +191,7 @@ with st.expander(tr("Basic Settings"), expanded=False):
llm_api_key = config.app.get(f"{llm_provider}_api_key", "")
llm_base_url = config.app.get(f"{llm_provider}_base_url", "")
llm_model_name = config.app.get(f"{llm_provider}_model_name", "")
llm_account_id = config.app.get(f"{llm_provider}_account_id", "")
st_llm_api_key = st.text_input(tr("API Key"), value=llm_api_key, type="password")
st_llm_base_url = st.text_input(tr("Base Url"), value=llm_base_url)
st_llm_model_name = st.text_input(tr("Model Name"), value=llm_model_name)
@@ -200,7 +202,10 @@ with st.expander(tr("Basic Settings"), expanded=False):
if st_llm_model_name:
config.app[f"{llm_provider}_model_name"] = st_llm_model_name
config.save_config()
if llm_provider == 'cloudflare':
st_llm_account_id = st.text_input(tr("Account ID"), value=llm_account_id)
if st_llm_account_id:
config.app[f"{llm_provider}_account_id"] = st_llm_account_id
with right_config_panel:
pexels_api_keys = config.app.get("pexels_api_keys", [])
@@ -212,7 +217,6 @@ with st.expander(tr("Basic Settings"), expanded=False):
pexels_api_key = pexels_api_key.replace(" ", "")
if pexels_api_key:
config.app["pexels_api_keys"] = pexels_api_key.split(",")
config.save_config()
panel = st.columns(3)
left_panel = panel[0]
@@ -295,20 +299,20 @@ with middle_panel:
index=0)
with st.container(border=True):
st.write(tr("Audio Settings"))
voices = voice.get_all_voices(filter_locals=["zh-CN", "zh-HK", "zh-TW", "de-DE", "en-US"])
voices = voice.get_all_azure_voices(filter_locals=["zh-CN", "zh-HK", "zh-TW", "de-DE", "en-US", "fr-FR"])
friendly_names = {
voice: voice.
v: v.
replace("Female", tr("Female")).
replace("Male", tr("Male")).
replace("Neural", "") for
voice in voices}
v in voices}
saved_voice_name = config.ui.get("voice_name", "")
saved_voice_name_index = 0
if saved_voice_name in friendly_names:
saved_voice_name_index = list(friendly_names.keys()).index(saved_voice_name)
else:
for i, voice in enumerate(voices):
if voice.lower().startswith(st.session_state['ui_language'].lower()):
for i, v in enumerate(voices):
if v.lower().startswith(st.session_state['ui_language'].lower()):
saved_voice_name_index = i
break
@@ -319,7 +323,13 @@ with middle_panel:
voice_name = list(friendly_names.keys())[list(friendly_names.values()).index(selected_friendly_name)]
params.voice_name = voice_name
config.ui['voice_name'] = voice_name
config.save_config()
if voice.is_azure_v2_voice(voice_name):
saved_azure_speech_region = config.azure.get(f"speech_region", "")
saved_azure_speech_key = config.azure.get(f"speech_key", "")
azure_speech_region = st.text_input(tr("Speech Region"), value=saved_azure_speech_region)
azure_speech_key = st.text_input(tr("Speech Key"), value=saved_azure_speech_key, type="password")
config.azure["speech_region"] = azure_speech_region
config.azure["speech_key"] = azure_speech_key
params.voice_volume = st.selectbox(tr("Speech Volume"),
options=[0.6, 0.8, 1.0, 1.2, 1.5, 2.0, 3.0, 4.0, 5.0], index=2)
@@ -356,7 +366,6 @@ with right_panel:
saved_font_name_index = font_names.index(saved_font_name)
params.font_name = st.selectbox(tr("Font"), font_names, index=saved_font_name_index)
config.ui['font_name'] = params.font_name
config.save_config()
subtitle_positions = [
(tr("Top"), "top"),
@@ -439,3 +448,5 @@ if start_button:
open_task_folder(task_id)
logger.info(tr("Video Generation Completed"))
scroll_to_bottom()
config.save_config()


@@ -23,6 +23,8 @@
"Number of Videos Generated Simultaneously": "Anzahl der parallel generierten Videos",
"Audio Settings": "**Audio Einstellungen**",
"Speech Synthesis": "Sprachausgabe",
"Speech Region": "Region(:red[Required[Get Region](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
"Speech Key": "API Key(:red[Required[Get API Key](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
"Speech Volume": "Lautstärke der Sprachausgabe",
"Male": "Männlich",
"Female": "Weiblich",
@@ -58,6 +60,6 @@
"Model Name": "Model Name",
"Please Enter the LLM API Key": "Please Enter the **LLM API Key**",
"Please Enter the Pexels API Key": "Please Enter the **Pexels API Key**",
"Get Help": "If you need help, or have any questions, you can join discord for help: https://harryai.cc/moneyprinterturbo"
"Get Help": "If you need help, or have any questions, you can join discord for help: https://harryai.cc"
}
}


@@ -23,6 +23,8 @@
"Number of Videos Generated Simultaneously": "Number of Videos Generated Simultaneously",
"Audio Settings": "**Audio Settings**",
"Speech Synthesis": "Speech Synthesis Voice",
"Speech Region": "Region(:red[Required[Get Region](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
"Speech Key": "API Key(:red[Required[Get API Key](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
"Speech Volume": "Speech Volume (1.0 represents 100%)",
"Male": "Male",
"Female": "Female",
@@ -55,9 +57,10 @@
"LLM Provider": "LLM Provider",
"API Key": "API Key (:red[Required])",
"Base Url": "Base Url",
"Account ID": "Account ID (Get from Cloudflare dashboard)",
"Model Name": "Model Name",
"Please Enter the LLM API Key": "Please Enter the **LLM API Key**",
"Please Enter the Pexels API Key": "Please Enter the **Pexels API Key**",
"Get Help": "If you need help, or have any questions, you can join discord for help: https://harryai.cc/moneyprinterturbo"
"Get Help": "If you need help, or have any questions, you can join discord for help: https://harryai.cc"
}
}


@@ -23,6 +23,8 @@
"Number of Videos Generated Simultaneously": "同时生成视频数量",
"Audio Settings": "**音频设置**",
"Speech Synthesis": "朗读声音(:red[尽量与文案语言保持一致]",
"Speech Region": "服务区域(:red[必填,[点击获取](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
"Speech Key": "API Key(:red[必填,[点击获取](https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SpeechServices)])",
"Speech Volume": "朗读音量1.0表示100%",
"Male": "男性",
"Female": "女性",
@@ -55,9 +57,10 @@
"LLM Provider": "大模型提供商",
"API Key": "API Key (:red[必填,需要到大模型提供商的后台申请])",
"Base Url": "Base Url (可选)",
"Account ID": "账户ID (Cloudflare的dash面板url中获取)",
"Model Name": "模型名称 (:blue[需要到大模型提供商的后台确认被授权的模型名称])",
"Please Enter the LLM API Key": "请先填写大模型 **API Key**",
"Please Enter the Pexels API Key": "请先填写 **Pexels API Key**",
"Get Help": "有任何问题或建议,可以加入 **微信群** 求助或讨论https://harryai.cc/moneyprinterturbo"
"Get Help": "有任何问题或建议,可以加入 **微信群** 求助或讨论https://harryai.cc"
}
}