improve and fix stuff

This commit is contained in:
overcrash 2025-06-18 21:40:45 -03:00
parent 85789b40e5
commit 686fd3f307
9 changed files with 1078 additions and 393 deletions

View File

@ -1,12 +0,0 @@
README Updates: The Chinese README (README.md) was heavily revised to include more English content and better structure, while the English README (README-en.md)
was removed entirely. Much of the documentation, quick start, and FAQs are now unified in a single README and available in English.
CHANGELOG.md Added: A new changelog file was created to track updates and features.
Video Processing Improvements (app/services/video.py):
Enhanced the video combining and processing logic by introducing more ffmpeg parameters for improved video encoding (e.g., preset, CRF, pixel format settings).
Improved resizing methods (using 'lanczos') for better video quality.
Updated functions to better handle merging, audio, and encoding parameters for output files.
Added more robust logic for video preprocessing, merging, and clip writing, with commented-out code for optional features.
Windows Web UI Startup Script (webui.bat):
Improved the script to automatically activate a Python virtual environment if present.
Clarified optional Hugging Face mirror settings.
Overall, this commit modernizes the documentation, improves video encoding quality and flexibility, and makes the Windows startup script more robust and user-friendly.

389
README-en.md Normal file
View File

@ -0,0 +1,389 @@
<div align="center">
<h1 align="center">MoneyPrinterTurbo 💸</h1>
<p align="center">
<a href="https://github.com/harry0703/MoneyPrinterTurbo/stargazers"><img src="https://img.shields.io/github/stars/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="Stargazers"></a>
<a href="https://github.com/harry0703/MoneyPrinterTurbo/issues"><img src="https://img.shields.io/github/issues/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="Issues"></a>
<a href="https://github.com/harry0703/MoneyPrinterTurbo/network/members"><img src="https://img.shields.io/github/forks/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="Forks"></a>
<a href="https://github.com/harry0703/MoneyPrinterTurbo/blob/main/LICENSE"><img src="https://img.shields.io/github/license/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="License"></a>
</p>
<h3>English | <a href="README.md">简体中文</a></h3>
<div align="center">
<a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FMoneyPrinterTurbo | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>
Simply provide a <b>topic</b> or <b>keyword</b> for a video, and it will automatically generate the video copy, video
materials, video subtitles, and video background music before synthesizing a high-definition short video.
### WebUI
![](docs/webui-en.jpg)
### API Interface
![](docs/api.jpg)
</div>
## Special Thanks 🙏
Due to the **deployment** and **usage** of this project, there is a certain threshold for some beginner users. We would
like to express our special thanks to
**RecCloud (AI-Powered Multimedia Service Platform)** for providing a free `AI Video Generator` service based on this
project. It allows for online use without deployment, which is very convenient.
- Chinese version: https://reccloud.cn
- English version: https://reccloud.com
![](docs/reccloud.com.jpg)
## Thanks for Sponsorship 🙏
Thanks to Picwish https://picwish.com for supporting and sponsoring this project, enabling continuous updates and maintenance.
Picwish focuses on the **image processing field**, providing a rich set of **image processing tools** that extremely simplify complex operations, truly making image processing easier.
![picwish.jpg](docs/picwish.com.jpg)
## Features 🎯
- [x] Complete **MVC architecture**, **clearly structured** code, easy to maintain, supports both `API`
and `Web interface`
- [x] Supports **AI-generated** video copy, as well as **customized copy**
- [x] Supports various **high-definition video** sizes
- [x] Portrait 9:16, `1080x1920`
- [x] Landscape 16:9, `1920x1080`
- [x] Supports **batch video generation**, allowing the creation of multiple videos at once, then selecting the most
satisfactory one
- [x] Supports setting the **duration of video clips**, facilitating adjustments to material switching frequency
- [x] Supports video copy in both **Chinese** and **English**
- [x] Supports **multiple voice** synthesis, with **real-time preview** of effects
- [x] Supports **subtitle generation**, with adjustable `font`, `position`, `color`, `size`, and also
supports `subtitle outlining`
- [x] Supports **background music**, either random or specified music files, with adjustable `background music volume`
- [x] Video material sources are **high-definition** and **royalty-free**, and you can also use your own **local materials**
- [x] Supports integration with various models such as **OpenAI**, **Moonshot**, **Azure**, **gpt4free**, **one-api**, **Qwen**, **Google Gemini**, **Ollama**, **DeepSeek**, **ERNIE**, **Pollinations** and more
### Future Plans 📅
- [ ] GPT-SoVITS dubbing support
- [ ] Optimize voice synthesis using large models for more natural and emotionally rich voice output
- [ ] Add video transition effects for a smoother viewing experience
- [ ] Add more video material sources, improve the matching between video materials and script
- [ ] Add video length options: short, medium, long
- [ ] Support more voice synthesis providers, such as OpenAI TTS
- [ ] Automate upload to YouTube platform
## Video Demos 📺
### Portrait 9:16
<table>
<thead>
<tr>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> How to Add Fun to Your Life </th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> What is the Meaning of Life</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/a84d33d5-27a2-4aba-8fd0-9fb2bd91c6a6"></video></td>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/112c9564-d52b-4472-99ad-970b75f66476"></video></td>
</tr>
</tbody>
</table>
### Landscape 16:9
<table>
<thead>
<tr>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> What is the Meaning of Life</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> Why Exercise</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/346ebb15-c55f-47a9-a653-114f08bb8073"></video></td>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/271f2fae-8283-44a0-8aa0-0ed8f9a6fa87"></video></td>
</tr>
</tbody>
</table>
## System Requirements 📦
- Recommended minimum 4 CPU cores or more, 4G of memory or more, GPU is not required
- Windows 10 or MacOS 11.0, and their later versions
## Quick Start 🚀
### Run in Google Colab
Want to try MoneyPrinterTurbo without setting up a local environment? Run it directly in Google Colab!
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/harry0703/MoneyPrinterTurbo/blob/main/docs/MoneyPrinterTurbo.ipynb)
### Windows
Google Drive (v1.2.6): https://drive.google.com/file/d/1HsbzfT7XunkrCrHw5ncUjFX8XX4zAuUh/view?usp=sharing
After downloading, it is recommended to **double-click** `update.bat` first to update to the **latest code**, then double-click `start.bat` to launch
After launching, the browser will open automatically (if it opens blank, it is recommended to use **Chrome** or **Edge**)
### Other Systems
One-click startup packages have not been created yet. See the **Installation & Deployment** section below. It is recommended to use **docker** for deployment, which is more convenient.
## Installation & Deployment 📥
### Prerequisites
#### ① Clone the Project
```shell
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
```
#### ② Modify the Configuration File
- Copy the `config.example.toml` file and rename it to `config.toml`
- Follow the instructions in the `config.toml` file to configure `pexels_api_keys` and `llm_provider`, and according to
the llm_provider's service provider, set up the corresponding API Key
### Docker Deployment 🐳
#### ① Launch the Docker Container
If you haven't installed Docker, please install it first https://www.docker.com/products/docker-desktop/
If you are using a Windows system, please refer to Microsoft's documentation:
1. https://learn.microsoft.com/en-us/windows/wsl/install
2. https://learn.microsoft.com/en-us/windows/wsl/tutorials/wsl-containers
```shell
cd MoneyPrinterTurbo
docker-compose up
```
> NoteThe latest version of docker will automatically install docker compose in the form of a plug-in, and the start command is adjusted to `docker compose up `
#### ② Access the Web Interface
Open your browser and visit http://0.0.0.0:8501
#### ③ Access the API Interface
Open your browser and visit http://0.0.0.0:8080/docs Or http://0.0.0.0:8080/redoc
### Manual Deployment 📦
#### ① Create a Python Virtual Environment
It is recommended to create a Python virtual environment using [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
```shell
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
cd MoneyPrinterTurbo
conda create -n MoneyPrinterTurbo python=3.11
conda activate MoneyPrinterTurbo
pip install -r requirements.txt
```
#### ② Install ImageMagick
###### Windows:
- Download https://imagemagick.org/script/download.php Choose the Windows version, make sure to select the **static library** version, such as ImageMagick-7.1.1-32-Q16-x64-**static**.exe
- Install the downloaded ImageMagick, **do not change the installation path**
- Modify the `config.toml` configuration file, set `imagemagick_path` to your actual installation path
###### MacOS:
```shell
brew install imagemagick
````
###### Ubuntu
```shell
sudo apt-get install imagemagick
```
###### CentOS
```shell
sudo yum install ImageMagick
```
#### ③ Launch the Web Interface 🌐
Note that you need to execute the following commands in the `root directory` of the MoneyPrinterTurbo project
###### Windows
```bat
webui.bat
```
###### MacOS or Linux
```shell
sh webui.sh
```
After launching, the browser will open automatically
#### ④ Launch the API Service 🚀
```shell
python main.py
```
After launching, you can view the `API documentation` at http://127.0.0.1:8080/docs and directly test the interface
online for a quick experience.
## Voice Synthesis 🗣
A list of all supported voices can be viewed here: [Voice List](./docs/voice-list.txt)
2024-04-16 v1.1.2 Added 9 new Azure voice synthesis voices that require API KEY configuration. These voices sound more realistic.
## Subtitle Generation 📜
Currently, there are 2 ways to generate subtitles:
- **edge**: Faster generation speed, better performance, no specific requirements for computer configuration, but the
quality may be unstable
- **whisper**: Slower generation speed, poorer performance, specific requirements for computer configuration, but more
reliable quality
You can switch between them by modifying the `subtitle_provider` in the `config.toml` configuration file
It is recommended to use `edge` mode, and switch to `whisper` mode if the quality of the subtitles generated is not
satisfactory.
> Note:
>
> 1. In whisper mode, you need to download a model file from HuggingFace, about 3GB in size, please ensure good internet connectivity
> 2. If left blank, it means no subtitles will be generated.
> Since HuggingFace is not accessible in China, you can use the following methods to download the `whisper-large-v3` model file
Download links:
- Baidu Netdisk: https://pan.baidu.com/s/11h3Q6tsDtjQKTjUu3sc5cA?pwd=xjs9
- Quark Netdisk: https://pan.quark.cn/s/3ee3d991d64b
After downloading the model, extract it and place the entire directory in `.\MoneyPrinterTurbo\models`,
The final file path should look like this: `.\MoneyPrinterTurbo\models\whisper-large-v3`
```
MoneyPrinterTurbo
├─models
│ └─whisper-large-v3
│ config.json
│ model.bin
│ preprocessor_config.json
│ tokenizer.json
│ vocabulary.json
```
## Background Music 🎵
Background music for videos is located in the project's `resource/songs` directory.
> The current project includes some default music from YouTube videos. If there are copyright issues, please delete
> them.
## Subtitle Fonts 🅰
Fonts for rendering video subtitles are located in the project's `resource/fonts` directory, and you can also add your
own fonts.
## Common Questions 🤔
### ❓RuntimeError: No ffmpeg exe could be found
Normally, ffmpeg will be automatically downloaded and detected.
However, if your environment has issues preventing automatic downloads, you may encounter the following error:
```
RuntimeError: No ffmpeg exe could be found.
Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
```
In this case, you can download ffmpeg from https://www.gyan.dev/ffmpeg/builds/, unzip it, and set `ffmpeg_path` to your
actual installation path.
```toml
[app]
# Please set according to your actual path, note that Windows path separators are \\
ffmpeg_path = "C:\\Users\\harry\\Downloads\\ffmpeg.exe"
```
### ❓ImageMagick is not installed on your computer
[issue 33](https://github.com/harry0703/MoneyPrinterTurbo/issues/33)
1. Follow the `example configuration` provided `download address` to
install https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-30-Q16-x64-static.exe, using the static library
2. Do not install in a path with Chinese characters to avoid unpredictable issues
[issue 54](https://github.com/harry0703/MoneyPrinterTurbo/issues/54#issuecomment-2017842022)
For Linux systems, you can manually install it, refer to https://cn.linux-console.net/?p=16978
Thanks to [@wangwenqiao666](https://github.com/wangwenqiao666) for their research and exploration
### ❓ImageMagick's security policy prevents operations related to temporary file @/tmp/tmpur5hyyto.txt
You can find these policies in ImageMagick's configuration file policy.xml.
This file is usually located in /etc/ImageMagick-`X`/ or a similar location in the ImageMagick installation directory.
Modify the entry containing `pattern="@"`, change `rights="none"` to `rights="read|write"` to allow read and write operations on files.
### ❓OSError: [Errno 24] Too many open files
This issue is caused by the system's limit on the number of open files. You can solve it by modifying the system's file open limit.
Check the current limit:
```shell
ulimit -n
```
If it's too low, you can increase it, for example:
```shell
ulimit -n 10240
```
### ❓Whisper model download failed, with the following error
LocalEntryNotfoundEror: Cannot find an appropriate cached snapshotfolderfor the specified revision on the local disk and
outgoing trafic has been disabled.
To enablerepo look-ups and downloads online, pass 'local files only=False' as input.
or
An error occured while synchronizing the model Systran/faster-whisper-large-v3 from the Hugging Face Hub:
An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the
specified revision on the local disk. Please check your internet connection and try again.
Trying to load the model directly from the local cache, if it exists.
Solution: [Click to see how to manually download the model from netdisk](#subtitle-generation-)
## Feedback & Suggestions 📢
- You can submit an [issue](https://github.com/harry0703/MoneyPrinterTurbo/issues) or
a [pull request](https://github.com/harry0703/MoneyPrinterTurbo/pulls).
## License 📝
Click to view the [`LICENSE`](LICENSE) file
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=harry0703/MoneyPrinterTurbo&type=Date)](https://star-history.com/#harry0703/MoneyPrinterTurbo&Date)

345
README.md
View File

@ -1,75 +1,105 @@
<div align="center">
<h1 align="center">MoneyPrinterTurbo 💸</h1>
<p align="center">
<a href="https://github.com/harry0703/MoneyPrinterTurbo/stargazers"><img src="https://img.shields.io/github/stars/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="Stargazers"></a>
<a href="https://github.com/harry0703/MoneyPrinterTurbo/issues"><img src="https://img.shields.io/github/issues/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="Issues"></a>
<a href="https://github.com/harry0703/MoneyPrinterTurbo/network/members"><img src="https://img.shields.io/github/forks/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="Forks"></a>
<a href="https://github.com/harry0703/MoneyPrinterTurbo/blob/main/LICENSE"><img src="https://img.shields.io/github/license/harry0703/MoneyPrinterTurbo.svg?style=for-the-badge" alt="License"></a>
</p>
<br>
<h3>简体中文 | <a href="README-en.md">English</a></h3>
<div align="center">
<a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FMoneyPrinterTurbo | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>
<br>
只需提供一个视频 <b>主题</b><b>关键词</b> ,就可以全自动生成视频文案、视频素材、视频字幕、视频背景音乐,然后合成一个高清的短视频。
<br>
Simply provide a <b>topic</b> or <b>keyword</b> for a video, and it will automatically generate the video copy, video
materials, video subtitles, and video background music before synthesizing a high-definition short video.
<h4>Web界面</h4>
### WebUI
![](docs/webui.jpg)
![](docs/webui-en.jpg)
### API Interface
<h4>API界面</h4>
![](docs/api.jpg)
</div>
## Features 🎯
## 特别感谢 🙏
- [x] Complete **MVC architecture**, **clearly structured** code, easy to maintain, supports both `API`
and `Web interface`
- [x] Supports **AI-generated** video copy, as well as **customized copy**
- [x] Supports various **high-definition video** sizes
- [x] Portrait 9:16, `1080x1920`
- [x] Landscape 16:9, `1920x1080`
- [x] Supports **batch video generation**, allowing the creation of multiple videos at once, then selecting the most
satisfactory one
- [x] Supports setting the **duration of video clips**, facilitating adjustments to material switching frequency
- [x] Supports video copy in both **Chinese** and **English**
- [x] Supports **multiple voice** synthesis, with **real-time preview** of effects
- [x] Supports **subtitle generation**, with adjustable `font`, `position`, `color`, `size`, and also
supports `subtitle outlining`
- [x] Supports **background music**, either random or specified music files, with adjustable `background music volume`
- [x] Video material sources are **high-definition** and **royalty-free**, and you can also use your own **local materials**
- [x] Supports integration with various models such as **OpenAI**, **Moonshot**, **Azure**, **gpt4free**, **one-api**, **Qwen**, **Google Gemini**, **Ollama**, **DeepSeek**, **ERNIE**, **Pollinations** and more
由于该项目的 **部署****使用**,对于一些小白用户来说,还是 **有一定的门槛**,在此特别感谢
**录咖AI智能 多媒体服务平台)** 网站基于该项目,提供的免费`AI视频生成器`服务,可以不用部署,直接在线使用,非常方便。
### Future Plans 📅
- 中文版https://reccloud.cn
- 英文版https://reccloud.com
- [ ] GPT-SoVITS dubbing support
- [ ] Optimize voice synthesis using large models for more natural and emotionally rich voice output
- [ ] Add video transition effects for a smoother viewing experience
- [ ] Add more video material sources, improve the matching between video materials and script
- [ ] Add video length options: short, medium, long
- [ ] Support more voice synthesis providers, such as OpenAI TTS
- [ ] Automate upload to YouTube platform
![](docs/reccloud.cn.jpg)
## Video Demos 📺
## 感谢赞助 🙏
### Portrait 9:16
感谢佐糖 https://picwish.cn 对该项目的支持和赞助,使得该项目能够持续的更新和维护。
佐糖专注于**图像处理领域**,提供丰富的**图像处理工具**,将复杂操作极致简化,真正实现让图像处理更简单。
![picwish.jpg](docs/picwish.jpg)
## 功能特性 🎯
- [x] 完整的 **MVC架构**,代码 **结构清晰**,易于维护,支持 `API``Web界面`
- [x] 支持视频文案 **AI自动生成**,也可以**自定义文案**
- [x] 支持多种 **高清视频** 尺寸
- [x] 竖屏 9:16`1080x1920`
- [x] 横屏 16:9`1920x1080`
- [x] 支持 **批量视频生成**,可以一次生成多个视频,然后选择一个最满意的
- [x] 支持 **视频片段时长** 设置,方便调节素材切换频率
- [x] 支持 **中文****英文** 视频文案
- [x] 支持 **多种语音** 合成,可 **实时试听** 效果
- [x] 支持 **字幕生成**,可以调整 `字体`、`位置`、`颜色`、`大小`,同时支持`字幕描边`设置
- [x] 支持 **背景音乐**,随机或者指定音乐文件,可设置`背景音乐音量`
- [x] 视频素材来源 **高清**,而且 **无版权**,也可以使用自己的 **本地素材**
- [x] 支持 **OpenAI**、**Moonshot**、**Azure**、**gpt4free**、**one-api**、**通义千问**、**Google Gemini**、**Ollama**、**DeepSeek**、 **文心一言**, **Pollinations** 等多种模型接入
- 中国用户建议使用 **DeepSeek****Moonshot** 作为大模型提供商国内可直接访问不需要VPN。注册就送额度基本够用
### 后期计划 📅
- [ ] GPT-SoVITS 配音支持
- [ ] 优化语音合成,利用大模型,使其合成的声音,更加自然,情绪更加丰富
- [ ] 增加视频转场效果,使其看起来更加的流畅
- [ ] 增加更多视频素材来源,优化视频素材和文案的匹配度
- [ ] 增加视频长度选项:短、中、长
- [ ] 支持更多的语音合成服务商,比如 OpenAI TTS
- [ ] 自动上传到YouTube平台
## 视频演示 📺
### 竖屏 9:16
<table>
<thead>
<tr>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> How to Add Fun to Your Life </th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> What is the Meaning of Life</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> 《如何增加生活的乐趣》</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> 《金钱的作用》<br>更真实的合成声音</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> 《生命的意义是什么》</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/a84d33d5-27a2-4aba-8fd0-9fb2bd91c6a6"></video></td>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/af2f3b0b-002e-49fe-b161-18ba91c055e8"></video></td>
<td align="center"><video src="https://github.com/harry0703/MoneyPrinterTurbo/assets/4928832/112c9564-d52b-4472-99ad-970b75f66476"></video></td>
</tr>
</tbody>
</table>
### Landscape 16:9
### 横屏 16:9
<table>
<thead>
<tr>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> What is the Meaning of Life</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji> Why Exercise</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji>《生命的意义是什么》</th>
<th align="center"><g-emoji class="g-emoji" alias="arrow_forward">▶️</g-emoji>《为什么要运动》</th>
</tr>
</thead>
<tbody>
@ -80,79 +110,86 @@ materials, video subtitles, and video background music before synthesizing a hig
</tbody>
</table>
## System Requirements 📦
## 配置要求 📦
- Recommended minimum 4 CPU cores or more, 4G of memory or more, GPU is not required
- Windows 10 or MacOS 11.0, and their later versions
- 建议最低 CPU **4核** 或以上,内存 **4G** 或以上,显卡非必须
- Windows 10 或 MacOS 11.0 以上系统
## New updates and features will be released in the [changelog](CHANGELOG.md) file
## Quick Start 🚀
## 快速开始 🚀
### Run in Google Colab
Want to try MoneyPrinterTurbo without setting up a local environment? Run it directly in Google Colab!
### 在 Google Colab 中运行
免去本地环境配置,点击直接在 Google Colab 中快速体验 MoneyPrinterTurbo
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/harry0703/MoneyPrinterTurbo/blob/main/docs/MoneyPrinterTurbo.ipynb)
### Windows
### Windows一键启动包
Google Drive (v1.2.6): https://drive.google.com/file/d/1HsbzfT7XunkrCrHw5ncUjFX8XX4zAuUh/view?usp=sharing
下载一键启动包,解压直接使用(路径不要有 **中文**、**特殊字符**、**空格**
After downloading, it is recommended to **double-click** `update.bat` first to update to the **latest code**, then double-click `start.bat` to launch
- 百度网盘v1.2.6: https://pan.baidu.com/s/1wg0UaIyXpO3SqIpaq790SQ?pwd=sbqx 提取码: sbqx
- Google Drive (v1.2.6): https://drive.google.com/file/d/1HsbzfT7XunkrCrHw5ncUjFX8XX4zAuUh/view?usp=sharing
After launching, the browser will open automatically (if it opens blank, it is recommended to use **Chrome** or **Edge**)
下载后,建议先**双击执行** `update.bat` 更新到**最新代码**,然后双击 `start.bat` 启动
### Other Systems
启动后,会自动打开浏览器(如果打开是空白,建议换成 **Chrome** 或者 **Edge** 打开)
One-click startup packages have not been created yet. See the **Installation & Deployment** section below. It is recommended to use **docker** for deployment, which is more convenient.
## 安装部署 📥
## Installation & Deployment 📥
### 前提条件
### Prerequisites
- 尽量不要使用 **中文路径**,避免出现一些无法预料的问题
- 请确保你的 **网络** 是正常的VPN需要打开`全局流量`模式
#### ① Clone the Project
#### ① 克隆代码
```shell
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
```
#### ② Modify the Configuration File
#### ② 修改配置文件(可选,建议启动后也可以在 WebUI 里面配置)
- Copy the `config.example.toml` file and rename it to `config.toml`
- Follow the instructions in the `config.toml` file to configure `pexels_api_keys` and `llm_provider`, and according to
the llm_provider's service provider, set up the corresponding API Key
- `config.example.toml` 文件复制一份,命名为 `config.toml`
- 按照 `config.toml` 文件中的说明,配置好 `pexels_api_keys``llm_provider`,并根据 llm_provider 对应的服务商,配置相关的
API Key
### Docker Deployment 🐳
### Docker部署 🐳
#### ① Launch the Docker Container
#### ① 启动Docker
If you haven't installed Docker, please install it first https://www.docker.com/products/docker-desktop/
If you are using a Windows system, please refer to Microsoft's documentation:
如果未安装 Docker请先安装 https://www.docker.com/products/docker-desktop/
1. https://learn.microsoft.com/en-us/windows/wsl/install
2. https://learn.microsoft.com/en-us/windows/wsl/tutorials/wsl-containers
如果是Windows系统请参考微软的文档
1. https://learn.microsoft.com/zh-cn/windows/wsl/install
2. https://learn.microsoft.com/zh-cn/windows/wsl/tutorials/wsl-containers
```shell
cd MoneyPrinterTurbo
docker-compose up
```
> NoteThe latest version of docker will automatically install docker compose in the form of a plug-in, and the start command is adjusted to `docker compose up `
> 注意最新版的docker安装时会自动以插件的形式安装docker compose启动命令调整为docker compose up
#### ② Access the Web Interface
#### ② 访问Web界面
Open your browser and visit http://0.0.0.0:8501
打开浏览器,访问 http://0.0.0.0:8501
#### ③ Access the API Interface
#### ③ 访问API文档
Open your browser and visit http://0.0.0.0:8080/docs Or http://0.0.0.0:8080/redoc
打开浏览器,访问 http://0.0.0.0:8080/docs 或者 http://0.0.0.0:8080/redoc
### Manual Deployment 📦
### 手动部署 📦
#### ① Create a Python Virtual Environment
> 视频教程
It is recommended to create a Python virtual environment using [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
- 完整的使用演示https://v.douyin.com/iFhnwsKY/
- 如何在Windows上部署https://v.douyin.com/iFyjoW3M
#### ① 创建虚拟环境
建议使用 [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html) 创建 python 虚拟环境
```shell
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
@ -162,35 +199,30 @@ conda activate MoneyPrinterTurbo
pip install -r requirements.txt
```
#### ② Install ImageMagick
#### ② 安装好 ImageMagick
###### Windows:
- Windows:
- 下载 https://imagemagick.org/script/download.php 选择Windows版本切记一定要选择 **静态库** 版本,比如
ImageMagick-7.1.1-32-Q16-x64-**static**.exe
- 安装下载好的 ImageMagick**注意不要修改安装路径**
- 修改 `配置文件 config.toml` 中的 `imagemagick_path` 为你的 **实际安装路径**
- Download https://imagemagick.org/script/download.php Choose the Windows version, make sure to select the **static library** version, such as ImageMagick-7.1.1-32-Q16-x64-**static**.exe
- Install the downloaded ImageMagick, **do not change the installation path**
- Modify the `config.toml` configuration file, set `imagemagick_path` to your actual installation path
- MacOS:
```shell
brew install imagemagick
````
- Ubuntu
```shell
sudo apt-get install imagemagick
```
- CentOS
```shell
sudo yum install ImageMagick
```
###### MacOS:
#### ③ 启动Web界面 🌐
```shell
brew install imagemagick
````
###### Ubuntu
```shell
sudo apt-get install imagemagick
```
###### CentOS
```shell
sudo yum install ImageMagick
```
#### ③ Launch the Web Interface 🌐
Note that you need to execute the following commands in the `root directory` of the MoneyPrinterTurbo project
注意需要到 MoneyPrinterTurbo 项目 `根目录` 下执行以下命令
###### Windows
@ -204,54 +236,50 @@ webui.bat
sh webui.sh
```
After launching, the browser will open automatically
启动后,会自动打开浏览器(如果打开是空白,建议换成 **Chrome** 或者 **Edge** 打开)
#### ④ Launch the API Service 🚀
#### ④ 启动API服务 🚀
```shell
python main.py
```
After launching, you can view the `API documentation` at http://127.0.0.1:8080/docs and directly test the interface
online for a quick experience.
启动后,可以查看 `API文档` http://127.0.0.1:8080/docs 或者 http://127.0.0.1:8080/redoc 直接在线调试接口,快速体验。
## Voice Synthesis 🗣
## 语音合成 🗣
A list of all supported voices can be viewed here: [Voice List](./docs/voice-list.txt)
所有支持的声音列表,可以查看:[声音列表](./docs/voice-list.txt)
2024-04-16 v1.1.2 Added 9 new Azure voice synthesis voices that require API KEY configuration. These voices sound more realistic.
2024-04-16 v1.1.2 新增了9种Azure的语音合成声音需要配置API KEY该声音合成的更加真实。
## Subtitle Generation 📜
## 字幕生成 📜
Currently, there are 2 ways to generate subtitles:
当前支持2种字幕生成方式
- **edge**: Faster generation speed, better performance, no specific requirements for computer configuration, but the
quality may be unstable
- **whisper**: Slower generation speed, poorer performance, specific requirements for computer configuration, but more
reliable quality
- **edge**: 生成`速度快`,性能更好,对电脑配置没有要求,但是质量可能不稳定
- **whisper**: 生成`速度慢`,性能较差,对电脑配置有一定要求,但是`质量更可靠`。
You can switch between them by modifying the `subtitle_provider` in the `config.toml` configuration file
可以修改 `config.toml` 配置文件中的 `subtitle_provider` 进行切换
It is recommended to use `edge` mode, and switch to `whisper` mode if the quality of the subtitles generated is not
satisfactory.
建议使用 `edge` 模式,如果生成的字幕质量不好,再切换到 `whisper` 模式
> Note:
>
> 1. In whisper mode, you need to download a model file from HuggingFace, about 3GB in size, please ensure good internet connectivity
> 2. If left blank, it means no subtitles will be generated.
> 注意:
> Since HuggingFace is not accessible in China, you can use the following methods to download the `whisper-large-v3` model file
1. whisper 模式下需要到 HuggingFace 下载一个模型文件,大约 3GB 左右,请确保网络通畅
2. 如果留空,表示不生成字幕。
Download links:
> 由于国内无法访问 HuggingFace可以使用以下方法下载 `whisper-large-v3` 的模型文件
- Baidu Netdisk: https://pan.baidu.com/s/11h3Q6tsDtjQKTjUu3sc5cA?pwd=xjs9
- Quark Netdisk: https://pan.quark.cn/s/3ee3d991d64b
下载地址:
After downloading the model, extract it and place the entire directory in `.\MoneyPrinterTurbo\models`,
The final file path should look like this: `.\MoneyPrinterTurbo\models\whisper-large-v3`
- 百度网盘: https://pan.baidu.com/s/11h3Q6tsDtjQKTjUu3sc5cA?pwd=xjs9
- 夸克网盘https://pan.quark.cn/s/3ee3d991d64b
模型下载后解压,整个目录放到 `.\MoneyPrinterTurbo\models` 里面,
最终的文件路径应该是这样: `.\MoneyPrinterTurbo\models\whisper-large-v3`
```
MoneyPrinterTurbo
MoneyPrinterTurbo
├─models
│ └─whisper-large-v3
│ config.json
@ -261,98 +289,81 @@ MoneyPrinterTurbo
│ vocabulary.json
```
## Background Music 🎵
## 背景音乐 🎵
Background music for videos is located in the project's `resource/songs` directory.
> The current project includes some default music from YouTube videos. If there are copyright issues, please delete
> them.
用于视频的背景音乐,位于项目的 `resource/songs` 目录下。
> 当前项目里面放了一些默认的音乐,来自于 YouTube 视频,如有侵权,请删除。
## Subtitle Fonts 🅰
## 字幕字体 🅰
Fonts for rendering video subtitles are located in the project's `resource/fonts` directory, and you can also add your
own fonts.
用于视频字幕的渲染,位于项目的 `resource/fonts` 目录下,你也可以放进去自己的字体。
## Common Questions 🤔
## 常见问题 🤔
### ❓RuntimeError: No ffmpeg exe could be found
Normally, ffmpeg will be automatically downloaded and detected.
However, if your environment has issues preventing automatic downloads, you may encounter the following error:
通常情况下ffmpeg 会被自动下载,并且会被自动检测到。
但是如果你的环境有问题,无法自动下载,可能会遇到如下错误:
```
RuntimeError: No ffmpeg exe could be found.
Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
```
In this case, you can download ffmpeg from https://www.gyan.dev/ffmpeg/builds/, unzip it, and set `ffmpeg_path` to your
actual installation path.
此时你可以从 https://www.gyan.dev/ffmpeg/builds/ 下载ffmpeg解压后设置 `ffmpeg_path` 为你的实际安装路径即可。
```toml
[app]
# Please set according to your actual path, note that Windows path separators are \\
# 请根据你的实际路径设置,注意 Windows 路径分隔符为 \\
ffmpeg_path = "C:\\Users\\harry\\Downloads\\ffmpeg.exe"
```
### ❓ImageMagick is not installed on your computer
### ❓ImageMagick的安全策略阻止了与临时文件@/tmp/tmpur5hyyto.txt相关的操作
[issue 33](https://github.com/harry0703/MoneyPrinterTurbo/issues/33)
1. Follow the `example configuration` provided `download address` to
install https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-30-Q16-x64-static.exe, using the static library
2. Do not install in a path with Chinese characters to avoid unpredictable issues
[issue 54](https://github.com/harry0703/MoneyPrinterTurbo/issues/54#issuecomment-2017842022)
For Linux systems, you can manually install it, refer to https://cn.linux-console.net/?p=16978
Thanks to [@wangwenqiao666](https://github.com/wangwenqiao666) for their research and exploration
### ❓ImageMagick's security policy prevents operations related to temporary file @/tmp/tmpur5hyyto.txt
You can find these policies in ImageMagick's configuration file policy.xml.
This file is usually located in /etc/ImageMagick-`X`/ or a similar location in the ImageMagick installation directory.
Modify the entry containing `pattern="@"`, change `rights="none"` to `rights="read|write"` to allow read and write operations on files.
可以在ImageMagick的配置文件policy.xml中找到这些策略。
这个文件通常位于 /etc/ImageMagick-`X`/ 或 ImageMagick 安装目录的类似位置。
修改包含`pattern="@"`的条目,将`rights="none"`更改为`rights="read|write"`以允许对文件的读写操作。
### ❓OSError: [Errno 24] Too many open files
This issue is caused by the system's limit on the number of open files. You can solve it by modifying the system's file open limit.
这个问题是由于系统打开文件数限制导致的,可以通过修改系统的文件打开数限制来解决。
Check the current limit:
查看当前限制
```shell
ulimit -n
```
If it's too low, you can increase it, for example:
如果过低,可以调高一些,比如
```shell
ulimit -n 10240
```
### ❓Whisper model download failed, with the following error
### ❓Whisper 模型下载失败,出现如下错误
LocalEntryNotfoundEror: Cannot find an appropriate cached snapshotfolderfor the specified revision on the local disk and
outgoing trafic has been disabled.
To enablerepo look-ups and downloads online, pass 'local files only=False' as input.
or
或者
An error occured while synchronizing the model Systran/faster-whisper-large-v3 from the Hugging Face Hub:
An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the
specified revision on the local disk. Please check your internet connection and try again.
Trying to load the model directly from the local cache, if it exists.
Solution: [Click to see how to manually download the model from netdisk](#subtitle-generation-)
解决方法:[点击查看如何从网盘手动下载模型](#%E5%AD%97%E5%B9%95%E7%94%9F%E6%88%90-)
## Feedback & Suggestions 📢
## 反馈建议 📢
- You can submit an [issue](https://github.com/harry0703/MoneyPrinterTurbo/issues) or
a [pull request](https://github.com/harry0703/MoneyPrinterTurbo/pulls).
- 可以提交 [issue](https://github.com/harry0703/MoneyPrinterTurbo/issues)
或者 [pull request](https://github.com/harry0703/MoneyPrinterTurbo/pulls)。
## License 📝
## 许可证 📝
Click to view the [`LICENSE`](LICENSE) file
点击查看 [`LICENSE`](LICENSE) 文件
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=harry0703/MoneyPrinterTurbo&type=Date)](https://star-history.com/#harry0703/MoneyPrinterTurbo&Date)
[![Star History Chart](https://api.star-history.com/svg?repos=harry0703/MoneyPrinterTurbo&type=Date)](https://star-history.com/#harry0703/MoneyPrinterTurbo&Date)

View File

@ -4,7 +4,7 @@ import pathlib
import shutil
from typing import Union
from fastapi import BackgroundTasks, Depends, Path, Request, UploadFile
from fastapi import BackgroundTasks, Depends, Path, Query, Request, UploadFile
from fastapi.params import File
from fastapi.responses import FileResponse, StreamingResponse
from loguru import logger
@ -41,7 +41,10 @@ _redis_db = config.app.get("redis_db", 0)
_redis_password = config.app.get("redis_password", None)
_max_concurrent_tasks = config.app.get("max_concurrent_tasks", 5)
redis_url = f"redis://:{_redis_password}@{_redis_host}:{_redis_port}/{_redis_db}"
if _redis_password:
redis_url = f"redis://:{_redis_password}@{_redis_host}:{_redis_port}/{_redis_db}"
else:
redis_url = f"redis://{_redis_host}:{_redis_port}/{_redis_db}"
# 根据配置选择合适的任务管理器
if _enable_redis:
task_manager = RedisTaskManager(
@ -94,8 +97,6 @@ def create_task(
task_id=task_id, status_code=400, message=f"{request_id}: {str(e)}"
)
from fastapi import Query
@router.get("/tasks", response_model=TaskQueryResponse, summary="Get all tasks")
def get_all_tasks(request: Request, page: int = Query(1, ge=1), page_size: int = Query(10, ge=1)):
request_id = base.get_task_id(request)
@ -131,7 +132,7 @@ def get_task(
def file_to_uri(file):
if not file.startswith(endpoint):
_uri_path = v.replace(task_dir, "tasks").replace("\\", "/")
_uri_path = file.replace(task_dir, "tasks").replace("\\", "/")
_uri_path = f"{endpoint}/{_uri_path}"
else:
_uri_path = file
@ -227,20 +228,44 @@ def upload_bgm_file(request: Request, file: UploadFile = File(...)):
async def stream_video(request: Request, file_path: str):
tasks_dir = utils.task_dir()
video_path = os.path.join(tasks_dir, file_path)
# Check if the file exists
if not os.path.exists(video_path):
raise HttpException(
"", status_code=404, message=f"File not found: {file_path}"
)
range_header = request.headers.get("Range")
video_size = os.path.getsize(video_path)
start, end = 0, video_size - 1
length = video_size
if range_header:
range_ = range_header.split("bytes=")[1]
start, end = [int(part) if part else None for part in range_.split("-")]
if start is None:
start = video_size - end
end = video_size - 1
if end is None:
end = video_size - 1
length = end - start + 1
try:
range_ = range_header.split("bytes=")[1]
start, end = [int(part) if part else None for part in range_.split("-")]
if start is None and end is not None:
# Format: bytes=-N (last N bytes)
start = max(0, video_size - end)
end = video_size - 1
elif end is None:
# Format: bytes=N- (from byte N to the end)
end = video_size - 1
# Ensure values are within valid range
start = max(0, min(start, video_size - 1))
end = min(end, video_size - 1)
if start > end:
# Invalid range, serve entire file
start, end = 0, video_size - 1
length = end - start + 1
except (ValueError, IndexError):
# On parsing error, serve entire content
start, end = 0, video_size - 1
length = video_size
def file_iterator(file_path, offset=0, bytes_to_read=None):
with open(file_path, "rb") as f:
@ -258,30 +283,54 @@ async def stream_video(request: Request, file_path: str):
file_iterator(video_path, start, length), media_type="video/mp4"
)
response.headers["Content-Range"] = f"bytes {start}-{end}/{video_size}"
response.headers["Accept-Ranges"] = "bytes"
response.headers["Content-Length"] = str(length)
response.status_code = 206 # Partial Content
return response
@router.get("/download/{file_path:path}")
async def download_video(_: Request, file_path: str):
async def download_video(request: Request, file_path: str):
"""
download video
:param _: Request request
:param request: Request request
:param file_path: video file path, eg: /cd1727ed-3473-42a2-a7da-4faafafec72b/final-1.mp4
:return: video file
"""
tasks_dir = utils.task_dir()
video_path = os.path.join(tasks_dir, file_path)
file_path = pathlib.Path(video_path)
filename = file_path.stem
extension = file_path.suffix
headers = {"Content-Disposition": f"attachment; filename={filename}{extension}"}
return FileResponse(
path=video_path,
headers=headers,
filename=f"{filename}{extension}",
media_type=f"video/{extension[1:]}",
)
try:
tasks_dir = utils.task_dir()
video_path = os.path.join(tasks_dir, file_path)
# Check if the file exists
if not os.path.exists(video_path):
raise HttpException(
"", status_code=404, message=f"File not found: {file_path}"
)
# Check if the file is readable
if not os.access(video_path, os.R_OK):
logger.error(f"File not readable: {video_path}")
raise HttpException(
"", status_code=403, message=f"File not accessible: {file_path}"
)
# Get the filename and extension
path_obj = pathlib.Path(video_path)
filename = path_obj.stem
extension = path_obj.suffix
# Determine appropriate media type
media_type = "application/octet-stream"
if extension.lower() in ['.mp4', '.webm']:
media_type = f"video/{extension[1:]}"
headers = {"Content-Disposition": f"attachment; filename={filename}{extension}"}
logger.info(f"Sending file: {video_path}, size: {os.path.getsize(video_path)}")
return FileResponse(
path=video_path,
headers=headers,
filename=f"{filename}{extension}",
media_type=media_type,
)
except Exception as e:
logger.exception(f"Error downloading file: {str(e)}")
raise HttpException(
"", status_code=500, message=f"Failed to download file: {str(e)}"
)

View File

@ -1,7 +1,6 @@
import json
import logging
import re
import requests
from typing import List
import g4f
@ -83,61 +82,23 @@ def _generate_response(prompt: str) -> str:
raise ValueError(
f"{llm_provider}: secret_key is not set, please set it in the config.toml file."
)
elif llm_provider == "pollinations":
try:
base_url = config.app.get("pollinations_base_url", "")
if not base_url:
base_url = "https://text.pollinations.ai/openai"
model_name = config.app.get("pollinations_model_name", "openai-fast")
# Prepare the payload
payload = {
"model": model_name,
"messages": [
{"role": "user", "content": prompt}
],
"seed": 101 # Optional but helps with reproducibility
}
# Optional parameters if configured
if config.app.get("pollinations_private"):
payload["private"] = True
if config.app.get("pollinations_referrer"):
payload["referrer"] = config.app.get("pollinations_referrer")
headers = {
"Content-Type": "application/json"
}
# Make the API request
response = requests.post(base_url, headers=headers, json=payload)
response.raise_for_status()
result = response.json()
if result and "choices" in result and len(result["choices"]) > 0:
content = result["choices"][0]["message"]["content"]
return content.replace("\n", "")
else:
raise Exception(f"[{llm_provider}] returned an invalid response format")
except requests.exceptions.RequestException as e:
raise Exception(f"[{llm_provider}] request failed: {str(e)}")
except Exception as e:
raise Exception(f"[{llm_provider}] error: {str(e)}")
else:
raise ValueError(
"llm_provider is not set, please set it in the config.toml file."
)
if llm_provider not in ["pollinations", "ollama"]: # Skip validation for providers that don't require API key
if not api_key:
raise ValueError(
f"{llm_provider}: api_key is not set, please set it in the config.toml file."
)
if not model_name:
raise ValueError(
f"{llm_provider}: model_name is not set, please set it in the config.toml file."
)
if not base_url:
raise ValueError(
f"{llm_provider}: base_url is not set, please set it in the config.toml file."
)
if not api_key:
raise ValueError(
f"{llm_provider}: api_key is not set, please set it in the config.toml file."
)
if not model_name:
raise ValueError(
f"{llm_provider}: model_name is not set, please set it in the config.toml file."
)
if not base_url:
raise ValueError(
f"{llm_provider}: base_url is not set, please set it in the config.toml file."
)
if llm_provider == "qwen":
import dashscope
@ -211,6 +172,8 @@ def _generate_response(prompt: str) -> str:
return generated_text
if llm_provider == "cloudflare":
import requests
response = requests.post(
f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model_name}",
headers={"Authorization": f"Bearer {api_key}"},
@ -229,15 +192,20 @@ def _generate_response(prompt: str) -> str:
return result["result"]["response"]
if llm_provider == "ernie":
response = requests.post(
"https://aip.baidubce.com/oauth/2.0/token",
params={
"grant_type": "client_credentials",
"client_id": api_key,
"client_secret": secret_key,
}
import requests
params = {
"grant_type": "client_credentials",
"client_id": api_key,
"client_secret": secret_key,
}
access_token = (
requests.post(
"https://aip.baidubce.com/oauth/2.0/token", params=params
)
.json()
.get("access_token")
)
access_token = response.json().get("access_token")
url = f"{base_url}?access_token={access_token}"
payload = json.dumps(
@ -441,4 +409,3 @@ if __name__ == "__main__":
)
print("######################")
print(search_terms)

View File

@ -1,6 +1,6 @@
from moviepy import Clip, vfx
#from moviepy import Clip
#import moviepy.video.fx.all as vfx
# FadeIn
def fadein_transition(clip: Clip, t: float) -> Clip:
return clip.with_effects([vfx.FadeIn(t)])

View File

@ -4,7 +4,9 @@ import os
import random
import gc
import shutil
import uuid
from typing import List
import multiprocessing
from loguru import logger
from moviepy import (
AudioFileClip,
@ -18,7 +20,8 @@ from moviepy import (
concatenate_videoclips,
)
from moviepy.video.tools.subtitles import SubtitlesClip
from PIL import ImageFont
from moviepy.video.io.ffmpeg_writer import FFMPEG_VideoWriter
from PIL import Image, ImageEnhance, ImageFont
from app.models import const
from app.models.schema import (
@ -47,45 +50,135 @@ class SubClippedVideoClip:
return f"SubClippedVideoClip(file_path={self.file_path}, start_time={self.start_time}, end_time={self.end_time}, duration={self.duration}, width={self.width}, height={self.height})"
# Improved video quality settings
audio_codec = "aac"
video_codec = "libx264"
fps = 30
video_bitrate = "25M" # Increased from 15M for better quality
audio_bitrate = "320k" # Increased from 192k for better audio quality
crf = "15" # Adjusted from 16 - better balance between quality and file size
preset = "slower" # Changed from slower - better balance between speed and compression
def get_optimal_encoding_params(width, height, content_type="video"):
"""Get optimal encoding parameters based on resolution and content type."""
pixels = width * height
# Adjust settings based on resolution and content
if content_type == "image":
# Images need higher quality settings
if pixels >= 1920 * 1080: # 1080p+
return {"crf": "12", "bitrate": "35M", "preset": "slower"}
elif pixels >= 1280 * 720: # 720p+
return {"crf": "16", "bitrate": "30M", "preset": "slower"}
else:
return {"crf": "18", "bitrate": "25M", "preset": "slow"}
else:
# Regular video content
if pixels >= 1920 * 1080: # 1080p+
return {"crf": "18", "bitrate": "30M", "preset": "slower"}
elif pixels >= 1280 * 720: # 720p+
return {"crf": "20", "bitrate": "25M", "preset": "slower"}
else:
return {"crf": "22", "bitrate": "20M", "preset": "slow"}
def get_standard_ffmpeg_params(width, height, content_type="video"):
"""Get standardized FFmpeg parameters for consistent quality."""
params = get_optimal_encoding_params(width, height, content_type)
if content_type == "image" or (width * height >= 1920 * 1080):
# Use higher quality for images and high-res content
pix_fmt = "yuv444p"
else:
# Use more compatible format for standard video
pix_fmt = "yuv420p"
return [
"-crf", params["crf"],
"-preset", params["preset"],
"-profile:v", "high",
"-level", "4.1",
"-x264-params", "keyint=60:min-keyint=60:scenecut=0:ref=3:bframes=3:b-adapt=2:direct=auto:me=umh:subme=8:trellis=2:aq-mode=2",
"-pix_fmt", pix_fmt,
"-movflags", "+faststart",
"-tune", "film",
"-colorspace", "bt709",
"-color_primaries", "bt709",
"-color_trc", "bt709",
"-color_range", "tv",
"-bf", "5", # More B-frames for better compression
"-g", "60", # GOP size
"-qmin", "10", # Minimum quantizer
"-qmax", "51", # Maximum quantizer
"-qdiff", "4", # Max difference between quantizers
"-sc_threshold", "40", # Scene change threshold
"-flags", "+cgop+mv4" # Additional encoding flags
]
def ensure_even_dimensions(width, height):
"""Ensure dimensions are even numbers (required for h264)."""
width = width if width % 2 == 0 else width - 1
height = height if height % 2 == 0 else height - 1
return width, height
def close_clip(clip):
if clip is None:
return
try:
# close main resources
if hasattr(clip, 'reader') and clip.reader is not None:
clip.reader.close()
# close audio resources
if hasattr(clip, 'audio') and clip.audio is not None:
if hasattr(clip.audio, 'reader') and clip.audio.reader is not None:
clip.audio.reader.close()
del clip.audio
# close mask resources
if hasattr(clip, 'mask') and clip.mask is not None:
if hasattr(clip.mask, 'reader') and clip.mask.reader is not None:
clip.mask.reader.close()
del clip.mask
# handle child clips in composite clips
# handle child clips in composite clips first
if hasattr(clip, 'clips') and clip.clips:
for child_clip in clip.clips:
if child_clip is not clip: # avoid possible circular references
close_clip(child_clip)
# close audio resources with better error handling
if hasattr(clip, 'audio') and clip.audio is not None:
if hasattr(clip.audio, 'reader') and clip.audio.reader is not None:
try:
# Check if the reader is still valid before closing
if hasattr(clip.audio.reader, 'proc') and clip.audio.reader.proc is not None:
if clip.audio.reader.proc.poll() is None:
clip.audio.reader.close()
else:
clip.audio.reader.close()
except (OSError, AttributeError):
# Handle invalid handles and missing attributes
pass
clip.audio = None
# close mask resources
if hasattr(clip, 'mask') and clip.mask is not None:
if hasattr(clip.mask, 'reader') and clip.mask.reader is not None:
try:
clip.mask.reader.close()
except (OSError, AttributeError):
pass
clip.mask = None
# close main resources
if hasattr(clip, 'reader') and clip.reader is not None:
try:
clip.reader.close()
except (OSError, AttributeError):
pass
# clear clip list
if hasattr(clip, 'clips'):
clip.clips = []
# call clip's own close method if it exists
if hasattr(clip, 'close'):
try:
clip.close()
except (OSError, AttributeError):
pass
except Exception as e:
logger.error(f"failed to close clip: {str(e)}")
del clip
try:
del clip
except:
pass
gc.collect()
def delete_files(files: List[str] | str):
@ -94,9 +187,10 @@ def delete_files(files: List[str] | str):
for file in files:
try:
os.remove(file)
except:
pass
if os.path.exists(file):
os.remove(file)
except Exception as e:
logger.debug(f"failed to delete file {file}: {str(e)}")
def get_bgm_file(bgm_type: str = "random", bgm_file: str = ""):
if not bgm_type:
@ -109,11 +203,11 @@ def get_bgm_file(bgm_type: str = "random", bgm_file: str = ""):
suffix = "*.mp3"
song_dir = utils.song_dir()
files = glob.glob(os.path.join(song_dir, suffix))
return random.choice(files)
if files:
return random.choice(files)
return ""
def combine_videos(
combined_video_path: str,
video_paths: List[str],
@ -122,23 +216,25 @@ def combine_videos(
video_concat_mode: VideoConcatMode = VideoConcatMode.random,
video_transition_mode: VideoTransitionMode = None,
max_clip_duration: int = 5,
threads: int = 2,
#threads: int = 2,
threads = min(multiprocessing.cpu_count(), 6),
) -> str:
audio_clip = AudioFileClip(audio_file)
audio_duration = audio_clip.duration
logger.info(f"audio duration: {audio_duration} seconds")
# Required duration of each clip
req_dur = audio_duration / len(video_paths)
req_dur = max_clip_duration
logger.info(f"maximum clip duration: {req_dur} seconds")
req_dur = min(audio_duration / len(video_paths), max_clip_duration)
logger.info(f"calculated clip duration: {req_dur} seconds")
output_dir = os.path.dirname(combined_video_path)
aspect = VideoAspect(video_aspect)
video_width, video_height = aspect.to_resolution()
video_width, video_height = ensure_even_dimensions(video_width, video_height)
processed_clips = []
subclipped_items = []
video_duration = 0
for video_path in video_paths:
clip = VideoFileClip(video_path)
clip_duration = clip.duration
@ -150,7 +246,7 @@ def combine_videos(
while start_time < clip_duration:
end_time = min(start_time + max_clip_duration, clip_duration)
if clip_duration - start_time >= max_clip_duration:
subclipped_items.append(SubClippedVideoClip(file_path= video_path, start_time=start_time, end_time=end_time, width=clip_w, height=clip_h))
subclipped_items.append(SubClippedVideoClip(file_path=video_path, start_time=start_time, end_time=end_time, width=clip_w, height=clip_h))
start_time = end_time
if video_concat_mode.value == VideoConcatMode.sequential.value:
break
@ -173,14 +269,16 @@ def combine_videos(
clip_duration = clip.duration
# Not all videos are same size, so we need to resize them
clip_w, clip_h = clip.size
if clip_w != video_width or clip_h != video_height:
clip_ratio = clip.w / clip.h
video_ratio = video_width / video_height
logger.debug(f"resizing clip, source: {clip_w}x{clip_h}, ratio: {clip_ratio:.2f}, target: {video_width}x{video_height}, ratio: {video_ratio:.2f}")
if clip_ratio == video_ratio:
if abs(clip_ratio - video_ratio) < 0.01: # Almost same ratio
clip = clip.resized(new_size=(video_width, video_height))
else:
# Use better scaling algorithm for quality
if clip_ratio > video_ratio:
scale_factor = video_width / clip_w
else:
@ -188,13 +286,16 @@ def combine_videos(
new_width = int(clip_w * scale_factor)
new_height = int(clip_h * scale_factor)
# Ensure dimensions are even numbers
new_width, new_height = ensure_even_dimensions(new_width, new_height)
background = ColorClip(size=(video_width, video_height), color=(0, 0, 0)).with_duration(clip_duration)
clip_resized = clip.resized(new_size=(new_width, new_height)).with_position("center")
clip = CompositeVideoClip([background, clip_resized])
shuffle_side = random.choice(["left", "right", "top", "bottom"])
if video_transition_mode.value == VideoTransitionMode.none.value:
if video_transition_mode is None or video_transition_mode.value == VideoTransitionMode.none.value:
clip = clip
elif video_transition_mode.value == VideoTransitionMode.fade_in.value:
clip = video_effects.fadein_transition(clip, 1)
@ -217,14 +318,24 @@ def combine_videos(
if clip.duration > max_clip_duration:
clip = clip.subclipped(0, max_clip_duration)
# wirte clip to temp file
# Write clip to temp file with improved quality settings
clip_file = f"{output_dir}/temp-clip-{i+1}.mp4"
clip.write_videofile(clip_file, logger=None, fps=fps, codec=video_codec)
encoding_params = get_optimal_encoding_params(video_width, video_height, "video")
clip.write_videofile(clip_file,
logger=None,
fps=fps,
codec=video_codec,
# Remove bitrate parameter as it conflicts with CRF in ffmpeg_params
ffmpeg_params=get_standard_ffmpeg_params(video_width, video_height, "video")
)
# Store clip duration before closing
clip_duration_value = clip.duration
close_clip(clip)
processed_clips.append(SubClippedVideoClip(file_path=clip_file, duration=clip.duration, width=clip_w, height=clip_h))
video_duration += clip.duration
processed_clips.append(SubClippedVideoClip(file_path=clip_file, duration=clip_duration_value, width=clip_w, height=clip_h))
video_duration += clip_duration_value
except Exception as e:
logger.error(f"failed to process clip: {str(e)}")
@ -250,62 +361,62 @@ def combine_videos(
if len(processed_clips) == 1:
logger.info("using single clip directly")
shutil.copy(processed_clips[0].file_path, combined_video_path)
delete_files(processed_clips)
delete_files([clip.file_path for clip in processed_clips])
logger.info("video combining completed")
return combined_video_path
# create initial video file as base
base_clip_path = processed_clips[0].file_path
temp_merged_video = f"{output_dir}/temp-merged-video.mp4"
temp_merged_next = f"{output_dir}/temp-merged-next.mp4"
# copy first clip as initial merged video
shutil.copy(base_clip_path, temp_merged_video)
# merge remaining video clips one by one
for i, clip in enumerate(processed_clips[1:], 1):
logger.info(f"merging clip {i}/{len(processed_clips)-1}, duration: {clip.duration:.2f}s")
try:
# Load all processed clips
video_clips = []
for clip_info in processed_clips:
try:
clip = VideoFileClip(clip_info.file_path)
if clip.duration > 0 and hasattr(clip, 'size') and None not in clip.size:
video_clips.append(clip)
else:
logger.warning(f"Skipping invalid clip: {clip_info.file_path}")
close_clip(clip)
except Exception as e:
logger.error(f"Failed to load clip {clip_info.file_path}: {str(e)}")
if not video_clips:
logger.error("No valid clips could be loaded for final concatenation")
return ""
# Concatenate all clips at once with compose method for better quality
logger.info(f"Concatenating {len(video_clips)} clips in a single operation")
final_clip = concatenate_videoclips(video_clips, method="compose")
try:
# load current base video and next clip to merge
base_clip = VideoFileClip(temp_merged_video)
next_clip = VideoFileClip(clip.file_path)
# Write the final result directly
encoding_params = get_optimal_encoding_params(video_width, video_height, "video")
logger.info(f"Writing final video with quality settings: CRF {encoding_params['crf']}, preset {encoding_params['preset']}")
final_clip.write_videofile(
combined_video_path,
threads=threads,
logger=None,
temp_audiofile_path=os.path.dirname(combined_video_path),
audio_codec=audio_codec,
fps=fps,
ffmpeg_params=get_standard_ffmpeg_params(video_width, video_height, "video")
)
# Close all clips
close_clip(final_clip)
for clip in video_clips:
close_clip(clip)
# merge these two clips
merged_clip = concatenate_videoclips([base_clip, next_clip])
# save merged result to temp file
merged_clip.write_videofile(
filename=temp_merged_next,
threads=threads,
logger=None,
temp_audiofile_path=output_dir,
audio_codec=audio_codec,
fps=fps,
)
close_clip(base_clip)
close_clip(next_clip)
close_clip(merged_clip)
# replace base file with new merged file
delete_files(temp_merged_video)
os.rename(temp_merged_next, temp_merged_video)
except Exception as e:
logger.error(f"failed to merge clip: {str(e)}")
continue
# after merging, rename final result to target file name
os.rename(temp_merged_video, combined_video_path)
# clean temp files
clip_files = [clip.file_path for clip in processed_clips]
delete_files(clip_files)
logger.info("video combining completed")
logger.info("Video combining completed successfully")
except Exception as e:
logger.error(f"Error during final video concatenation: {str(e)}")
finally:
# Clean up temp files
clip_files = [clip.file_path for clip in processed_clips]
delete_files(clip_files)
return combined_video_path
def wrap_text(text, max_width, font="Arial", fontsize=60):
# Create ImageFont
font = ImageFont.truetype(font, fontsize)
@ -359,7 +470,6 @@ def wrap_text(text, max_width, font="Arial", fontsize=60):
height = len(_wrapped_lines_) * height
return result, height
def generate_video(
video_path: str,
audio_path: str,
@ -369,6 +479,7 @@ def generate_video(
):
aspect = VideoAspect(params.video_aspect)
video_width, video_height = aspect.to_resolution()
video_width, video_height = ensure_even_dimensions(video_width, video_height)
logger.info(f"generating video: {video_width} x {video_height}")
logger.info(f" ① video: {video_path}")
@ -410,8 +521,8 @@ def generate_video(
bg_color=params.text_background_color,
stroke_color=params.stroke_color,
stroke_width=params.stroke_width,
# interline=interline,
# size=size,
interline=interline,
size=size,
)
duration = subtitle_item[0][1] - subtitle_item[0][0]
_clip = _clip.with_start(subtitle_item[0][0])
@ -472,60 +583,227 @@ def generate_video(
logger.error(f"failed to add bgm: {str(e)}")
video_clip = video_clip.with_audio(audio_clip)
video_clip.write_videofile(
output_file,
audio_codec=audio_codec,
temp_audiofile_path=output_dir,
threads=params.n_threads or 2,
logger=None,
fps=fps,
)
video_clip.close()
del video_clip
# Use improved encoding settings
try:
# Get optimized encoding parameters
encoding_params = get_optimal_encoding_params(video_width, video_height, "video")
ffmpeg_params = get_standard_ffmpeg_params(video_width, video_height, "video")
# For Windows, use a simpler approach to avoid path issues with two-pass encoding
if os.name == 'nt':
# Single pass with high quality settings
video_clip.write_videofile(
output_file,
codec=video_codec,
audio_codec=audio_codec,
temp_audiofile_path=output_dir,
threads=params.n_threads or 2,
logger=None,
fps=fps,
ffmpeg_params=ffmpeg_params
)
else:
# On Unix systems, we can use two-pass encoding more reliably
# Prepare a unique passlogfile name to avoid conflicts
passlog_id = str(uuid.uuid4())[:8]
passlogfile = os.path.join(output_dir, f"ffmpeg2pass_{passlog_id}")
# Create a temporary file for first pass output
temp_first_pass = os.path.join(output_dir, f"temp_first_pass_{passlog_id}.mp4")
# Flag to track if we should do second pass
do_second_pass = True
# First pass parameters with explicit passlogfile
first_pass_params = ffmpeg_params + [
"-pass", "1",
"-passlogfile", passlogfile,
"-an" # No audio in first pass
]
logger.info("Starting first pass encoding...")
try:
video_clip.write_videofile(
temp_first_pass, # Write to temporary file instead of null
codec=video_codec,
audio=False, # Skip audio processing in first pass
threads=params.n_threads or 2,
logger=None,
fps=fps,
ffmpeg_params=first_pass_params
)
except Exception as e:
# If first pass fails, fallback to single-pass encoding
logger.warning(f"First pass encoding failed: {e}. Falling back to single-pass encoding.")
video_clip.write_videofile(
output_file,
codec=video_codec,
audio_codec=audio_codec,
temp_audiofile_path=output_dir,
threads=params.n_threads or 2,
logger=None,
fps=fps,
ffmpeg_params=ffmpeg_params
)
do_second_pass = False
finally:
# Clean up first pass temporary file
if os.path.exists(temp_first_pass):
try:
os.remove(temp_first_pass)
except Exception as e:
logger.warning(f"Failed to delete temporary first pass file: {e}")
# Second pass only if first pass succeeded
if do_second_pass:
logger.info("Starting second pass encoding...")
second_pass_params = ffmpeg_params + [
"-pass", "2",
"-passlogfile", passlogfile
]
video_clip.write_videofile(
output_file,
codec=video_codec,
audio_codec=audio_codec,
temp_audiofile_path=output_dir,
threads=params.n_threads or 2,
logger=None,
fps=fps,
ffmpeg_params=second_pass_params
)
# Clean up pass log files
for f in glob.glob(f"{passlogfile}*"):
try:
os.remove(f)
except Exception as e:
logger.warning(f"Failed to delete pass log file {f}: {e}")
finally:
# Ensure all resources are properly closed
close_clip(video_clip)
close_clip(audio_clip)
if 'bgm_clip' in locals():
close_clip(bgm_clip)
# Force garbage collection
gc.collect()
def preprocess_video(materials: List[MaterialInfo], clip_duration=4):
def preprocess_video(materials: List[MaterialInfo], clip_duration=4, apply_denoising=False):
for material in materials:
if not material.url:
continue
ext = utils.parse_extension(material.url)
# First load the clip
try:
clip = VideoFileClip(material.url)
except Exception:
clip = ImageClip(material.url)
# Then apply denoising if needed and it's a video
if ext not in const.FILE_TYPE_IMAGES and apply_denoising:
# Apply subtle denoising to video clips that might benefit
from moviepy.video.fx.all import denoise
try:
# Get a sample frame to analyze noise level
frame = clip.get_frame(0)
import numpy as np
noise_estimate = np.std(frame)
# Apply denoising only if noise level seems high
if noise_estimate > 15: # Threshold determined empirically
logger.info(f"Applying denoising to video with estimated noise: {noise_estimate:.2f}")
clip = denoise(clip, sigma=1.5, mode="fast")
except Exception as e:
logger.warning(f"Denoising attempt failed: {e}")
width = clip.size[0]
height = clip.size[1]
if width < 480 or height < 480:
logger.warning(f"low resolution material: {width}x{height}, minimum 480x480 required")
continue
# Improved resolution check
min_resolution = 480
# Calculate aspect ratio outside of conditional blocks so it's always defined
aspect_ratio = width / height
if width < min_resolution or height < min_resolution:
logger.warning(f"Low resolution material: {width}x{height}, minimum {min_resolution}x{min_resolution} recommended")
# Instead of skipping, apply upscaling for very low-res content
if width < min_resolution/2 or height < min_resolution/2:
logger.warning("Resolution too low, skipping")
close_clip(clip)
continue
else:
# Apply high-quality upscaling for borderline content
logger.info(f"Applying high-quality upscaling to low-resolution content: {width}x{height}")
# Calculate target dimensions while maintaining aspect ratio
if width < height:
new_width = min_resolution
new_height = int(new_width / aspect_ratio)
else:
new_height = min_resolution
new_width = int(new_height * aspect_ratio)
# Ensure dimensions are even
new_width, new_height = ensure_even_dimensions(new_width, new_height)
# Use high-quality scaling
clip = clip.resized(new_size=(new_width, new_height), resizer='lanczos')
if ext in const.FILE_TYPE_IMAGES:
logger.info(f"processing image: {material.url}")
# Create an image clip and set its duration to 3 seconds
# Ensure dimensions are even numbers and enhance for better quality
width, height = ensure_even_dimensions(width, height)
# Use higher resolution multiplier for sharper output
quality_multiplier = 1.2 if width < 1080 else 1.0
enhanced_width = int(width * quality_multiplier)
enhanced_height = int(height * quality_multiplier)
enhanced_width, enhanced_height = ensure_even_dimensions(enhanced_width, enhanced_height)
# Close the original clip before creating a new one to avoid file handle conflicts
close_clip(clip)
# Create a new ImageClip with the image
clip = (
ImageClip(material.url)
.resized(new_size=(enhanced_width, enhanced_height), resizer='bicubic') # Use bicubic for better quality
.with_duration(clip_duration)
.with_position("center")
)
# Apply a zoom effect using the resize method.
# A lambda function is used to make the zoom effect dynamic over time.
# The zoom effect starts from the original size and gradually scales up to 120%.
# t represents the current time, and clip.duration is the total duration of the clip (3 seconds).
# Note: 1 represents 100% size, so 1.2 represents 120% size.
# More subtle and smoother zoom effect
zoom_clip = clip.resized(
lambda t: 1 + (clip_duration * 0.03) * (t / clip.duration)
lambda t: 1 + (0.05 * (t / clip.duration)), # Reduced zoom from 0.1 to 0.05 for smoother effect
resizer='lanczos' # Ensure high-quality scaling
)
# Optionally, create a composite video clip containing the zoomed clip.
# This is useful when you want to add other elements to the video.
# Create composite with enhanced quality
final_clip = CompositeVideoClip([zoom_clip])
# Output the video to a file.
# Output with maximum quality settings
video_file = f"{material.url}.mp4"
final_clip.write_videofile(video_file, fps=30, logger=None)
encoding_params = get_optimal_encoding_params(enhanced_width, enhanced_height, "image")
final_clip.write_videofile(video_file,
fps=fps,
logger='bar',
codec=video_codec,
# Remove bitrate parameter as it conflicts with CRF in ffmpeg_params
ffmpeg_params=get_standard_ffmpeg_params(enhanced_width, enhanced_height, "image"),
write_logfile=False,
verbose=False
)
# Close all clips to properly release resources
close_clip(final_clip)
close_clip(zoom_clip)
close_clip(clip)
material.url = video_file
logger.success(f"image processed: {video_file}")
logger.success(f"high-quality image processed: {video_file}")
else:
close_clip(clip)
return materials

View File

@ -1469,7 +1469,7 @@ def create_subtitle(sub_maker: submaker.SubMaker, text: str, subtitle_file: str)
with open(subtitle_file, "w", encoding="utf-8") as file:
file.write("\n".join(sub_items) + "\n")
try:
sbs = subtitles.file_to_subtitles(subtitle_file, encoding="utf-8")
sbs = subtitles.file_to_subtitles(subtitle_file)
duration = max([tb for ((ta, tb), txt) in sbs])
logger.info(
f"completed, subtitle file created: {subtitle_file}, duration: {duration}"

View File

@ -1,10 +1,11 @@
moviepy==2.1.2
Pillow
streamlit==1.45.0
edge_tts==6.1.19
fastapi==0.115.6
uvicorn==0.32.1
openai==1.56.1
faster-whisper==1.1.0
faster-whisper
loguru==0.7.3
google.generativeai==0.8.3
dashscope==1.20.14
@ -14,3 +15,5 @@ redis==5.2.0
python-multipart==0.0.19
pyyaml
requests>=2.31.0
numpy
shutil