diff --git a/README.md b/README.md
index 9cd88c1..8e269b9 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,65 @@
-## AvHub - Adult Video Content Resource Toolkit
-**AvHub** is an open-source tool dedicated to the retrieval and management of adult video resources, offering three core features:
+# AvHub - Adult Video Resource Management Platform
+
+**AvHub** is a web platform dedicated to the retrieval and management of adult video resources.
+
+Cloudflare Page: https://avhub.pages.dev/
+
+Vercel Page: https://avhub.vercel.app/
+
+****
+
+[![GitHub license](https://img.shields.io/github/license/levywang/avhub?label=License&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
+[![Release Version](https://img.shields.io/github/release/levywang/avhub?include_prereleases&label=Release&logo=github)](https://github.com/levywang/avhub/releases/latest "Click to view the repo on Github")
+[![GitHub Star](https://img.shields.io/github/stars/levywang/avhub?label=Stars&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
+[![GitHub Fork](https://img.shields.io/github/forks/levywang/avhub?label=Forks&logo=github)](https://github.com/levywang/avhub/forks?include=active%2Carchived%2Cinactive%2Cnetwork&page=1&period=2y&sort_by=stargazer_counts "Click to view the repo on Github")
+[![Repo Size](https://img.shields.io/github/repo-size/levywang/avhub?label=Size&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
+[![GitHub Issue](https://img.shields.io/github/issues-closed-raw/levywang/avhub?label=Closed%20Issue&logo=github)](https://github.com/levywang/avhub/issues?q=is%3Aissue+is%3Aclosed "Click to view the repo on Github")
+
+[![Docker Stars](https://img.shields.io/docker/stars/levywang/avhub?label=Stars&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
+[![Docker Pulls](https://img.shields.io/docker/pulls/levywang/avhub?label=Pulls&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
+
+## Star History
+
+[![Star History Chart](https://api.star-history.com/svg?repos=levywang/avhub&type=Date)](https://star-history.com/#levywang/avhub&Date)
 
 ---
 
-### **Key Features**
-● 🔗 **Magnet Link Search**: Accurately finds magnet links corresponding to video codes.
-● 📅 **Timely Hentai Resource Updates**: Automatically updates and archives monthly hentai resources.
-● 📊 **Random Video Recommendation**: Random playback functionality based on crawled data.
-● 🌐 **Multi-language Support**: Supports multiple language interfaces to meet global user needs.
-● 🎨 **Multiple Theme Options**: Offers various theme color schemes to enhance user experience.
+### **Core Features**
+● 🔗 **Magnet Link Search by Video Code**
+  Accurately find magnet links and cover images corresponding to video codes.
+● 📅 **Timely Hentai Resource Updates**
+  Automatically update and archive monthly hentai resources.
+● 📊 **Random Video Recommendation**
+  Random playback functionality based on crawled data.
+● 🌐 **Multi-language Support**
+  Supports multiple language interfaces to meet global user needs.
+● 🎨 **Multiple Theme Options**
+  Offers various theme color schemes to enhance user experience.
+
+---
+
+## Getting Started
+
+### Run Locally
+```bash
+git clone https://github.com/levywang/avhub.git
+cd avhub
+pip install -r requirements.txt
+python main.py
+```
+The default API address: `http://127.0.0.1:8000/`
+
+You can configure a reverse proxy and domain, replacing `BASE_URL` in line 52 of `web/script.js`.
+
+The backend configuration file is located in `data/config.yaml`. Modify it according to your actual needs.
+
+### Docker Deployment
+**Note: Python Version >= 3.7**
+```bash
+git clone https://github.com/levywang/avhub.git
+cd avhub
+docker run -d -p :80 -v $PWD:/app --name avhub levywang/avhub:latest
+```
 
 ---
 
@@ -19,7 +70,16 @@
 • **Backend**:
   • Developed using **FastAPI**, a Python framework, to provide efficient and stable API services.
 • **Privacy Protection**:
-  Strictly adheres to privacy principles and does not directly host any resource files. All data is retrieved through third-party links.
+  • Strictly adheres to privacy principles and does not directly host any resource files. All data is retrieved through third-party links.
+
+---
+
+### **Data Sources**
+• **Magnet Links and Cover Images**: Sourced from **missav**.
+• **Hentai Resources**: Sourced from **hacg liuli**.
+• **Random Video Recommendations**: Sourced from crawled data stored in the local file `/data/video_urls.txt`.
+
+The above data sources are configured in `data/config.yaml`. If the data sources change or become inaccessible, modifications and maintenance are required.
 
 ---
 
@@ -28,7 +88,5 @@ Users must comply with the laws and regulations of their respective regions. AvH
 
 ---
 
-**AvHub** is built with modern web technologies, aiming to provide users with an efficient and convenient adult video resource management experience.
-
-### **License**
-This project is provided under a **Apache License 2.0** license that can be found in the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.
\ No newline at end of file
+### **License**
+This project is provided under an **Apache License 2.0** license that can be found in the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.
\ No newline at end of file
diff --git a/README_CN.md b/README_CN.md
index d69363c..57bced8 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -1,26 +1,88 @@
-## AvHub - 成人影视资源管理工具
+# AvHub - 成人影视资源管理平台
 
-**AvHub** 是一款专注成人影视资源检索与管理的开源工具,主要提供三大核心功能:
+**AvHub** 是一款专注成人影视资源检索与管理的Web平台
+
+Cloudflare Page: https://avhub.pages.dev/
+
+Vercel Page: https://avhub.vercel.app/
+
+****
+
+[![GitHub license](https://img.shields.io/github/license/levywang/avhub?label=License&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
+[![Release Version](https://img.shields.io/github/release/levywang/avhub?include_prereleases&label=Release&logo=github)](https://github.com/levywang/avhub/releases/latest "Click to view the repo on Github")
+[![GitHub Star](https://img.shields.io/github/stars/levywang/avhub?label=Stars&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
+[![GitHub Fork](https://img.shields.io/github/forks/levywang/avhub?label=Forks&logo=github)](https://github.com/levywang/avhub/forks?include=active%2Carchived%2Cinactive%2Cnetwork&page=1&period=2y&sort_by=stargazer_counts "Click to view the repo on Github")
+[![Repo Size](https://img.shields.io/github/repo-size/levywang/avhub?label=Size&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
+[![GitHub Issue](https://img.shields.io/github/issues-closed-raw/levywang/avhub?label=Closed%20Issue&logo=github)](https://github.com/levywang/avhub/issues?q=is%3Aissue+is%3Aclosed "Click to view the repo on Github")
+
+[![Docker Stars](https://img.shields.io/docker/stars/levywang/avhub?label=Stars&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
+[![Docker Pulls](https://img.shields.io/docker/pulls/levywang/avhub?label=Pulls&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
+
+
+## Star History
+
+[![Star History Chart](https://api.star-history.com/svg?repos=levywang/avhub&type=Date)](https://star-history.com/#levywang/avhub&Date)
 
 ---
 
 ### **核心特性**
-● 🔗 **番号磁力链搜索**:精准查找番号对应的磁力链接。
-● 📅 **里番资源定时内容更新追踪**:自动更新并归档月度里番资源。
-● 📊 **随机视频推荐**:基于爬虫数据的随机播放功能。
-● 🌐 **多语言支持**:支持多种语言界面,满足全球用户需求。
-● 🎨 **多种主题配色切换**:提供多种主题配色,提升用户体验。
+● 🔗 **番号磁力链搜索**
+  精准查找番号对应的磁力链接和封面图
+● 📅 **里番资源定时内容更新追踪**
+  自动更新并归档月度里番资源
+● 📊 **随机视频推荐**
+  基于爬虫数据的随机播放功能
+● 🌐 **多语言支持**
+  支持多种语言界面,满足全球用户需求
+● 🎨 **多种主题配色切换**
+  提供多种主题配色,提升用户体验
+
+---
+
+## Getting Started
+
+### 本地运行
+```bash
+git clone https://github.com/levywang/avhub.git
+cd avhub
+pip install -r requirements.txt
+python main.py
+```
+默认运行的API地址:`http://127.0.0.1:8000/`
+
+可以配置反代和域名,替换 `web/script.js` 52行中的 `BASE_URL`
+
+后端运行的配置文件在 `data/config.yaml` 中,请根据实际情况修改
+
+
+### Docker 部署
+**注意:Python Version >= 3.7**
+```bash
+git clone https://github.com/levywang/avhub.git
+cd avhub
+docker run -d -p :80 -v $PWD:/app --name avhub levywang/avhub:latest
+```
 
 ---
 
 ### **技术栈**
-• **前端**:
-  • 使用 **Tailwind CSS** 构建现代化、响应式界面。
-  • 集成 **hls.js** 实现流畅的视频播放体验。
-• **后端**:
-  • 基于 **Python** 的 **FastAPI** 框架开发,提供高效、稳定的 API 服务。
-• **隐私保护**:
-  严格遵循隐私保护原则,不直接托管任何资源文件,所有数据均通过第三方链接获取。
+- **前端**:
+  - 使用 **Tailwind CSS** 构建现代化、响应式界面。
+  - 集成 **hls.js** 实现流畅的视频播放体验。
+- **后端**:
+  - 基于 **Python** 的 **FastAPI** 框架开发,提供高效、稳定的 API 服务。
+- **隐私保护**:
+  - 严格遵循隐私保护原则,不直接托管任何资源文件,所有数据均通过第三方链接获取。
+
+---
+
+### **数据源**
+- **番号磁力链和封面图**:来源于 **missav**
+- **里番资源**:来源于 **hacg 琉璃神社**
+- **随机视频推荐**:来源于到的爬虫数据,存储在本地文件 `/data/video_urls.txt`
+
+以上数据源均配置在 `data/config.yaml` 中,如果数据源变更或者无法访问,需要进行修改和维护
+
 
 ---
 
@@ -29,7 +91,5 @@
 
 ---
 
-**AvHub** 基于现代 Web 技术开发,致力于为用户提供高效、便捷的成人影视资源管理体验。
-
 ### **License**
 This project is provided under a **Apache License 2.0** license that can be found in the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.
diff --git a/data/config.yaml b/data/config.yaml
index e1ca6e2..0f8639a 100644
--- a/data/config.yaml
+++ b/data/config.yaml
@@ -11,10 +11,10 @@ files:
 
 av_spider:
   source_url: "https://missav.ai/cn/search/"
-  proxy_url: "http://192.168.50.2:7890" # http or socks5 proxy
+  proxy_url: "http://192.168.50.3:7890" # http or socks5 proxy
 
 hacg_spider:
-  hacg_url: ""
+  source_url: "https://www.hacg.mov/wp/"
 
 logging:
   log_file: "main.log"
diff --git a/main.py b/main.py
index 34f3c78..3846036 100644
--- a/main.py
+++ b/main.py
@@ -1,11 +1,7 @@
-import sys
-import argparse
+# -*- encoding: utf-8 -*-
 import os
-import re
-import subprocess
 import requests
 import json
-import ast
 from bs4 import BeautifulSoup
 from typing import Union
 from fastapi import FastAPI
@@ -16,6 +12,8 @@ import random
 from utils.spider import *
 import hydra
 from utils.logger import setup_logger
+import schedule
+import time
 
 @hydra.main(config_path='data/', config_name='config', version_base=None)
 def main(cfg: DictConfig):
@@ -60,7 +58,7 @@ def main(cfg: DictConfig):
                 return None
             return random.choice(webp_links or links)
         except Exception as e:
-            logger.error(f"获取图片URL失败: {str(e)}")
+            logger.error(f"Failed to obtain the image URL: {str(e)}")
             return None
 
     def read_random_line(file_path: str) -> tuple[str, str]:
@@ -126,12 +124,29 @@ def main(cfg: DictConfig):
         except Exception as e:
             logger.error(f"Failed to fetch random video URL: {str(e)}")
             raise HTTPException(status_code=500, detail=str(e))
+
+    def run_hacg_spider():
+        hacg_spider = HacgSpider(url=cfg.hacg_spider.source_url, filepath=cfg.files.hacg_json_path, cfg=cfg)
+        hacg_spider.update_json_file()
+        logger.info("HacgSpider task completed.")
+
+    # Schedule the HacgSpider task to run daily at 1 AM
+    schedule.every().day.at("01:00").do(run_hacg_spider)
+
+    # Function to keep running the scheduler in the background
+    def run_scheduler():
+        while True:
+            schedule.run_pending()
+            time.sleep(60) # Check every minute
+
+    import threading
+    # Start the scheduler in a separate thread
+    scheduler_thread = threading.Thread(target=run_scheduler)
+    scheduler_thread.daemon = True
+    scheduler_thread.start()
 
     import uvicorn
     uvicorn.run(app, host="0.0.0.0", port=8000)
 
 if __name__ == "__main__":
-    main()
-
-
-
+    main()
\ No newline at end of file
diff --git a/utils/logger.py b/utils/logger.py
index 00abb46..82ed944 100644
--- a/utils/logger.py
+++ b/utils/logger.py
@@ -7,7 +7,7 @@ def setup_logger(cfg: DictConfig):
     logger.setLevel(getattr(logging, cfg.logging.level.upper()))
 
     # 创建文件处理器和流处理器
-    file_handler = logging.FileHandler(cfg.logging.log_file)
+    file_handler = logging.FileHandler(cfg.logging.log_file, encoding='utf-8')
     stream_handler = logging.StreamHandler()
 
     # 设置日志格式
diff --git a/utils/spider.py b/utils/spider.py
index 3547a3f..9d0f65e 100644
--- a/utils/spider.py
+++ b/utils/spider.py
@@ -1,3 +1,4 @@
+# -*- encoding: utf-8 -*-
 import re
 import json
 import os
@@ -40,7 +41,7 @@ class AVSpider:
             response = requests.get(url, proxies=self.proxies, headers=self.headers)
             response.raise_for_status()
         except requests.RequestException as e:
-            self.logger.error(f"请求失败: {e}")
+            self.logger.error(f"Request Error: {e}")
             return []
 
         html_content = response.text
@@ -70,7 +71,7 @@ class AVSpider:
             response = requests.get(link, proxies=self.proxies, headers=self.headers)
             response.raise_for_status()
         except requests.RequestException as e:
-            self.logger.error(f"请求失败: {e}")
+            self.logger.error(f"Request Error: {e}")
             return []
 
         html_content = response.text
@@ -115,7 +116,7 @@ class HacgSpider:
             response = requests.get(self.url)
             response.raise_for_status()
         except requests.RequestException as e:
-            self.logger.error(f"请求失败: {e}")
+            self.logger.error(f"Request Error: {e}")
             return None
 
         html_content = response.text
@@ -133,12 +134,12 @@ class HacgSpider:
         return pages
 
     def get_links(self, page):
-        url = f'{self.url}?page={page}&s=%E5%90%88%E9%9B%86&submit=%E6%90%9C%E7%B4%A2'
+        url = f'{self.url}page/{page}?s=%E5%90%88%E9%9B%86&submit=%E6%90%9C%E7%B4%A2'
         try:
             response = requests.get(url)
             response.raise_for_status()
         except requests.RequestException as e:
-            self.logger.error(f"请求失败: {e}")
+            self.logger.error(f"Request Error: {e}")
             return {}
 
         html_content = response.text
@@ -157,7 +158,7 @@ class HacgSpider:
                 response = requests.get(link)
                 response.raise_for_status()
             except requests.RequestException as e:
-                self.logger.error(f"请求失败: {e}")
+                self.logger.error(f"Request Error: {e}")
                 continue
 
             content = response.text
@@ -165,7 +166,7 @@ class HacgSpider:
             if matches:
                 magnet_links[title] = f'magnet:?xt=urn:btih:{matches[0]}'
 
-        self.logger.info(f"Magnet links extracted from page {page}: {magnet_links}")
+        self.logger.info(f"Magnet links extracted from page {page}")
 
         return magnet_links
 
@@ -174,7 +175,7 @@ class HacgSpider:
         results = {}
         total_pages = self.get_pages()
         if total_pages is None:
-            self.logger.error("无法获取总页数")
+            self.logger.error("Unable to get total")
             return
 
         for i in range(1, total_pages + 1):
@@ -187,7 +188,7 @@ class HacgSpider:
 
         total_pages = self.get_pages()
         if total_pages is None:
-            self.logger.error("无法获取总页数")
+            self.logger.error("Unable to get total")
             return
 
         for i in range(1, total_pages + 1):
@@ -204,10 +205,10 @@ class HacgSpider:
                 self.logger.info(f'Page {i} processed (Incremental Update)')
 
             if all_exists:
-                self.logger.info(f"第 {i} 页数据已存在于 JSON 文件中,停止更新")
+                self.logger.info(f"Page {i} data already exists in the JSON file, stop updating")
                 break
 
         with open(self.filepath, 'w', encoding='utf-8') as file:
             json.dump(results, file, ensure_ascii=False, indent=4)
 
-        self.logger.info("JSON文件已更新")
\ No newline at end of file
+        self.logger.info("JSON file updated")
\ No newline at end of file
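
Review note on the `utils/spider.py` hunks: the diff moves `get_links` to a `page/<n>?s=...` URL and has the incremental update stop once a page's data already exists in the JSON file. The sketch below illustrates those two ideas in isolation so they can be checked without network access. It is not the project's code: `build_search_url`, `incremental_update`, and `fetch_page` are hypothetical names, and the stop rule (a non-empty page whose titles are all stored already) is an assumption, since the diff does not show how `all_exists` is computed.

```python
from typing import Callable, Dict

# Query string copied from the new URL in the diff.
SEARCH_QUERY = "s=%E5%90%88%E9%9B%86&submit=%E6%90%9C%E7%B4%A2"


def build_search_url(base_url: str, page: int) -> str:
    """Return the paginated search URL in the `page/<n>?s=...` form from the diff."""
    # Assumes base_url ends with "/", as the new source_url in config.yaml does.
    return f"{base_url}page/{page}?{SEARCH_QUERY}"


def incremental_update(
    existing: Dict[str, str],
    total_pages: int,
    fetch_page: Callable[[int], Dict[str, str]],
) -> Dict[str, str]:
    """Merge scraped links into a copy of `existing`, stopping at the first known page."""
    results = dict(existing)
    for page in range(1, total_pages + 1):
        links = fetch_page(page)
        # Assumed stop rule: a non-empty page whose titles are all stored already.
        if links and all(title in results for title in links):
            break
        results.update(links)
    return results


if __name__ == "__main__":
    # Stand-in for the real scraper: page number -> {title: magnet link}.
    fake_pages = {
        1: {"new title": "magnet:?xt=urn:btih:aaa"},
        2: {"old title": "magnet:?xt=urn:btih:bbb"},
        3: {"older title": "magnet:?xt=urn:btih:ccc"},
    }
    stored = {"old title": "magnet:?xt=urn:btih:bbb"}
    merged = incremental_update(stored, 3, lambda page: fake_pages[page])
    print(build_search_url("https://www.hacg.mov/wp/", 2))
    print(sorted(merged))
```

Because the URL is built by plain concatenation, a `source_url` without a trailing slash would produce a malformed path, so that may be worth validating when the config is loaded. Passing the page fetcher in as a callable keeps the stop condition testable with canned data.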