first commit

levywang 2025-03-11 15:05:12 +08:00
parent 80a271c6e6
commit 6d9ea5bad3
6 changed files with 187 additions and 53 deletions


@ -1,14 +1,65 @@
## AvHub - Adult Video Content Resource Toolkit
**AvHub** is an open-source tool dedicated to the retrieval and management of adult video resources, offering three core features:
# AvHub - Adult Video Resource Management Platform
**AvHub** is a web platform dedicated to the retrieval and management of adult video resources.
Cloudflare Page: https://avhub.pages.dev/
Vercel Page: https://avhub.vercel.app/
****
[![GitHub license](https://img.shields.io/github/license/levywang/avhub?label=License&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
[![Release Version](https://img.shields.io/github/release/levywang/avhub?include_prereleases&label=Release&logo=github)](https://github.com/levywang/avhub/releases/latest "Click to view the repo on Github")
[![GitHub Star](https://img.shields.io/github/stars/levywang/avhub?label=Stars&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
[![GitHub Fork](https://img.shields.io/github/forks/levywang/avhub?label=Forks&logo=github)](https://github.com/levywang/avhub/forks?include=active%2Carchived%2Cinactive%2Cnetwork&page=1&period=2y&sort_by=stargazer_counts "Click to view the repo on Github")
[![Repo Size](https://img.shields.io/github/repo-size/levywang/avhub?label=Size&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
[![GitHub Issue](https://img.shields.io/github/issues-closed-raw/levywang/avhub?label=Closed%20Issue&logo=github)](https://github.com/levywang/avhub/issues?q=is%3Aissue+is%3Aclosed "Click to view the repo on Github")
[![Docker Stars](https://img.shields.io/docker/stars/levywang/avhub?label=Stars&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
[![Docker Pulls](https://img.shields.io/docker/pulls/levywang/avhub?label=Pulls&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=levywang/avhub&type=Date)](https://star-history.com/#levywang/avhub&Date)
---
### **Key Features**
● 🔗 **Magnet Link Search**: Accurately finds magnet links corresponding to video codes.
● 📅 **Timely Hentai Resource Updates**: Automatically updates and archives monthly hentai resources.
● 📊 **Random Video Recommendation**: Random playback functionality based on crawled data.
● 🌐 **Multi-language Support**: Supports multiple language interfaces to meet global user needs.
● 🎨 **Multiple Theme Options**: Offers various theme color schemes to enhance user experience.
### **Core Features**
● 🔗 **Magnet Link Search by Video Code**
 Accurately find magnet links and cover images corresponding to video codes.
● 📅 **Timely Hentai Resource Updates**
 Automatically update and archive monthly hentai resources.
● 📊 **Random Video Recommendation**
 Random playback functionality based on crawled data.
● 🌐 **Multi-language Support**
 Supports multiple language interfaces to meet global user needs.
● 🎨 **Multiple Theme Options**
 Offers various theme color schemes to enhance user experience.
---
## Getting Started
### Run Locally
```bash
git clone https://github.com/levywang/avhub.git
cd avhub
pip install -r requirements.txt
python main.py
```
The default API address: `http://127.0.0.1:8000/`
If you serve the API behind a reverse proxy and domain, replace `BASE_URL` on line 52 of `web/script.js` with that address.
The backend configuration file is located in `data/config.yaml`. Modify it according to your actual needs.
### Docker Deployment
**Note: Python Version >= 3.7**
```bash
git clone https://github.com/levywang/avhub.git
cd avhub
docker run -d -p <your_server_port>:80 -v $PWD:/app --name avhub levywang/avhub:latest
```
---
@ -19,7 +70,16 @@
**Backend**:
• Developed using **FastAPI**, a Python framework, to provide efficient and stable API services.
**Privacy Protection**:
Strictly adheres to privacy principles and does not directly host any resource files. All data is retrieved through third-party links.
• Strictly adheres to privacy principles and does not directly host any resource files. All data is retrieved through third-party links.
---
### **Data Sources**
**Magnet Links and Cover Images**: Sourced from **missav**.
**Hentai Resources**: Sourced from **hacg liuli**.
**Random Video Recommendations**: Sourced from crawled data stored in the local file `/data/video_urls.txt`.
The above data sources are configured in `data/config.yaml`. If a data source changes or becomes inaccessible, update the configuration accordingly.
---
@ -28,7 +88,5 @@ Users must comply with the laws and regulations of their respective regions. AvH
---
**AvHub** is built with modern web technologies, aiming to provide users with an efficient and convenient adult video resource management experience.
### **License**
This project is provided under a **Apache License 2.0** license that can be found in the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.
### **License**
This project is provided under an **Apache License 2.0** license that can be found in the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.


@ -1,26 +1,88 @@
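## AvHub - Adult Video Resource Management Tool
# AvHub - Adult Video Resource Management Platform
**AvHub** is an open-source tool focused on retrieving and managing adult video resources. It offers three core features:
**AvHub** is a web platform focused on retrieving and managing adult video resources.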
Cloudflare Page: https://avhub.pages.dev/
Vercel Page: https://avhub.vercel.app/
****
[![GitHub license](https://img.shields.io/github/license/levywang/avhub?label=License&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
[![Release Version](https://img.shields.io/github/release/levywang/avhub?include_prereleases&label=Release&logo=github)](https://github.com/levywang/avhub/releases/latest "Click to view the repo on Github")
[![GitHub Star](https://img.shields.io/github/stars/levywang/avhub?label=Stars&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
[![GitHub Fork](https://img.shields.io/github/forks/levywang/avhub?label=Forks&logo=github)](https://github.com/levywang/avhub/forks?include=active%2Carchived%2Cinactive%2Cnetwork&page=1&period=2y&sort_by=stargazer_counts "Click to view the repo on Github")
[![Repo Size](https://img.shields.io/github/repo-size/levywang/avhub?label=Size&logo=github)](https://github.com/levywang/avhub "Click to view the repo on Github")
[![GitHub Issue](https://img.shields.io/github/issues-closed-raw/levywang/avhub?label=Closed%20Issue&logo=github)](https://github.com/levywang/avhub/issues?q=is%3Aissue+is%3Aclosed "Click to view the repo on Github")
[![Docker Stars](https://img.shields.io/docker/stars/levywang/avhub?label=Stars&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
[![Docker Pulls](https://img.shields.io/docker/pulls/levywang/avhub?label=Pulls&logo=docker)](https://hub.docker.com/r/levywang/avhub "Click to view the image on Docker Hub")
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=levywang/avhub&type=Date)](https://star-history.com/#levywang/avhub&Date)
---
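### **Core Features**
● 🔗 **Magnet Link Search by Video Code**: Accurately finds the magnet links corresponding to a video code.
● 📅 **Scheduled Hentai Resource Update Tracking**: Automatically updates and archives monthly hentai resources.
● 📊 **Random Video Recommendation**: Random playback based on crawled data.
● 🌐 **Multi-language Support**: Supports multiple interface languages to meet the needs of users worldwide.
● 🎨 **Theme Color Switching**: Offers multiple theme color schemes to improve the user experience.
● 🔗 **Magnet Link Search by Video Code**
&emsp;Accurately finds the magnet links and cover images corresponding to a video code
● 📅 **Scheduled Hentai Resource Update Tracking**
&emsp;Automatically updates and archives monthly hentai resources
● 📊 **Random Video Recommendation**
&emsp;Random playback based on crawled data
● 🌐 **Multi-language Support**
&emsp;Supports multiple interface languages to meet the needs of users worldwide
● 🎨 **Theme Color Switching**
&emsp;Offers multiple theme color schemes to improve the user experience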
---
## Getting Started
### Run Locally
```bash
git clone https://github.com/levywang/avhub.git
cd avhub
pip install -r requirements.txt
python main.py
```
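Default API address: `http://127.0.0.1:8000/`
You can set up a reverse proxy and a domain, then replace `BASE_URL` on line 52 of `web/script.js`.
The backend configuration file is `data/config.yaml`; modify it to suit your setup.
### Docker Deployment
**Note: Python Version >= 3.7**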
```bash
git clone https://github.com/levywang/avhub.git
cd avhub
docker run -d -p <your_server_port>:80 -v $PWD:/app --name avhub levywang/avhub:latest
```
---
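### **Tech Stack**
**Frontend**
• Uses **Tailwind CSS** to build a modern, responsive interface.
• Integrates **hls.js** for smooth video playback.
**Backend**
• Built on the **FastAPI** framework for **Python**, providing efficient and stable API services.
**Privacy Protection**
Strictly follows privacy-protection principles and does not directly host any resource files; all data is retrieved through third-party links.
- **Frontend**
- Uses **Tailwind CSS** to build a modern, responsive interface.
- Integrates **hls.js** for smooth video playback.
- **Backend**
- Built on the **FastAPI** framework for **Python**, providing efficient and stable API services.
- **Privacy Protection**
- Strictly follows privacy-protection principles and does not directly host any resource files; all data is retrieved through third-party links.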
---
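### **Data Sources**
- **Magnet links and cover images for video codes**: sourced from **missav**
- **Hentai resources**: sourced from **hacg 琉璃神社** (hacg liuli)
- **Random video recommendations**: sourced from crawled data stored in the local file `/data/video_urls.txt`
All of the data sources above are configured in `data/config.yaml`. If a source changes or becomes inaccessible, the configuration needs to be updated and maintained.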
---
@ -29,7 +91,5 @@
---
**AvHub** is built with modern web technologies and aims to give users an efficient, convenient adult video resource management experience.
### **License**
This project is provided under a **Apache License 2.0** license that can be found in the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.


@ -11,10 +11,10 @@ files:
av_spider:
source_url: "https://missav.ai/cn/search/"
proxy_url: "http://192.168.50.2:7890" # http or socks5 proxy
proxy_url: "http://192.168.50.3:7890" # http or socks5 proxy
hacg_spider:
hacg_url: ""
source_url: "https://www.hacg.mov/wp/"
logging:
log_file: "main.log"

main.py

@ -1,11 +1,7 @@
import sys
import argparse
# -*- encoding: utf-8 -*-
import os
import re
import subprocess
import requests
import json
import ast
from bs4 import BeautifulSoup
from typing import Union
from fastapi import FastAPI
@ -16,6 +12,8 @@ import random
from utils.spider import *
import hydra
from utils.logger import setup_logger
import schedule
import time
@hydra.main(config_path='data/', config_name='config', version_base=None)
def main(cfg: DictConfig):
@ -60,7 +58,7 @@ def main(cfg: DictConfig):
return None
return random.choice(webp_links or links)
except Exception as e:
logger.error(f"获取图片URL失败: {str(e)}")
logger.error(f"Failed to obtain the image URL: {str(e)}")
return None
def read_random_line(file_path: str) -> tuple[str, str]:
@ -126,12 +124,29 @@ def main(cfg: DictConfig):
except Exception as e:
logger.error(f"Failed to fetch random video URL: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
def run_hacg_spider():
hacg_spider = HacgSpider(url=cfg.hacg_spider.source_url, filepath=cfg.files.hacg_json_path, cfg=cfg)
hacg_spider.update_json_file()
logger.info("HacgSpider task completed.")
# Schedule the HacgSpider task to run daily at 1 AM
schedule.every().day.at("01:00").do(run_hacg_spider)
# Function to keep running the scheduler in the background
def run_scheduler():
while True:
schedule.run_pending()
time.sleep(60) # Check every minute
import threading
# Start the scheduler in a separate thread
scheduler_thread = threading.Thread(target=run_scheduler)
scheduler_thread.daemon = True
scheduler_thread.start()
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)
if __name__ == "__main__":
main()
main()
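
The scheduling hunk above registers a daily job and then polls for pending jobs once a minute from a daemon thread. As a rough sketch of that background-loop pattern, the snippet below uses only the standard library. It is not the `schedule`-based code from the commit: `start_periodic_task` and its fixed interval are hypothetical stand-ins for the daily 01:00 job, and the stop event is an addition that the commit's loop does not have.

```python
import threading
import time
from typing import Callable


def start_periodic_task(task: Callable[[], None], interval_seconds: float) -> threading.Event:
    """Run `task` every `interval_seconds` in a daemon thread.

    Hypothetical helper for illustration; returns an Event that stops the loop when set.
    """
    stop_event = threading.Event()

    def loop() -> None:
        # Event.wait() returns False on timeout and True once the event is set,
        # so the loop sleeps between runs and exits promptly on stop.
        while not stop_event.wait(interval_seconds):
            task()

    # daemon=True mirrors the commit: the thread must not block interpreter exit.
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop_event


if __name__ == "__main__":
    runs = []
    stop = start_periodic_task(lambda: runs.append(time.time()), interval_seconds=0.1)
    time.sleep(0.35)
    stop.set()
    print(f"task ran {len(runs)} time(s)")
```

Waiting on an `Event` instead of calling `time.sleep` is a design choice of this sketch: it lets the loop be shut down cleanly, which may be convenient in tests.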


@ -7,7 +7,7 @@ def setup_logger(cfg: DictConfig):
logger.setLevel(getattr(logging, cfg.logging.level.upper()))
# Create the file handler and the stream handler
file_handler = logging.FileHandler(cfg.logging.log_file)
file_handler = logging.FileHandler(cfg.logging.log_file, encoding='utf-8')
stream_handler = logging.StreamHandler()
# Set the log format
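
The logger hunk above passes `encoding='utf-8'` to `logging.FileHandler`, presumably so that non-ASCII log messages are written the same way regardless of the platform's default encoding. A minimal sketch of that behaviour follows; `make_file_logger`, the logger name, and the format string are hypothetical and are not taken from `setup_logger`.

```python
import logging
import os
import tempfile


def make_file_logger(name: str, log_path: str) -> logging.Logger:
    """Hypothetical helper: a logger that writes UTF-8 encoded lines to `log_path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # An explicit encoding avoids depending on the locale's default encoding.
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.log")
    log = make_file_logger("utf8_example", path)
    log.info("请求失败")  # a non-ASCII message, as in the older log strings
    for h in log.handlers:
        h.flush()
    with open(path, encoding="utf-8") as f:
        print(f.read())
```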


@ -1,3 +1,4 @@
# -*- encoding: utf-8 -*-
import re
import json
import os
@ -40,7 +41,7 @@ class AVSpider:
response = requests.get(url, proxies=self.proxies, headers=self.headers)
response.raise_for_status()
except requests.RequestException as e:
self.logger.error(f"请求失败: {e}")
self.logger.error(f"Request Error: {e}")
return []
html_content = response.text
@ -70,7 +71,7 @@ class AVSpider:
response = requests.get(link, proxies=self.proxies, headers=self.headers)
response.raise_for_status()
except requests.RequestException as e:
self.logger.error(f"请求失败: {e}")
self.logger.error(f"Request Error: {e}")
return []
html_content = response.text
@ -115,7 +116,7 @@ class HacgSpider:
response = requests.get(self.url)
response.raise_for_status()
except requests.RequestException as e:
self.logger.error(f"请求失败: {e}")
self.logger.error(f"Request Error: {e}")
return None
html_content = response.text
@ -133,12 +134,12 @@ class HacgSpider:
return pages
def get_links(self, page):
url = f'{self.url}?page={page}&s=%E5%90%88%E9%9B%86&submit=%E6%90%9C%E7%B4%A2'
url = f'{self.url}page/{page}?s=%E5%90%88%E9%9B%86&submit=%E6%90%9C%E7%B4%A2'
try:
response = requests.get(url)
response.raise_for_status()
except requests.RequestException as e:
self.logger.error(f"请求失败: {e}")
self.logger.error(f"Request Error: {e}")
return {}
html_content = response.text
@ -157,7 +158,7 @@ class HacgSpider:
response = requests.get(link)
response.raise_for_status()
except requests.RequestException as e:
self.logger.error(f"请求失败: {e}")
self.logger.error(f"Request Error: {e}")
continue
content = response.text
@ -165,7 +166,7 @@ class HacgSpider:
if matches:
magnet_links[title] = f'magnet:?xt=urn:btih:{matches[0]}'
self.logger.info(f"Magnet links extracted from page {page}: {magnet_links}")
self.logger.info(f"Magnet links extracted from page {page}")
return magnet_links
@ -174,7 +175,7 @@ class HacgSpider:
results = {}
total_pages = self.get_pages()
if total_pages is None:
self.logger.error("无法获取总页数")
self.logger.error("Unable to get total")
return
for i in range(1, total_pages + 1):
@ -187,7 +188,7 @@ class HacgSpider:
total_pages = self.get_pages()
if total_pages is None:
self.logger.error("无法获取总页数")
self.logger.error("Unable to get total")
return
for i in range(1, total_pages + 1):
@ -204,10 +205,10 @@ class HacgSpider:
self.logger.info(f'Page {i} processed (Incremental Update)')
if all_exists:
self.logger.info(f"{i} 页数据已存在于 JSON 文件中,停止更新")
self.logger.info(f"Page {i} data already exists in the JSON file, stop updating")
break
with open(self.filepath, 'w', encoding='utf-8') as file:
json.dump(results, file, ensure_ascii=False, indent=4)
self.logger.info("JSON文件已更新")
self.logger.info("JSON file updated")
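
The `get_links` hunk above moves the page number out of the query string and into the path (`page/{page}?s=...`). As an illustrative sketch only, the same URL shape can be built with `urllib.parse.urlencode` instead of a hand-written percent-encoded string. `build_search_url` and the `example.org` base URL are hypothetical; the two query values are simply the decoded forms of the percent-encoded strings in the diff.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, page: int) -> str:
    """Hypothetical helper: build '<base>page/<n>?s=...&submit=...'.

    Assumes `base_url` ends with a slash, as `source_url` does in the config hunk.
    """
    # urlencode percent-encodes the UTF-8 bytes of each value.
    query = urlencode({"s": "合集", "submit": "搜索"})
    return f"{base_url}page/{page}?{query}"


if __name__ == "__main__":
    print(build_search_url("https://example.org/wp/", 2))
```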