first commit

levywang 2025-03-11 11:14:17 +08:00
parent 8d5ca409f1
commit f11e497f52
15 changed files with 3772 additions and 24 deletions

17
Dockerfile Normal file

@ -0,0 +1,17 @@
FROM python:3.13-slim
# Set the working directory
WORKDIR /app
# Install dependencies (and drop the apt package lists to keep the image small)
RUN apt-get update && apt-get install -y --no-install-recommends nginx \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir beautifulsoup4 fastapi requests uvicorn hydra-core curl_cffi
# Copy the application code
COPY . /app
RUN rm -rf /etc/nginx/sites-enabled/default && cp /app/nginx.example.conf /etc/nginx/sites-enabled/default
# Run the API in the background and keep nginx in the foreground
CMD ["sh", "-c", "python3 main.py & nginx -g 'daemon off;'"]
EXPOSE 80
EXPOSE 8000


@ -1,19 +1,34 @@
## AvHub - Adult Video Content Resource Toolkit
**AvHub** is an open-source tool dedicated to the retrieval and management of adult video resources, offering three core features:
AvHub is an open-source tool designed to help users efficiently search and organize adult video content through magnet links. The project focuses on two core functionalities:
---
1. **AV Code Search Engine**
Quickly locate magnet links using Japanese adult video identification codes (番号) through integrated search capabilities.
### **Key Features**
● 🔗 **Magnet Link Search**: Accurately finds magnet links corresponding to video codes.
● 📅 **Timely Hentai Resource Updates**: Automatically updates and archives monthly hentai resources.
● 📊 **Random Video Recommendation**: Random playback functionality based on crawled data.
● 🌐 **Multi-language Support**: Supports multiple language interfaces to meet global user needs.
● 🎨 **Multiple Theme Options**: Offers various theme color schemes to enhance user experience.
2. **Monthly H-Anime Collections**
Automatically curated monthly compilations of adult anime content, providing organized access to the latest releases.
---
**Key Features**
* 🔍 Cross-platform search optimization
* 📅 Scheduled content updates tracker
* 🔗 Magnet link verification system
* 📊 Community-driven content database
### **Technology Stack**
**Frontend**:
• Built with **Tailwind CSS** for a modern, responsive interface.
• Integrated with **hls.js** for smooth video playback.
**Backend**:
• Developed using **FastAPI**, a Python framework, to provide efficient and stable API services.
**Privacy Protection**:
Strictly adheres to privacy principles and does not directly host any resource files. All data is retrieved through third-party links.
Built with modern web technologies, AvHub emphasizes user privacy and does not host any content directly. The project welcomes community contributions to improve search algorithms and content curation systems while adhering to open-source principles.
---
*Note: Users are responsible for checking local laws regarding adult content access.*
### **Legal Disclaimer**
Users must comply with the laws and regulations of their respective regions. AvHub is solely a resource retrieval tool and does not involve the distribution or storage of any resources.
---
**AvHub** is built with modern web technologies, aiming to provide users with an efficient and convenient adult video resource management experience.
### **License**
This project is licensed under the **Apache License 2.0**; see the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.
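
As a rough illustration of how a client might address the three routes this commit adds in `main.py` (`/v1/hacg`, `/v1/avcode/{code_str}`, `/v1/get_video`), the sketch below only builds request URLs. The base URL `http://localhost/api` is an assumption based on the bundled `nginx.example.conf`, and `endpoint_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import quote

# Assumed base URL: nginx.example.conf proxies /api/ to the FastAPI app.
BASE_URL = "http://localhost/api"


def endpoint_url(name: str, code: str = "") -> str:
    """Build the URL for one of the three API routes (hypothetical helper)."""
    routes = {
        "hacg": "/v1/hacg",
        # The code is percent-encoded so it stays a single path segment.
        "avcode": "/v1/avcode/" + quote(code, safe=""),
        "video": "/v1/get_video",
    }
    return BASE_URL + routes[name]


if __name__ == "__main__":
    for route in ("hacg", "video"):
        print(endpoint_url(route))
    print(endpoint_url("avcode", "abc-123"))
```

Any HTTP client can then issue a plain `GET` against these URLs; the response shapes are defined by the handlers in `main.py`.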


@ -1,19 +1,35 @@
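## AvHub - Adult Video Resource Management Tool
AvHub is an open-source tool focused on retrieving and managing adult video resources, offering two core features:
**AvHub** is an open-source tool focused on retrieving and managing adult video resources, offering three core features:
1. **AV Code Search Engine**
Precise magnet link lookup by Japanese adult video identification code (番号), with aggregated queries across multiple platforms
---
2. **Monthly H-Anime Collections**
Automatically organizes each month's adult anime releases into a structured release calendar and resource archive
### **Key Features**
● 🔗 **Magnet Link Search by Code**: Accurately finds the magnet links that correspond to a video code.
● 📅 **Scheduled H-Anime Update Tracking**: Automatically updates and archives monthly H-anime resources.
● 📊 **Random Video Recommendation**: Random playback based on crawled data.
● 🌐 **Multi-language Support**: Supports multiple interface languages to meet the needs of users worldwide.
● 🎨 **Multiple Theme Options**: Offers several color themes to improve the user experience.
**Key Features**
● 🔍 Cross-platform search optimization engine
● 📅 Scheduled content update tracking
● 🔗 Magnet link verification system (health and availability checks)
● 📊 Community-driven database updates
---
Built on modern web technologies, the project strictly follows privacy protection principles and does not directly host any resource files. We welcome developers to help improve the recommendation algorithms and content moderation system, and to keep advancing the tool's technical compliance.
### **Technology Stack**
**Frontend**
• Built with **Tailwind CSS** for a modern, responsive interface.
• Integrates **hls.js** for smooth video playback.
**Backend**
• Developed with **FastAPI**, a **Python** framework, to provide efficient and stable API services.
**Privacy Protection**
Strictly follows privacy protection principles and does not directly host any resource files; all data is retrieved through third-party links.
*Legal notice: users are responsible for complying with the laws and regulations of their region.*
---
### **Legal Disclaimer**
Users are responsible for complying with the laws and regulations of their region. AvHub is solely a resource retrieval tool and does not distribute or store any resources.
---
**AvHub** is built on modern web technologies and aims to give users an efficient and convenient way to manage adult video resources.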
### **License**
This project is licensed under the **Apache License 2.0**; see the [LICENSE](LICENSE) file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license.

26
data/config.yaml Normal file

@ -0,0 +1,26 @@
# config.yaml
app:
  cors_origins: ["*"]
  cors_credentials: true
  cors_methods: ["*"]
  cors_headers: ["*"]
files:
  hacg_json_path: "/app/data/hacg.json"
  video_urls_txt_path: "/app/data/video_urls.txt"
av_spider:
  source_url: "https://missav.ai/cn/search/"
  proxy_url: "http://192.168.50.2:7890" # http or socks5 proxy
hacg_spider:
  hacg_url: ""
logging:
  log_file: "main.log"
  level: "INFO"
hydra:
  run:
    dir: "."
  output_subdir: null
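
The comment on `proxy_url` says the value should be an http or socks5 proxy. A minimal sketch of checking that before handing the value to the spider might look like the following; `is_supported_proxy` and `SUPPORTED_SCHEMES` are hypothetical names, and the project itself performs no such check.

```python
from urllib.parse import urlsplit

# Schemes taken from the config comment ("http or socks5 proxy").
SUPPORTED_SCHEMES = {"http", "socks5"}


def is_supported_proxy(proxy_url: str) -> bool:
    """Return True if the URL has a supported scheme and a host part."""
    parts = urlsplit(proxy_url)
    return parts.scheme in SUPPORTED_SCHEMES and bool(parts.hostname)


if __name__ == "__main__":
    print(is_supported_proxy("http://192.168.50.2:7890"))
    print(is_supported_proxy("ftp://192.168.50.2:21"))
```

Rejecting an unsupported value at startup tends to be easier to diagnose than a failed request deep inside the crawler.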

73
data/hacg.json Normal file

@ -0,0 +1,73 @@
{
"2025年01月合集": "magnet:?xt=urn:btih:2146ffbecbed4d77b8b5ca6665b5c9e9949187ff",
"2024年12月合集": "magnet:?xt=urn:btih:827da30f3f6fc851f6c97758bb0618f755de14df",
"2024年11月合集": "magnet:?xt=urn:btih:080528a3185d9717084612541651c99dd541f9be",
"2024年10月合集": "magnet:?xt=urn:btih:309fa89ea1f05aa61d19e2faa1c23ebf3b30bd85",
"2024年09月合集": "magnet:?xt=urn:btih:e4fa8fb91a67f6b13efbbfe32f63d9b53bedd59e",
"2024年08月合集": "magnet:?xt=urn:btih:8ee70c99288a03f091c7bfdccf0954aa403bb150",
"2024年07月合集": "magnet:?xt=urn:btih:fb1c366e05af1a9eca23aee3810793b4212d4497",
"2024年06月合集": "magnet:?xt=urn:btih:423a7e4677193d3ce71de16152e341001e990352",
"2024年05月合集": "magnet:?xt=urn:btih:453bb0d3a5dcf082fb9b46c5cdc11d766de49406",
"2024年04月合集": "magnet:?xt=urn:btih:b496ccad674d6aa8c180e317747627ac01ecb23a",
"[桜都字幕组] 2020年7月合集": "magnet:?xt=urn:btih:03023f5686546918b90d0e0fec05948988594bf4",
"2022年09月合集": "magnet:?xt=urn:btih:a2d07b599126ef7e16d5d354ff1d47b414ff36dd",
"2024年03月合集": "magnet:?xt=urn:btih:3ca3e98867bc5571b21a76bb8f76b13279f5c25a",
"[桜都字幕组]2019年4月合集": "magnet:?xt=urn:btih:e7eca92cb2eead0abdddd57a691ee7c1bfd8aec1",
"[Animaker] To Love 系列3D同人动画7、8、9月合集": "magnet:?xt=urn:btih:5672503e4c76c154df21cdf2608a8bc33f25f3ba",
"[桜都字幕组] 2020年04月合集": "magnet:?xt=urn:btih:f574aca0560e54251559c10b716517958a10b40d",
"[鹰小队翻译组]2022年7月合集": "magnet:?xt=urn:btih:f1e9ba7e81da36a596da6da40708cce7e55f7985",
"2024年02月合集": "magnet:?xt=urn:btih:fe4febd0581492f01c583ca2201c9af7ef697365",
"2024年01月合集": "magnet:?xt=urn:btih:14a6d8231d9a6095fd2f53e05d47c319643201af",
"2023年12月合集": "magnet:?xt=urn:btih:3158bc6a3ebf4b33fe4c2e8a64c7d0c6c42a0acf",
"2023年11月合集": "magnet:?xt=urn:btih:80929ba4ef85abcd64c4d34baf4999cf1b32f61f",
"2023年10月合集": "magnet:?xt=urn:btih:160b29b4227e15352774c7b5152b0a6934f95d55",
"2023年09月合集": "magnet:?xt=urn:btih:21e37638bcc4931fa8daa1baeb27cfe82bd6f5f7",
"2023年08月合集": "magnet:?xt=urn:btih:2954d70aac32977f6b2869284d1489bb2f3eafea",
"2023年07月合集": "magnet:?xt=urn:btih:504e65916c9332e8d96bcdbfb9eef45f64575005",
"2023年06月合集": "magnet:?xt=urn:btih:946d0f92451d6aba8b14ef87b2a1987a9fda47a1",
"2023年05月合集": "magnet:?xt=urn:btih:f6a0102fa3d4f9e0ebe1f63069d2191173d2281d",
"2023年04月合集": "magnet:?xt=urn:btih:57b5b9525da08e988c8c69e733d913ffd6241918",
"2023年03月合集": "magnet:?xt=urn:btih:c709813729f3b9102364d9301fadc42c223a6bef",
"[b8er4u] AI生成 2023年2月合集": "magnet:?xt=urn:btih:6bba028b2f4e93b6c4d5eb9056c7396ea6bff8e5",
"2023年01月合集": "magnet:?xt=urn:btih:434da3bff16336c146ad3cdae00680527e7ceec9",
"2022年12月合集": "magnet:?xt=urn:btih:e701b02f8e0df25c7fff7eb6e1732d63c6e63791",
"2022年10月合集": "magnet:?xt=urn:btih:a6b30fc2291ef826533dafc720d9a23209b865d6",
"[鹰] 2022年8月合集": "magnet:?xt=urn:btih:4ee88182ae3e1964cd51386ee002bcf71be2cf2c",
"[鹰小队翻译组]2022年6月合集[1080P]": "magnet:?xt=urn:btih:fb2d3b4b83c8f3f4e3d2a5b7027589181760b4e5",
"[次元字幕组] 2022年5月合集": "magnet:?xt=urn:btih:f3e5801dd1876e462395567a14ecd14e22ecdbde",
"[桜都字幕组] 2021年12月合集": "magnet:?xt=urn:btih:d5a7231fcc570a9cd8bb936e94ee46b5e770b3ad",
"2021年9月合集": "magnet:?xt=urn:btih:1f6eb3787fae3a07a3c9762b75c48711778b48df",
"2021年8月合集": "magnet:?xt=urn:btih:e699f7ffc01ca00c385e7a1e1d89bc1f895a691a",
"[Taka.Sub] 2021年6月合集": "magnet:?xt=urn:btih:2f66c7bc52d5a1c3cd8b6a3717012765a42644f6",
"[鹰/Taka.Sub] 2021年5月合集": "magnet:?xt=urn:btih:9f700db0e80b9ead35381d3a11bd7cd992623d93",
"[鹰] 2021年4月合集": "magnet:?xt=urn:btih:61741e8e3962579fc9daa2ce6338d1eb6bf02d6e",
"[鹰] 2021年3月合集": "magnet:?xt=urn:btih:5dac7d69eef3dd3f975eba9492837bc93bf139bb",
"[桜都字幕组]2021年02月合集": "magnet:?xt=urn:btih:b7cc06d92fcb28f600525f6ebba291dc15e57d98",
"[鹰组/Taka.Sub]2021年2月合集": "magnet:?xt=urn:btih:4da945b18057009f6b8b6a053817e33aa3bfee55",
"[鹰组/Taka.Sub] 2021年1月合集": "magnet:?xt=urn:btih:36e8be5e0e9b6bc99bbcf6bccaa9ea545288b2d5",
"[桜都字幕组] 2020年12月合集": "magnet:?xt=urn:btih:a489ec8d90bb38c860b0d213f80dceb5ca64b0ba",
"[桜都字幕组] 2020年11月合集": "magnet:?xt=urn:btih:9424b9b7d721cb01a21a9f13547896d143ddd1d9",
"[桜都字幕组] 2020年10月合集": "magnet:?xt=urn:btih:7235d5b8497b98ca33caf2004dc6e9935cc47b5a",
"[桜都字幕组] 2020年9月合集": "magnet:?xt=urn:btih:669ed492944d5591a9e396a5a483940742440714",
"[桜都字幕组] 2020年8月合集": "magnet:?xt=urn:btih:404a3c7b4ee8b52bf8a8c00103c9545b303055b7",
"[桜都字幕组] 2020年6月合集": "magnet:?xt=urn:btih:60b6333809cc77aac3e181f4a2a2822f8b0f510b",
"[桜都字幕组]2020年05月合集": "magnet:?xt=urn:btih:9746006fda6b602cb5ce1ff728793d9635ce5e40",
"[桜都字幕组] 2020年03月合集": "magnet:?xt=urn:btih:7fce086c3db8e0f41ece40dc92e3f4c3aa16b535",
"[桜都字幕组]2020年2月合集": "magnet:?xt=urn:btih:91fb97619ef887d439a2142a2f9530b080cfbfd0",
"[桜都字幕组]2020年01月合集": "magnet:?xt=urn:btih:640b258ba41031ad7b2fa54bff1b4cad020a13a7",
"[桜都字幕组] 2019年12月合集": "magnet:?xt=urn:btih:058072f2fb052957245d47809a74d8cf8d737eda",
"[桜都字幕组] 2019年11月合集": "magnet:?xt=urn:btih:ad9178bb24f9863399b97d5d188b5dd51ddd36e4",
"[桜都字幕组] 2019年10月合集": "magnet:?xt=urn:btih:cbe5da5383cb7b99bb0707dd820163d3357d7ccc",
"[桜都字幕组] 2019年9月合集": "magnet:?xt=urn:btih:14ee20500e5c96d0ca0b2e6e576a282d5e80107a",
"[桜都字幕组] 2019年8月合集": "magnet:?xt=urn:btih:925aaeac1ae5b5937e09193124cefed719b4cf6b",
"[桜都字幕组] 2019年7月合集": "magnet:?xt=urn:btih:d109cb9631e5f0c7a6556bcb45b3b0b6b9abfd65",
"[桜都字幕组] 2019年6月合集": "magnet:?xt=urn:btih:2c5c7a75177046d36695252e380b233a3764699c",
"[桜都字幕组] 2019年5月合集": "magnet:?xt=urn:btih:53f0d34186883b7752dfd407703cd0b5a4ead203",
"2019年2月合集动画合集": "magnet:?xt=urn:btih:672551a849b3f78946149b208eaf4a3fb57413f1",
"[Haretahoo.sub] 2019年1月合集": "magnet:?xt=urn:btih:cda3841265b9861480845e3484ac269764dbb803",
"[桜都字幕组] 2018年10月合集": "magnet:?xt=urn:btih:7275d4206183911184d36e9c077483fb604ae0d4",
"[桜都字幕组] 2018年9月合集": "magnet:?xt=urn:btih:9c9e63ad2b861f83c494ebfb2e3aaa099280219d",
"[桜都字幕组] 2018年8月合集": "magnet:?xt=urn:btih:5cfb88b252ccbb6225698ec8257a6a5a90b79892",
"[脸肿字幕组]2018年07月合集标准版": "magnet:?xt=urn:btih:b7f466aca198f5a16a58259fa3920caed16e034b",
"[脸肿字幕组/Haretahoo.sub] 2018年06月合集赠品版": "magnet:?xt=urn:btih:ce66bb0c55bdffa3930652c565739ddb68044d1f"
}

10
data/video_urls.txt Normal file

@ -0,0 +1,10 @@
https://videos1.bysshxd.com/20230726/Pg0dW2NlmPpg1/index.m3u8
https://videos1.bysshxd.com/20230728/NJrY0K4386kRo/index.m3u8
https://videos1.bysshxd.com/20230727/MJQA2oYdmVvJj/index.m3u8
https://videos1.bysshxd.com/20230728/PRx1jzovobvJb/index.m3u8
https://videos1.bysshxd.com/20230728/LJB97V6r9PnRy/index.m3u8
https://videos1.bysshxd.com/20230728/QR1kwvNwlM7gB/index.m3u8
https://videos1.bysshxd.com/20230726/qGWQNZYPWBEJl/index.m3u8
https://videos1.bysshxd.com/20230728/Qg8zwYbrwaVRw/index.m3u8
https://videos1.bysshxd.com/20230728/1GwbVQ52xaLgr/index.m3u8
https://videos1.bysshxd.com/20230802/5G2kw3KpaY2R2/index.m3u8

137
main.py Normal file

@ -0,0 +1,137 @@
import json
import os
import random
from typing import Optional

import hydra
import requests
import uvicorn
from bs4 import BeautifulSoup
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from omegaconf import DictConfig

from utils.logger import setup_logger
from utils.spider import AVSpider


@hydra.main(config_path='data/', config_name='config', version_base=None)
def main(cfg: DictConfig):
    # Initialize the logger once; the helpers and route handlers below close over it
    logger = setup_logger(cfg)

    app = FastAPI()

    app.add_middleware(
        CORSMiddleware,
        allow_origins=cfg.app.cors_origins,
        allow_credentials=cfg.app.cors_credentials,
        allow_methods=cfg.app.cors_methods,
        allow_headers=cfg.app.cors_headers,
    )

    def get_image_url(video_url: str) -> Optional[str]:
        try:
            # Build the image directory URL
            image_dir_url = video_url.replace('index.m3u8', 'image/')
            # Fetch the directory listing; the timeout avoids waiting indefinitely
            response = requests.get(image_dir_url, timeout=20)
            response.raise_for_status()  # Raise HTTPError on a 4xx/5xx response

            # Parse the HTML and extract links from <a> tags that have an href
            soup = BeautifulSoup(response.text, 'html.parser')
            a_tags = soup.find_all('a', href=True)

            # Skip the parent-directory link and separate out the .webp links
            links = [image_dir_url + tag['href'] for tag in a_tags if tag['href'] != '../']
            webp_links = [link for link in links if link.endswith('.webp')]

            # Prefer a .webp link; otherwise pick randomly from the other links
            if not links:
                logger.warning("No image links found.")
                return None
            return random.choice(webp_links or links)
        except Exception as e:
            logger.error(f"Failed to get image URL: {str(e)}")
            return None

    def read_random_line(file_path: str) -> tuple[str, Optional[str]]:
        """Reads a random line from a given file and returns video URL and image URL."""
        if not os.path.isfile(file_path):
            logger.error("File not found")
            raise HTTPException(status_code=404, detail="File not found")

        with open(file_path, 'r') as file:
            lines = file.readlines()

        if not lines:
            logger.error("File is empty")
            raise HTTPException(status_code=400, detail="File is empty")

        random_line = random.choice(lines).strip()
        img_url = get_image_url(random_line)
        return random_line, img_url

    @app.get("/v1/hacg")
    async def read_hacg():
        try:
            with open(cfg.files.hacg_json_path, 'r', encoding='utf-8') as file:
                data = json.load(file)
            logger.info("HACG data fetched successfully")
            return JSONResponse({"data": data}, headers={'content-type': 'application/json;charset=utf-8'})
        except Exception as e:
            logger.error(f"Failed to fetch HACG data: {str(e)}")
            raise HTTPException(status_code=500, detail="Internal Server Error")

    @app.get("/v1/avcode/{code_str}")
    async def crawl_av(code_str: str):
        crawler = AVSpider(av_code=code_str,
                           source_url=cfg.av_spider.source_url,
                           proxy_url=cfg.av_spider.proxy_url,
                           cfg=cfg)
        video_links = crawler.get_video_url()
        all_magnet_links = []

        for link in video_links:
            magnet_links = crawler.get_magnet_links(link)
            all_magnet_links.extend(magnet_links)

        if not all_magnet_links:
            logger.error("No magnet links found for AV code: %s", code_str)
            raise HTTPException(status_code=404, detail="No magnet links found")

        logger.info("Magnet links found for AV code: %s", code_str)
        return {"status": "succeed", "data": [str(item) for item in all_magnet_links]}

    @app.get("/v1/get_video")
    async def get_random_video_url():
        """Returns a random video URL and its corresponding image URL."""
        try:
            file_path = cfg.files.video_urls_txt_path
            video_url, img_url = read_random_line(file_path)
            logger.info("Random video URL and image URL fetched successfully")
            return {
                "url": video_url,
                "img_url": img_url or ""  # Fall back to an empty string when no image was found
            }
        except HTTPException:
            # Keep the 404/400 raised by read_random_line instead of masking it as a 500
            raise
        except Exception as e:
            logger.error(f"Failed to fetch random video URL: {str(e)}")
            raise HTTPException(status_code=500, detail=str(e))

    uvicorn.run(app, host="0.0.0.0", port=8000)


if __name__ == "__main__":
    main()
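
The image lookup in `get_image_url` prefers `.webp` links and falls back to any other link. That selection rule can be sketched in isolation, without any network access, roughly as follows; `pick_image` and its `rng` parameter are hypothetical names introduced here so the choice is reproducible.

```python
import random
from typing import Optional, Sequence


def pick_image(links: Sequence[str], rng: random.Random) -> Optional[str]:
    """Pick a random .webp link if any exist, else a random link, else None."""
    if not links:
        return None
    webp_links = [link for link in links if link.endswith(".webp")]
    # An empty webp_links list is falsy, so `or` falls back to all links.
    return rng.choice(webp_links or list(links))


if __name__ == "__main__":
    rng = random.Random(0)
    print(pick_image(["a.jpg", "b.webp", "c.png"], rng))
```

Passing the random generator in, rather than using the module-level `random.choice`, is only a testing convenience and not something the original code does.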

17
nginx.example.conf Normal file

@ -0,0 +1,17 @@
server {
    listen 80;
    listen [::]:80;
    server_name _;
    index index.php index.html index.htm default.php default.htm default.html;
    root /app;

    location /api/ {
        proxy_pass http://127.0.0.1:8000/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
    }
}
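
Because `proxy_pass` here ends in a URI (`/`), nginx replaces the part of the request path that matched the `location` prefix with that URI, so `/api/v1/hacg` reaches the backend as `/v1/hacg`. The sketch below models only this prefix substitution (not nginx's full URI normalization); `upstream_path` is a hypothetical helper for illustration.

```python
from typing import Optional


def upstream_path(request_path: str,
                  location: str = "/api/",
                  proxy_uri: str = "/") -> Optional[str]:
    """Model how a prefix location with a URI-bearing proxy_pass maps paths."""
    if not request_path.startswith(location):
        return None  # This location block would not handle the request
    # The matched prefix is replaced by the URI given in proxy_pass.
    return proxy_uri + request_path[len(location):]


if __name__ == "__main__":
    print(upstream_path("/api/v1/hacg"))
    print(upstream_path("/index.html"))
```

This is why the FastAPI routes are declared as `/v1/...` rather than `/api/v1/...`.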

0
utils/__init__.py Normal file

93
utils/hacg_spider.py Normal file

@ -0,0 +1,93 @@
import requests
from bs4 import BeautifulSoup
import re
import json
import os


class HACGScraper:
    def __init__(self, url, filepath):
        self.url = url
        self.filepath = filepath

    def get_pages(self):
        response = requests.get(self.url)
        html_content = response.text
        soup = BeautifulSoup(html_content, 'html.parser')
        div_ele = soup.find('div', class_='wp-pagenavi')
        page_text = div_ele.get_text() if div_ele else ''
        # NOTE: the separator characters were lost in this view of the diff.
        # Assumption: the pager text has the form "共 N 页" ("N pages in total").
        match = re.search(r'共\s*(\d+)\s*页', page_text)
        return int(match.group(1)) if match else None

    def get_links(self, page):
        # Substitute the requested page number into the ".../page/N" segment of
        # the search URL (previously `page` was ignored and the same page was
        # fetched every time). A URL without such a segment is used unchanged.
        url = re.sub(r'/page/\d+', f'/page/{page}', self.url)
        response = requests.get(url)
        html_content = response.text
        soup = BeautifulSoup(html_content, 'html.parser')

        # Collect links whose text marks a monthly collection ("月合集")
        links = {}
        for a_tag in soup.find_all('a'):
            href = a_tag.get('href')
            text = a_tag.get_text(strip=True)
            if "月合集" in text:
                links[text] = href

        # Visit each post and take the first 40-character hex string as the info hash
        magnet_links = {}
        for title, link in links.items():
            response = requests.get(link)
            if response.status_code == 200:
                content = response.text
                matches = re.findall(r'\b[a-f0-9]{40}\b', content)
                if matches:
                    magnet_links[title] = f'magnet:?xt=urn:btih:{matches[0]}'
            else:
                print(f"Request failed, status code: {response.status_code}")
        return magnet_links

    def update_json_file(self):
        total_pages = self.get_pages()
        if total_pages is None:
            print("Could not determine the total number of pages")
            return

        if not os.path.exists(self.filepath) or os.path.getsize(self.filepath) == 0:
            # No existing data: crawl every page
            results = {}
            for i in range(1, total_pages + 1):
                new_data = self.get_links(i)
                results.update(new_data)
                print(f'Page {i} processed (Full Update)')
        else:
            with open(self.filepath, 'r', encoding='utf-8') as file:
                results = json.load(file)

            for i in range(1, total_pages + 1):
                new_data = self.get_links(i)
                all_exists = True
                for title, magnet_link in new_data.items():
                    if title not in results or results[title] != magnet_link:
                        all_exists = False
                        break
                if not all_exists:
                    # New entries go first; existing entries win on conflict
                    results = {**new_data, **results}
                    print(f'Page {i} processed (Incremental Update)')
                if all_exists:
                    print(f"Page {i} is already in the JSON file; stopping the update")
                    break

        with open(self.filepath, 'w', encoding='utf-8') as file:
            json.dump(results, file, ensure_ascii=False, indent=4)
        print("JSON file updated")


# Example usage (guarded so that importing this module does not start a crawl)
if __name__ == "__main__":
    scraper = HACGScraper(url='https://www.hacg.mov/wp/page/1?s=%E5%90%88%E9%9B%86&submit=%E6%90%9C%E7%B4%A2', filepath=r"C:\Users\levywang\OneDrive\Code\avhub_v2\data\hacg.json")
    scraper.update_json_file()
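
The incremental branch of `update_json_file` stops at the first page whose entries are all already stored, and merges with `{**new_data, **results}`, so new titles are placed first while stored values win on conflict. A network-free sketch of that rule, using hypothetical names (`merge_pages`, `existing`, `pages`) and made-up sample data, might be:

```python
from typing import Dict, Iterable


def merge_pages(existing: Dict[str, str],
                pages: Iterable[Dict[str, str]]) -> Dict[str, str]:
    """Merge crawled pages into stored results, stopping at a fully known page."""
    results = dict(existing)
    for page in pages:
        # A page counts as known when every title maps to the stored link.
        if all(results.get(title) == link for title, link in page.items()):
            break
        # New keys are inserted first; values already in results are kept.
        results = {**page, **results}
    return results


if __name__ == "__main__":
    stored = {"b": "m2"}
    crawled = [{"a": "m1", "b": "changed"}, {"b": "m2"}, {"c": "m3"}]
    print(merge_pages(stored, crawled))
```

One consequence worth noting: a title whose link changed upstream is never overwritten by this merge, so the page containing it will be treated as new on every run.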

25
utils/logger.py Normal file

@ -0,0 +1,25 @@
import logging
from omegaconf import DictConfig


def setup_logger(cfg: DictConfig):
    logger = logging.getLogger(__name__)
    if not logger.hasHandlers():  # Only configure the logger if no handlers exist yet
        logger.setLevel(getattr(logging, cfg.logging.level.upper()))

        # Create the file handler and the stream handler
        file_handler = logging.FileHandler(cfg.logging.log_file)
        stream_handler = logging.StreamHandler()

        # Set the log format
        formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
        file_handler.setFormatter(formatter)
        stream_handler.setFormatter(formatter)

        # Attach the handlers to the logger
        logger.addHandler(file_handler)
        logger.addHandler(stream_handler)

    return logger
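
`setup_logger` guards its configuration with `logger.hasHandlers()`, which in the standard library also reports handlers attached to ancestor loggers, whereas `logger.handlers` lists only handlers attached directly. The sketch below shows the difference with hypothetical logger names (`avhub_demo`, `avhub_demo.child`); whether this matters for the project depends on how the root logger is configured in a given deployment.

```python
import logging

# A parent logger with one handler, and a child with none of its own.
parent = logging.getLogger("avhub_demo")
parent.addHandler(logging.NullHandler())
child = logging.getLogger("avhub_demo.child")

if __name__ == "__main__":
    print(child.handlers)       # handlers attached directly to the child
    print(child.hasHandlers())  # also considers the parent's handlers
```

If some ancestor already has a handler, the guarded block in `setup_logger` is skipped, so the file handler and level from `config.yaml` would not be applied.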

213
utils/spider.py Normal file

@ -0,0 +1,213 @@
import re
import json
import os
from bs4 import BeautifulSoup
from curl_cffi import requests
from omegaconf import DictConfig
from utils.logger import setup_logger


class AVSpider:
    def __init__(self, av_code, source_url, proxy_url, cfg: DictConfig):
        self.source_url = source_url
        self.av_code = av_code.lower()
        self.proxy_url = proxy_url
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
            'Content-Type': 'application/json'
        }
        self.proxies = {
            "http": self.proxy_url,
            "https": self.proxy_url
        }
        self.logger = setup_logger(cfg)

    def get_video_url(self) -> list:
        """
        Get the links to the video pages.
        :return: a list of video page links
        """
        # Normalize the code to the "letters-digits" form, e.g. "abc-123"
        code_str = self.av_code.replace('-', '')
        match = re.match(r'([a-zA-Z]+)(\d+)', code_str)
        if not match:
            self.logger.error(f"Invalid AV code format: {self.av_code}")
            return []

        letters, digits = match.groups()
        code_str = f"{letters.lower()}-{digits}"
        url = f"{self.source_url}{code_str}"

        try:
            response = requests.get(url, proxies=self.proxies, headers=self.headers)
            response.raise_for_status()
        except requests.RequestException as e:
            self.logger.error(f"Request failed: {e}")
            return []

        html_content = response.text
        soup = BeautifulSoup(html_content, 'html.parser')
        unique_links = set()

        for a_tag in soup.find_all('a'):
            alt_text = a_tag.get('alt')
            if alt_text and code_str in alt_text:
                href = a_tag.get('href')
                if href:
                    unique_links.add(href)

        self.logger.info(f"Found video URLs: {unique_links}")
        return list(unique_links)

    def get_magnet_links(self, link: str) -> list:
        """
        Extract magnet links from a video page.
        :param link: URL of the video page
        :return: a list of rows, each holding the magnet link and the row's text cells
        """
        try:
            response = requests.get(link, proxies=self.proxies, headers=self.headers)
            response.raise_for_status()
        except requests.RequestException as e:
            self.logger.error(f"Request failed: {e}")
            return []

        html_content = response.text
        soup = BeautifulSoup(html_content, 'html.parser')
        target_table = soup.find('table', class_='min-w-full')
        result = []

        if target_table is not None:
            rows = target_table.find_all('tr')
            for row in rows:
                cols = row.find_all('td')
                data = []
                for col in cols:
                    links = col.find_all('a', rel='nofollow')
                    if links:
                        for a_tag in links:
                            href = a_tag['href']
                            if "keepshare.org" not in href:
                                data.append(href)
                    text = col.get_text(strip=True)
                    # "下载" ("Download") is the button label on the scraped page
                    if text != "下载" and "keepshare.org" not in text:
                        data.append(text)
                result.append(data)

        self.logger.info(f"Magnet links extracted from {link}")
        return result


class HacgSpider:
    def __init__(self, url, filepath, cfg: DictConfig):
        self.url = url
        self.filepath = filepath
        self.logger = setup_logger(cfg)

    def get_pages(self):
        try:
            response = requests.get(self.url)
            response.raise_for_status()
        except requests.RequestException as e:
            self.logger.error(f"Request failed: {e}")
            return None

        html_content = response.text
        soup = BeautifulSoup(html_content, 'html.parser')
        div_ele = soup.find('div', class_='wp-pagenavi')
        page_text = div_ele.get_text() if div_ele else ''
        pages = None
        # NOTE: the separator characters were lost in this view of the diff.
        # Assumption: the pager text has the form "共 N 页" ("N pages in total").
        match = re.search(r'共\s*(\d+)\s*页', page_text)
        if match:
            pages = int(match.group(1))

        self.logger.info(f"Total pages found: {pages}")
        return pages

    def get_links(self, page):
        url = f'{self.url}?page={page}&s=%E5%90%88%E9%9B%86&submit=%E6%90%9C%E7%B4%A2'
        try:
            response = requests.get(url)
            response.raise_for_status()
        except requests.RequestException as e:
            self.logger.error(f"Request failed: {e}")
            return {}

        html_content = response.text
        soup = BeautifulSoup(html_content, 'html.parser')

        # Collect links whose text marks a monthly collection ("月合集")
        links = {}
        for a_tag in soup.find_all('a'):
            href = a_tag.get('href')
            text = a_tag.get_text(strip=True)
            if "月合集" in text:
                links[text] = href

        # Visit each post and take the first 40-character hex string as the info hash
        magnet_links = {}
        for title, link in links.items():
            try:
                response = requests.get(link)
                response.raise_for_status()
            except requests.RequestException as e:
                self.logger.error(f"Request failed: {e}")
                continue

            content = response.text
            matches = re.findall(r'\b[a-f0-9]{40}\b', content)
            if matches:
                magnet_links[title] = f'magnet:?xt=urn:btih:{matches[0]}'

        self.logger.info(f"Magnet links extracted from page {page}: {magnet_links}")
        return magnet_links

    def update_json_file(self):
        if not os.path.exists(self.filepath) or os.path.getsize(self.filepath) == 0:
            # No existing data: crawl every page
            results = {}
            total_pages = self.get_pages()
            if total_pages is None:
                self.logger.error("Could not determine the total number of pages")
                return

            for i in range(1, total_pages + 1):
                new_data = self.get_links(i)
                results.update(new_data)
                self.logger.info(f'Page {i} processed (Full Update)')
        else:
            with open(self.filepath, 'r', encoding='utf-8') as file:
                results = json.load(file)

            total_pages = self.get_pages()
            if total_pages is None:
                self.logger.error("Could not determine the total number of pages")
                return

            for i in range(1, total_pages + 1):
                new_data = self.get_links(i)
                all_exists = True
                for title, magnet_link in new_data.items():
                    if title not in results or results[title] != magnet_link:
                        all_exists = False
                        break
                if not all_exists:
                    # New entries go first; existing entries win on conflict
                    results = {**new_data, **results}
                    self.logger.info(f'Page {i} processed (Incremental Update)')
                if all_exists:
                    self.logger.info(f"Page {i} is already in the JSON file; stopping the update")
                    break

        with open(self.filepath, 'w', encoding='utf-8') as file:
            json.dump(results, file, ensure_ascii=False, indent=4)
        self.logger.info("JSON file updated")
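
Two pure string steps in this file can be tried without any network access: the code normalization at the top of `AVSpider.get_video_url`, and the info-hash extraction in `HacgSpider.get_links`. The sketch below reuses the same regular expressions; `normalize_code` and `first_magnet` are hypothetical helper names, and the sample inputs are made up.

```python
import re
from typing import Optional


def normalize_code(av_code: str) -> Optional[str]:
    """Turn 'ABC123' or 'abc-123' into 'abc-123'; return None if it does not parse."""
    compact = av_code.lower().replace("-", "")
    match = re.match(r"([a-zA-Z]+)(\d+)", compact)
    if not match:
        return None
    letters, digits = match.groups()
    return f"{letters}-{digits}"


def first_magnet(text: str) -> Optional[str]:
    """Build a magnet URI from the first 40-character lowercase hex string in text."""
    matches = re.findall(r"\b[a-f0-9]{40}\b", text)
    return f"magnet:?xt=urn:btih:{matches[0]}" if matches else None


if __name__ == "__main__":
    print(normalize_code("ABC123"))
    print(first_magnet("hash: " + "a" * 40 + " (sample)"))
```

Note that the hash pattern accepts any 40-character hex token on the page, so an unrelated SHA-1-like string appearing before the real info hash would be picked instead.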

185
web/index.html Normal file

@ -0,0 +1,185 @@
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AvHub - 成人资源管理平台</title>
<link href="https://testingcf.jsdelivr.net/npm/tailwindcss@2.2.19/dist/tailwind.min.css" rel="stylesheet">
<link href="style.css" rel="stylesheet">
</head>
<body class="bg-gradient-to-br from-slate-900 to-slate-800 min-h-screen text-gray-100 transition-colors duration-300">
<!-- Logo, placed directly under the body tag -->
<div class="logo-container">
<a href="/" class="logo">
<span class="logo-av">Av</span><span class="logo-hub">Hub</span>
</a>
</div>
<!-- Settings area -->
<div class="settings-container">
<!-- Language toggle button -->
<button id="languageToggle" class="settings-button">
<svg class="w-5 h-5" fill="currentColor" viewBox="0 0 24 24">
<path d="M12.87 15.07l-2.54-2.51.03-.03A17.52 17.52 0 0014.07 6H17V4h-7V2H8v2H1v2h11.17C11.5 7.92 10.44 9.75 9 11.35 8.07 10.32 7.3 9.19 6.69 8h-2c.73 1.63 1.73 3.17 2.98 4.56l-5.09 5.02L4 19l5-5 3.11 3.11.76-2.04zM18.5 10h-2L12 22h2l1.12-3h4.75L21 22h2l-4.5-12zm-2.62 7l1.62-4.33L19.12 17h-3.24z"/>
</svg>
</button>
<!-- Theme toggle button -->
<button id="themeToggle" class="settings-button theme-toggle">
<svg class="w-5 h-5" fill="currentColor" viewBox="0 0 20 20">
<path d="M17.293 13.293A8 8 0 016.707 2.707a8.001 8.001 0 1010.586 10.586z"/>
</svg>
</button>
</div>
<!-- Main content area -->
<main class="container mx-auto px-4 pt-20 pb-8">
<!-- Tab navigation -->
<div class="flex justify-center mb-8">
<div class="tab-container">
<button onclick="switchTab('search')"
class="tab-button active"
data-tab="search">
<span class="tab-text" data-zh="AV搜索" data-en="AV Search">AV搜索</span>
</button>
<button onclick="switchTab('collections')"
class="tab-button"
data-tab="collections">
<span class="tab-text" data-zh="里番合集" data-en="Anime Collection">里番合集</span>
</button>
<button onclick="switchTab('player')"
class="tab-button"
data-tab="player">
<span class="tab-text" data-zh="视频播放" data-en="Video Player">视频播放</span>
</button>
</div>
</div>
<!-- AV search area -->
<div id="searchTab" class="tab-content">
<!-- Search box -->
<div class="max-w-2xl mx-auto mb-8">
<div class="flex gap-2">
<input type="text" id="searchInput"
class="flex-1 px-4 py-2"
data-zh-placeholder="请输入AV番号..."
data-en-placeholder="Enter AV number..."
placeholder="请输入AV番号...">
<button onclick="searchMagnet()"
class="search-button px-6 py-2 min-w-[100px]">
<span class="tab-text" data-zh="搜索" data-en="Search">搜索</span>
</button>
</div>
</div>
<!-- Cover image area -->
<div id="coverImageContainer" class="cover-image-container hidden">
<img id="coverImage" class="cover-image" src="" alt="封面图片">
</div>
<!-- Full-size image preview modal -->
<div id="imageModal" class="image-modal hidden">
<div class="modal-content">
<img id="modalImage" class="modal-image" src="" alt="大图预览">
<button id="closeModal" class="modal-close">
<svg class="w-6 h-6" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M6 18L18 6M6 6l12 12"></path>
</svg>
</button>
</div>
</div>
<!-- Sort options -->
<div class="max-w-4xl mx-auto mb-4 flex justify-end">
<button id="sortButton" class="settings-button theme-toggle" onclick="showSortMenu(this)" value="date-desc">
<svg class="w-4 h-4" fill="currentColor" viewBox="0 0 20 20">
<path d="M3 3a1 1 0 000 2h11a1 1 0 100-2H3zM3 7a1 1 0 000 2h7a1 1 0 100-2H3zM3 11a1 1 0 100 2h4a1 1 0 100-2H3z"/>
</svg>
<span class="ml-2">最新日期</span>
</button>
</div>
<!-- Search results -->
<div id="searchResults" class="max-w-4xl mx-auto space-y-3"></div>
</div>
<!-- H-anime collections area -->
<div id="collectionsTab" class="tab-content hidden">
<div class="collection-description">
<span class="tab-text" data-zh="链接持续更新中..." data-en="Links are being updated...">链接持续更新中...</span>
</div>
<div id="collectionList" class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4 max-w-6xl mx-auto"></div>
</div>
<!-- Video player area -->
<div id="playerTab" class="tab-content hidden">
<div class="max-w-4xl mx-auto">
<div class="nsfw-warning mb-4 text-center">
<span class="tab-text" data-zh="⚠️ 警告:该内容包含成人内容 (NSFW)请确保您已年满18岁"
data-en="⚠️ Warning: This content contains adult material (NSFW), ensure you are 18+">
⚠️ 警告:该内容包含成人内容 (NSFW)请确保您已年满18岁
</span>
</div>
<div class="relative w-full">
<video id="videoPlayer" class="w-full h-auto" poster="" controls playsinline preload="metadata">
Your browser does not support the video tag.
</video>
</div>
<div class="video-controls flex justify-between items-center mt-4 mb-4">
<div class="autoplay-toggle flex items-center">
<input type="checkbox" id="autoplayToggle" class="hidden">
<label for="autoplayToggle" class="cursor-pointer flex items-center">
<span class="toggle-switch"></span>
<span class="ml-2 tab-text" data-zh="自动播放" data-en="Auto Play">自动播放</span>
</label>
</div>
<button id="nextVideo" class="next-button px-6 py-2">
<span class="tab-text" data-zh="下一个" data-en="Next">下一个</span>
</button>
</div>
<div class="video-source flex items-center bg-opacity-5 rounded-lg p-3 text-sm">
<div class="flex-grow overflow-hidden mr-2">
<div class="source-label mb-1 opacity-70">
<span class="tab-text" data-zh="视频源地址" data-en="Video Source URL">视频源地址</span>
</div>
<div class="source-url truncate font-mono" id="videoSourceUrl"></div>
</div>
<button id="copySourceUrl" class="copy-button flex items-center px-3 py-1.5 rounded">
<svg class="w-4 h-4 mr-1" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M8 5H6a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2v-1M8 5a2 2 0 002 2h2a2 2 0 002-2M8 5a2 2 0 012-2h2a2 2 0 012 2m0 0h2a2 2 0 012 2v3m2 4H10m0 0l3-3m-3 3l3 3"></path>
</svg>
<span class="tab-text" data-zh="复制" data-en="Copy">复制</span>
</button>
</div>
</div>
</div>
</main>
<!-- Back-to-top button -->
<button id="backToTop" class="back-to-top hidden">
<svg class="w-6 h-6" fill="currentColor" viewBox="0 0 20 20">
<path fill-rule="evenodd" d="M14.707 12.707a1 1 0 01-1.414 0L10 9.414l-3.293 3.293a1 1 0 01-1.414-1.414l4-4a1 1 0 011.414 0l4 4a1 1 0 010 1.414z" clip-rule="evenodd"></path>
</svg>
</button>
<script src="https://testingcf.jsdelivr.net/npm/hls.js@1.4.12/dist/hls.min.js"></script>
<script src="script.js"></script>
<!-- Notification element at the end of body -->
<div class="notification" id="notification">
<svg fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M5 13l4 4L19 7"></path>
</svg>
<span>已复制到剪贴板</span>
</div>
<!-- Loading animation template -->
<template id="loadingTemplate">
<div class="loading-container">
<div class="loading-spinner"></div>
<div class="loading-text">
<span class="tab-text" data-zh="正在搜索中..." data-en="Searching...">正在搜索中...</span>
</div>
</div>
</template>
</body>
</html>

1507
web/script.js Normal file

File diff suppressed because it is too large

1414
web/style.css Normal file

File diff suppressed because it is too large