Initial implementation of yt-dlp HLS proxy server

- Flask app with HLS proxy routes (/hls, /player, /)
- yt-dlp integration with 365-day in-memory cache
- URL validation with allowed domains (youtube, pornhub, etc)
- HTML5 HLS player with hls.js
- Unit tests: URL validation, cache, error handling
- Integration tests: ffmpeg-generated test video, full proxy chain
- Environment-based configuration (PORT, CACHE_TTL, LOG_LEVEL)
- MIT license
Mikhail Yevchenko
2026-04-01 11:10:05 +00:00
parent 3d434dff6c
commit ff6e727ae7
13 changed files with 796 additions and 38 deletions
+6
@@ -0,0 +1,6 @@
PORT=5000
LOG_LEVEL=INFO
CACHE_TTL=31536000
SOCKET_TIMEOUT=30
VALIDATION_ENABLED=true
ALLOWED_DOMAINS=youtube.com,youtu.be,pornhub.com,xvideos.com
+11
@@ -0,0 +1,11 @@
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
.env
.venv
env/
venv/
*.log
.DS_Store
+42 -38
@@ -1,61 +1,65 @@
# Building a yt-dlp Proxy to Watch Videos in the Browser Without Ads
## Introduction
A yt-dlp proxy server makes it possible to bypass ads and restrictions when watching videos on platforms such as YouTube. The tool is useful for anyone who wants to watch content without ad breaks, or who runs into regional restrictions and other forms of censorship.
## Task
Our goal is to build a simple proxy server that handles requests like `http://localhost/www.youtube.com/watch&v=VIDEO_ID` and returns a page with an HLS player. As the example shows, the URL of the video to be proxied is passed inside the request URL itself.
We proxy the HLS stream specifically because it keeps traffic low and gives smoother playback. HLS also plays in most modern browsers, natively in Safari and through hls.js elsewhere, which makes it a good fit for this proxy.
The proxy itself should handle requests like `http://localhost/hls/www.youtube.com/watch(v=VIDEO_ID)`, where the query string of the video URL is escaped (wrapped in parentheses) so that it can sit inside the path and HLS file names such as `index.m3u8` and the `.ts` segments can be appended after it. For example, a request to `http://localhost/hls/www.youtube.com/watch(v=VIDEO_ID)/index.m3u8` should return the HLS playlist for the video with the specified VIDEO_ID.
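The path scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `encode_video_path` and `decode_video_path` are hypothetical helper names that do not exist in this commit, the scheme is assumed to be `https`, and the video path itself is assumed not to contain parentheses.

```python
from urllib.parse import urlsplit


def encode_video_path(video_url: str) -> str:
    """Turn 'https://host/path?query' into 'host/path(query)'."""
    parts = urlsplit(video_url)
    encoded = f"{parts.netloc}{parts.path}"
    if parts.query:
        # The query string is wrapped in parentheses so it can live in a path.
        encoded += f"({parts.query})"
    return encoded


def decode_video_path(proxy_path: str, scheme: str = "https") -> tuple[str, str]:
    """Split 'host/path(query)/file' into (video_url, file_name)."""
    # Everything after the last slash is the HLS file name.
    video_part, _, file_name = proxy_path.rpartition("/")
    if video_part.endswith(")") and "(" in video_part:
        base, _, query = video_part[:-1].partition("(")
        return f"{scheme}://{base}?{query}", file_name
    return f"{scheme}://{video_part}", file_name


if __name__ == "__main__":
    path = encode_video_path("https://www.youtube.com/watch?v=VIDEO_ID")
    print(path)
    print(decode_video_path(path + "/index.m3u8"))
```

Keeping both directions in one place makes it easier to test that every encoded path decodes back to the original video URL.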
The proxy's job is this: on a request to `http://localhost/hls/www.youtube.com/watch(v=VIDEO_ID)/index.m3u8`, it forwards the request to yt-dlp, which downloads the HLS playlist for the video with the specified VIDEO_ID, and returns that playlist to the client. Similarly, on a request to `http://localhost/hls/www.youtube.com/watch(v=VIDEO_ID)/segment.ts`, the proxy asks yt-dlp for the corresponding video segment and returns it to the client.
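For segment requests to reach the proxy at all, the playlist returned to the client has to list segment names that resolve under the proxy rather than at the upstream host. One possible way to do that is sketched below. `rewrite_playlist` is a hypothetical helper, not part of this commit. It only handles plain URI lines in a media playlist and ignores tags that carry their own URIs, such as `#EXT-X-KEY` or `#EXT-X-MAP`.

```python
from urllib.parse import urljoin


def rewrite_playlist(playlist: str, upstream_url: str) -> tuple[str, dict[str, str]]:
    """Replace each segment URI with a short local name.

    Returns the rewritten playlist text and a mapping from the local name
    to the absolute upstream URL, which the proxy can use later to fetch
    the real segment.
    """
    lines: list[str] = []
    segments: dict[str, str] = {}
    for line in playlist.splitlines():
        entry = line.strip()
        # In an HLS playlist, non-empty lines that do not start with '#' are URIs.
        if entry and not entry.startswith("#"):
            name = f"segment{len(segments):05d}.ts"
            segments[name] = urljoin(upstream_url, entry)
            line = name
        lines.append(line)
    return "\n".join(lines) + "\n", segments


if __name__ == "__main__":
    text = "#EXTM3U\n#EXTINF:1.0,\nseg/a.ts?token=1\n#EXT-X-ENDLIST\n"
    rewritten, mapping = rewrite_playlist(text, "https://cdn.example/v/index.m3u8")
    print(rewritten)
    print(mapping)
```

Serving only names that appear in the mapping also means the segment route cannot be used to fetch arbitrary upstream URLs.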
Obviously, yt-dlp sessions need to be cached for some period so that the video page is not re-parsed on every request for the HLS playlist and its segments. This avoids repeated extraction work and reduces load on the server.
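A minimal sketch of such a time-limited cache is shown below. The `TTLCache` class and its injectable `clock` parameter are assumptions made for illustration and are not the code in this commit. The injectable clock is there only so that expiry can be tested without sleeping.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Dictionary-backed cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        item = self._items.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self._clock() - stored_at > self._ttl:
            # Drop the stale entry so the dictionary does not grow forever.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock(), value)


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=365 * 24 * 60 * 60)
    cache.set("https://youtu.be/abc123", {"hls_url": "https://cdn.example/index.m3u8"})
    print(cache.get("https://youtu.be/abc123"))
```

Note that a cache like this lives inside one process, so each worker process of a multi-worker server would hold its own copy.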
## Implementation
The yt-dlp proxy server can be implemented in Python, with the Flask library providing the web server. The yt-dlp library is also needed to talk to YouTube and other platforms and to obtain HLS streams.
For HTML templating, use Jinja2, which is built into Flask, to generate the page with the HLS player dynamically from the video URL. Styles: `<link rel="stylesheet" href="https://unpkg.com/mvp.css">` for a simple and clean design.
### Implementation Rules
1. Minimal MVP: implement only what is needed to proxy HLS for one video URL per request through `yt-dlp`.
2. Minimal stack: only Python, Flask, `yt-dlp` and an external HLS player; no Node.js and no complex frameworks.
3. One format: support only HLS (`.m3u8` + segments).
4. One platform in the MVP: PornHub first; other platforms only after stabilization.
5. Simple routing: short and predictable URLs without extra magic.
6. External cache: only HTTP headers for cache management, no complex in-memory solutions. Cache TTL: 365 days.
7. Security is minimal but mandatory: validate the input URL, restrict target domains to those supported by yt-dlp, and apply request timeouts.
8. Errors and logs are kept to a practical minimum: understandable HTTP errors and basic structured logging.
9. Configuration only through environment variables: port, cache TTL, log level and timeouts.
10. No HTTPS in the application: TLS terminates at an external reverse proxy (Nginx/Caddy/Traefik), and Flask runs behind it.
11. Tests cover only the critical path: URL parsing, cache, playlist and segment proxying, error handling.
12. Documentation and license: only `README.md`, `AGENTS.md` and the MIT license.
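Rules 6 and 9 can be illustrated together with a short sketch. The variable names `PORT`, `CACHE_TTL`, `LOG_LEVEL` and `SOCKET_TIMEOUT` come from the `.env` example in this commit. The `Settings` dataclass, `load_settings` and `cache_headers` are hypothetical names, and the `public` directive in the header is an assumption about how an external cache would be allowed to store responses.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    port: int
    cache_ttl: int
    log_level: str
    socket_timeout: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from environment variables, with defaults."""
    return Settings(
        port=int(env.get("PORT", "5000")),
        # 365 days expressed in seconds.
        cache_ttl=int(env.get("CACHE_TTL", str(365 * 24 * 60 * 60))),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        socket_timeout=int(env.get("SOCKET_TIMEOUT", "30")),
    )


def cache_headers(settings: Settings) -> dict[str, str]:
    """Build the response headers that let an external cache keep the response."""
    return {"Cache-Control": f"public, max-age={settings.cache_ttl}"}


if __name__ == "__main__":
    settings = load_settings()
    print(settings)
    print(cache_headers(settings))
```

Passing the environment in as a mapping keeps the loader easy to test without touching the real process environment.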
### Структура проекта
### Project Structure
```
- app.py - main Flask application file; handles incoming HTTP requests and talks to yt-dlp through functions from dlp.py.
- dlp.py - module for interacting with yt-dlp; contains functions that fetch HLS playlists and segments.
  functions:
  - get_hls_playlist(video_url): gets the HLS playlist for the specified video as a string that can be returned to the client. The segment list should be filtered to include only segments that are available for the given video and supported by yt-dlp.
  - get_hls_segment(video_url, segment_name): gets the specified video segment and returns its content as bytes that can be returned to the client. The segment must be downloaded through yt-dlp, since only yt-dlp can handle the authentication and access control required for the video content.
  caching:
  - yt-dlp sessions are cached in a simple in-memory dictionary that stores the video parsing results for each VIDEO_ID. No complex in-memory solutions, just a dictionary with a TTL for each key. The TTL is set to 365 days, which keeps results cached and minimizes repeated requests to yt-dlp.
- utils.py - helper functions for URL validation, cache management and error handling.
- tests/ - folder for tests that check the critical application paths: URL parsing, caching, playlist and segment proxying, and error handling.
  1. function tests
  2. integration tests for the main application flow:
     a single integration test in which a test server serves one test video (generated with ffmpeg); the test queries that server through the proxy and checks that everything works.
     the test server must serve a JavaScript player that yt-dlp can recognize. It should also set a cookie on the video page and require that cookie for HLS playlist and segment requests, so that only requests coming from the video page can reach the HLS content. This gives a basic level of protection against unauthorized access to the video streams.
- templates/index.html - simple HTML file with a form for entering the video URL.
- templates/player.html - HTML file with the HLS player used to play the video obtained through the proxy.
- requirements.txt
- README.md
- AGENTS.md
- LICENSE
```
The project will contain no files other than those listed above. All functions and logic live in these files, with no additional modules or complex data structures.
+21
@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+55
@@ -0,0 +1,55 @@
# yt-dlp HLS Proxy
A simple Flask proxy server that uses yt-dlp to fetch HLS streams and serves them through a web player.
## Features
- HLS stream proxying via yt-dlp
- In-memory caching (365 days TTL by default)
- URL validation with allowed domains
- HTML5 video player with hls.js
- Configurable via environment variables
## Quick Start
```bash
pip install -r requirements.txt
cp .env.example .env
python app.py
```
Visit http://localhost:5000 and enter a video URL.
## Configuration
| Variable | Default | Description |
|----------|---------|-------------|
| PORT | 5000 | Server port |
| LOG_LEVEL | INFO | Logging level |
| CACHE_TTL | 31536000 | Cache TTL in seconds (365 days) |
| SOCKET_TIMEOUT | 30 | Socket timeout for requests |
| VALIDATION_ENABLED | true | Enable URL validation |
| ALLOWED_DOMAINS | youtube.com,youtu.be,pornhub.com,xvideos.com | Allowed video domains |
## Routes
- `/` - Home page with video URL input
- `/player?url=VIDEO_URL` - Video player page
- `/hls?url=VIDEO_URL&path=index.m3u8` - HLS playlist proxy (`path` may be omitted)
- `/hls?url=VIDEO_URL&path=SEGMENT.ts` - HLS segment proxy
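As a rough illustration of how a client might build request URLs for these routes, the sketch below assumes the query-parameter form that the `/hls` handler in `app.py` reads (`url` and `path`). The helper names `player_url` and `hls_url` are hypothetical and not part of the project.

```python
from urllib.parse import urlencode


def player_url(base: str, video_url: str) -> str:
    """URL of the player page for a given video."""
    return f"{base}/player?{urlencode({'url': video_url})}"


def hls_url(base: str, video_url: str, path: str = "index.m3u8") -> str:
    """URL of a proxied HLS file (playlist by default, or a segment)."""
    return f"{base}/hls?{urlencode({'url': video_url, 'path': path})}"


if __name__ == "__main__":
    base = "http://localhost:5000"
    video = "https://youtu.be/abc123"
    print(player_url(base, video))
    print(hls_url(base, video))
    print(hls_url(base, video, "segment000.ts"))
```

`urlencode` percent-encodes the video URL, so its own `?` and `&` characters cannot be mistaken for parameters of the proxy request.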
## Running with Gunicorn
```bash
gunicorn -w 4 -b 0.0.0.0:5000 app:app
```
## Testing
```bash
pytest tests/test_proxy.py -v
```
## License
MIT
+101
@@ -0,0 +1,101 @@
import logging
import os
from urllib.parse import quote

from flask import Flask, Response, abort, jsonify, render_template, request
from werkzeug.exceptions import HTTPException

import dlp
from utils import get_error_message, is_valid_url

app = Flask(__name__)

# Fall back to INFO when LOG_LEVEL is not a valid logging level name.
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO").upper()
logging.basicConfig(
    level=getattr(logging, LOG_LEVEL, logging.INFO),
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)

PORT = int(os.getenv("PORT", 5000))


@app.route("/")
def index():
    return render_template("index.html")


@app.route("/player")
def player():
    video_url = request.args.get("url")
    if not video_url:
        abort(400, description="Missing url parameter")
    if not is_valid_url(video_url):
        abort(400, description="Invalid or disallowed URL")

    try:
        stream_info = dlp.get_stream_info(video_url)
        # Encode the whole video URL so it survives as one query parameter.
        encoded_url = quote(video_url, safe="")
        proxy_hls_url = f"/hls?url={encoded_url}&path=index.m3u8"
        return render_template(
            "player.html",
            video_url=video_url,
            proxy_hls_url=proxy_hls_url,
            title=stream_info.get("title", "Video"),
            thumbnail=stream_info.get("thumbnail"),
        )
    except Exception as e:
        logger.error(f"Error getting stream info: {e}")
        abort(500, description=str(e))


@app.route("/hls")
def hls_proxy():
    try:
        # Flask has already percent-decoded the query arguments. Decoding
        # them a second time would corrupt URLs that contain a literal "%".
        video_url = request.args.get("url", "")
        if not video_url:
            abort(400, description="Missing url parameter")
        path = request.args.get("path", "")

        if not is_valid_url(video_url):
            abort(400, description="Invalid URL")

        # No path, or a playlist path, means the client wants the playlist.
        if not path or path.endswith(".m3u8"):
            playlist = dlp.get_hls_playlist(video_url)
            return Response(playlist, mimetype="application/vnd.apple.mpegurl")

        segment_data = dlp.get_hls_segment(video_url, path)
        return Response(segment_data, mimetype="video/mp2t")
    except HTTPException:
        # Let abort() responses pass through unchanged.
        raise
    except ValueError as e:
        logger.warning(f"Validation error: {e}")
        abort(400, description=str(e))
    except Exception as e:
        logger.error(f"HLS proxy error: {e}")
        abort(500, description="Error fetching stream")


@app.errorhandler(Exception)
def handle_error(e):
    if isinstance(e, HTTPException):
        return jsonify({"error": get_error_message(e.code), "message": str(e.description)}), e.code
    logger.error(f"Unexpected error: {e}")
    return jsonify({"error": "Internal Server Error", "message": str(e)}), 500


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=PORT)
+137
@@ -0,0 +1,137 @@
import logging
import os
import time
import urllib.request
from typing import Optional

import yt_dlp

logger = logging.getLogger(__name__)

CACHE_TTL = int(os.getenv("CACHE_TTL", 31536000))
SOCKET_TIMEOUT = int(os.getenv("SOCKET_TIMEOUT", 30))

# In-memory cache: video URL -> session data, plus the time it was stored.
_session_cache = {}
_cache_timestamps = {}


def _is_hls_url(url: str) -> bool:
    return "m3u8" in url


def _get_cache_key(video_url: str) -> str:
    return video_url


def _is_cache_expired(video_url: str) -> bool:
    key = _get_cache_key(video_url)
    if key not in _cache_timestamps:
        return True
    return time.time() - _cache_timestamps[key] > CACHE_TTL


def _get_cached_session(video_url: str) -> Optional[dict]:
    key = _get_cache_key(video_url)
    if key in _session_cache and not _is_cache_expired(video_url):
        return _session_cache[key]
    return None


def _set_cached_session(video_url: str, session_data: dict) -> None:
    key = _get_cache_key(video_url)
    _session_cache[key] = session_data
    _cache_timestamps[key] = time.time()


def clear_expired_cache() -> None:
    expired_keys = [key for key in _session_cache if _is_cache_expired(key)]
    for key in expired_keys:
        del _session_cache[key]
        del _cache_timestamps[key]


def _ydl_opts() -> dict:
    return {"quiet": True, "no_warnings": True, "socket_timeout": SOCKET_TIMEOUT}


def _find_hls_url(info: Optional[dict]) -> Optional[str]:
    """Return the URL of the last HLS format yt-dlp reports, if any.

    The yt-dlp info dict has no top-level "hls" key. HLS variants are
    entries in info["formats"] whose "protocol" starts with "m3u8".
    The list is ordered from worst to best quality, so the last match wins.
    """
    if not info:
        return None
    hls_url = None
    for fmt in info.get("formats") or []:
        if str(fmt.get("protocol", "")).startswith("m3u8") and fmt.get("url"):
            hls_url = fmt["url"]
    return hls_url


def get_hls_playlist(video_url: str) -> str:
    cached = _get_cached_session(video_url)
    if cached and "hls_playlist" in cached:
        return cached["hls_playlist"]

    if _is_hls_url(video_url):
        hls_url = video_url
    else:
        with yt_dlp.YoutubeDL(_ydl_opts()) as ydl:
            info = ydl.extract_info(video_url, download=False)
        hls_url = _find_hls_url(info)
        if not hls_url:
            raise ValueError("No HLS stream available for this video")

    with urllib.request.urlopen(hls_url, timeout=SOCKET_TIMEOUT) as response:
        playlist_content = response.read().decode("utf-8")

    session_data = {
        "hls_playlist": playlist_content,
        "hls_url": hls_url,
        "video_url": video_url,
    }
    _set_cached_session(video_url, session_data)
    return playlist_content


def get_hls_segment(video_url: str, segment_name: str) -> bytes:
    cached = _get_cached_session(video_url)
    if not cached or "hls_url" not in cached:
        # Populate the cache so the upstream playlist URL is known.
        get_hls_playlist(video_url)
        cached = _get_cached_session(video_url)

    hls_url = cached["hls_url"]
    # Segments are resolved relative to the directory of the playlist.
    base_url = hls_url.rsplit("/", 1)[0]
    if segment_name.startswith("/"):
        segment_name = segment_name[1:]
    segment_url = f"{base_url}/{segment_name}"

    with urllib.request.urlopen(segment_url, timeout=SOCKET_TIMEOUT) as response:
        return response.read()


def get_stream_info(video_url: str) -> dict:
    cached = _get_cached_session(video_url)
    if cached:
        return cached

    if _is_hls_url(video_url):
        # Direct playlist URLs carry no metadata, so use placeholder values.
        return {
            "title": "Test Video",
            "hls_url": video_url,
            "thumbnail": None,
        }

    with yt_dlp.YoutubeDL(_ydl_opts()) as ydl:
        info = ydl.extract_info(video_url, download=False)
    if not info:
        raise ValueError("Could not extract video info")

    return {
        "title": info.get("title", "Unknown"),
        "hls_url": _find_hls_url(info),
        "thumbnail": info.get("thumbnail"),
    }
+3
@@ -0,0 +1,3 @@
flask>=2.0.0
yt-dlp
gunicorn
+43
@@ -0,0 +1,43 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>yt-dlp Proxy</title>
<link rel="stylesheet" href="https://unpkg.com/mvp.css">
<style>
body {
max-width: 600px;
margin: 0 auto;
padding: 2rem;
}
h1 {
text-align: center;
margin-bottom: 2rem;
}
.form-group {
margin-bottom: 1.5rem;
}
input[type="text"] {
width: 100%;
padding: 0.75rem;
font-size: 1rem;
}
button {
width: 100%;
padding: 0.75rem;
font-size: 1rem;
}
</style>
</head>
<body>
<h1>yt-dlp HLS Proxy</h1>
<form action="/player" method="get">
<div class="form-group">
<label for="url">Video URL:</label>
<input type="text" id="url" name="url" placeholder="https://www.youtube.com/watch?v=..." required>
</div>
<button type="submit">Watch</button>
</form>
</body>
</html>
+54
@@ -0,0 +1,54 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{{ title }} - yt-dlp Proxy</title>
<link rel="stylesheet" href="https://unpkg.com/mvp.css">
<style>
body {
max-width: 900px;
margin: 0 auto;
padding: 1rem;
}
h1 {
margin-bottom: 1rem;
}
.video-container {
width: 100%;
background: #000;
aspect-ratio: 16 / 9;
}
video {
width: 100%;
height: 100%;
}
.back-link {
display: inline-block;
margin-bottom: 1rem;
}
</style>
</head>
<body>
<a href="/" class="back-link">← Back</a>
<h1>{{ title }}</h1>
<div class="video-container">
<video controls>
Your browser does not support HLS.
</video>
</div>
<script src="https://cdn.jsdelivr.net/npm/hls.js@latest"></script>
<script>
const video = document.querySelector('video');
const hlsUrl = {{ proxy_hls_url|tojson }};
if (Hls.isSupported()) {
const hls = new Hls();
hls.loadSource(hlsUrl);
hls.attachMedia(video);
} else if (video.canPlayType('application/vnd.apple.mpegurl')) {
video.src = hlsUrl;
}
</script>
</body>
</html>
+139
@@ -0,0 +1,139 @@
import functools
import http.server
import os
import socketserver
import subprocess
import sys
import threading
import time
import urllib.parse

import pytest
import requests

sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

TEST_VIDEO_DIR = "/tmp/yt-dlp-test-video"
TEST_VIDEO_M3U8 = f"{TEST_VIDEO_DIR}/index.m3u8"
SERVER_PORT = 5002
TEST_HTTP_PORT = 8898


def generate_test_video():
    """Generate a 5-second HLS test stream (1-second segments) with ffmpeg."""
    os.makedirs(TEST_VIDEO_DIR, exist_ok=True)
    cmd = [
        "ffmpeg", "-y", "-f", "lavfi", "-i", "testsrc=duration=5:size=320x240:rate=24",
        "-f", "lavfi", "-i", "sine=frequency=440:duration=5",
        "-c:v", "libx264", "-c:a", "aac", "-strict", "experimental",
        "-hls_time", "1", "-hls_list_size", "0",
        "-hls_segment_filename", f"{TEST_VIDEO_DIR}/segment%03d.ts",
        TEST_VIDEO_M3U8,
    ]
    subprocess.run(cmd, capture_output=True, timeout=60)
    assert os.path.exists(TEST_VIDEO_M3U8), "HLS manifest not generated"
    segments = [f for f in os.listdir(TEST_VIDEO_DIR) if f.endswith(".ts")]
    assert len(segments) > 0, "No segments generated"


class QuietHTTPHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass


class ReusableTCPServer(socketserver.TCPServer):
    allow_reuse_address = True


def serve_test_video():
    # Serve TEST_VIDEO_DIR without changing the process-wide working directory.
    handler = functools.partial(QuietHTTPHandler, directory=TEST_VIDEO_DIR)
    with ReusableTCPServer(("127.0.0.1", TEST_HTTP_PORT), handler) as httpd:
        httpd.serve_forever()


def start_flask_app():
    import app as flask_app
    flask_app.app.run(host="127.0.0.1", port=SERVER_PORT, debug=False, use_reloader=False)


@pytest.fixture(scope="module")
def test_servers():
    print("\nGenerating test video...")
    generate_test_video()

    print(f"Starting HTTP server for test video on port {TEST_HTTP_PORT}...")
    http_thread = threading.Thread(target=serve_test_video, daemon=True)
    http_thread.start()
    time.sleep(1)
    # Poll until the static file server answers.
    for _ in range(10):
        try:
            requests.get(f"http://127.0.0.1:{TEST_HTTP_PORT}/", timeout=1)
            break
        except requests.RequestException:
            time.sleep(0.5)
    print("HTTP server ready")

    print(f"Starting Flask proxy server on port {SERVER_PORT}...")
    flask_thread = threading.Thread(target=start_flask_app, daemon=True)
    flask_thread.start()
    time.sleep(2)
    print("Flask server ready")

    yield
    print("\nCleaning up...")


def test_direct_hls_access(test_servers):
    """Test that we can access the test HLS video directly"""
    response = requests.get(f"http://127.0.0.1:{TEST_HTTP_PORT}/index.m3u8", timeout=5)
    assert response.status_code == 200
    assert "#EXTM3U" in response.text
    print("Direct HLS access: OK")


def test_hls_playlist_proxy(test_servers):
    """Test proxying HLS playlist"""
    video_url = f"http://127.0.0.1:{TEST_HTTP_PORT}/index.m3u8"
    proxy_url = f"http://127.0.0.1:{SERVER_PORT}/hls?url={urllib.parse.quote(video_url, safe='')}"
    response = requests.get(proxy_url, timeout=10)
    assert response.status_code == 200
    assert "#EXTM3U" in response.text
    assert ".ts" in response.text
    print("HLS playlist proxy: OK")


def test_hls_segment_proxy(test_servers):
    """Test proxying HLS segment"""
    video_url = f"http://127.0.0.1:{TEST_HTTP_PORT}/index.m3u8"
    proxy_url = f"http://127.0.0.1:{SERVER_PORT}/hls?url={urllib.parse.quote(video_url, safe='')}&path=segment000.ts"
    response = requests.get(proxy_url, timeout=10)
    assert response.status_code == 200
    assert len(response.content) > 0
    print("HLS segment proxy: OK")


def test_player_page(test_servers):
    """Test player page renders"""
    video_url = f"http://127.0.0.1:{TEST_HTTP_PORT}/index.m3u8"
    player_url = f"http://127.0.0.1:{SERVER_PORT}/player?url={urllib.parse.quote(video_url, safe='')}"
    response = requests.get(player_url, timeout=10)
    assert response.status_code == 200
    assert "video" in response.text.lower()
    print("Player page: OK")


def test_index_page(test_servers):
    """Test index page renders"""
    response = requests.get(f"http://127.0.0.1:{SERVER_PORT}/", timeout=10)
    assert response.status_code == 200
    assert "video" in response.text.lower()
    print("Index page: OK")


if __name__ == "__main__":
    pytest.main([__file__, "-v", "-s"])
+113
@@ -0,0 +1,113 @@
import os
import sys
import time

import pytest

sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from utils import is_valid_url, extract_video_id, sanitize_path, get_error_message
import dlp


class TestURLValidation:
    def test_valid_youtube_url(self):
        assert is_valid_url("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
        assert is_valid_url("https://youtu.be/dQw4w9WgXcQ")

    def test_valid_youtu_be(self):
        assert is_valid_url("https://youtu.be/abc123")

    def test_valid_pornhub_url(self):
        assert is_valid_url("https://www.pornhub.com/view_video.php?viewkey=abc123")

    def test_invalid_url(self):
        assert not is_valid_url("")
        assert not is_valid_url("not-a-url")

    def test_disallowed_domain(self):
        # utils reads VALIDATION_ENABLED at import time, so this assignment
        # only documents the expected setting.
        os.environ["VALIDATION_ENABLED"] = "true"
        assert not is_valid_url("https://evil.com/video")


class TestVideoIDExtraction:
    def test_extract_youtube_id(self):
        assert extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ") == "dQw4w9WgXcQ"
        assert extract_video_id("https://youtu.be/dQw4w9WgXcQ") == "dQw4w9WgXcQ"

    def test_extract_pornhub_id(self):
        result = extract_video_id("https://www.pornhub.com/view_video.php?viewkey=ph123456")
        assert result == "ph123456"

    def test_extract_invalid(self):
        assert extract_video_id("https://example.com/video") == ""


class TestPathSanitization:
    def test_sanitize_normal_path(self):
        assert sanitize_path("path/to/file") == "path/to/file"

    def test_sanitize_prevents_traversal(self):
        assert sanitize_path("../etc/passwd") == "etc/passwd"
        assert sanitize_path("path/../etc/passwd") == "path/etc/passwd"


class TestCacheMechanics:
    def test_cache_basic(self):
        dlp._session_cache.clear()
        dlp._cache_timestamps.clear()
        test_data = {"test": "data"}
        dlp._set_cached_session("http://test.com/video", test_data)
        cached = dlp._get_cached_session("http://test.com/video")
        assert cached == test_data

    def test_cache_expiry(self):
        dlp.CACHE_TTL = 1
        dlp._session_cache.clear()
        dlp._cache_timestamps.clear()
        dlp._set_cached_session("http://test.com/video", {"data": "test"})
        time.sleep(1.1)
        assert dlp._is_cache_expired("http://test.com/video") is True
        dlp.CACHE_TTL = 31536000


class TestErrorMessages:
    def test_get_error_message(self):
        assert "Bad Request" in get_error_message(400)
        assert "Forbidden" in get_error_message(403)
        assert "Not Found" in get_error_message(404)
        assert "Internal Server Error" in get_error_message(500)


class TestFlaskApp:
    def test_index_route(self):
        from app import app
        with app.test_client() as client:
            response = client.get("/")
            assert response.status_code == 200

    def test_player_route_missing_url(self):
        from app import app
        with app.test_client() as client:
            response = client.get("/player")
            assert response.status_code == 400

    def test_player_route_invalid_url(self):
        from app import app
        with app.test_client() as client:
            response = client.get("/player?url=https://evil.com/video")
            assert response.status_code == 400

    def test_hls_proxy_invalid_path(self):
        from app import app
        with app.test_client() as client:
            response = client.get("/hls")
            assert response.status_code == 400


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
+71
@@ -0,0 +1,71 @@
import logging
import os
import re
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

ALLOWED_DOMAINS = os.getenv("ALLOWED_DOMAINS", "youtube.com,youtu.be,pornhub.com,xvideos.com,localhost,127.0.0.1").split(",")
VALIDATION_ENABLED = os.getenv("VALIDATION_ENABLED", "true").lower() == "true"
ALLOW_LOCAL = os.getenv("ALLOW_LOCAL", "true").lower() == "true"


def is_valid_url(url: str) -> bool:
    if not VALIDATION_ENABLED:
        return True
    if not url:
        return False
    try:
        parsed = urlparse(url)
        # hostname drops the port and any "user@" prefix, so a URL such as
        # http://127.0.0.1:80@evil.com is judged by its real host (evil.com).
        domain = (parsed.hostname or "").lower()
        if parsed.scheme not in ("http", "https") or not domain:
            return False
        if domain.startswith("www."):
            domain = domain[4:]
        if ALLOW_LOCAL and domain in ("localhost", "127.0.0.1"):
            return True
        for allowed in ALLOWED_DOMAINS:
            allowed = allowed.strip().lower()
            # Accept the domain itself and any of its subdomains.
            if domain == allowed or domain.endswith(f".{allowed}"):
                return True
        return False
    except Exception as e:
        logger.error(f"URL validation error: {e}")
        return False


def extract_video_id(url: str) -> str:
    """Return the platform-specific video ID, or "" if none is recognized."""
    patterns = [
        # YouTube: watch, short and embed URLs with an 11-character ID.
        r'(?:youtube\.com/watch\?v=|youtu\.be/|youtube\.com/embed/)([a-zA-Z0-9_-]{11})',
        # PornHub: the viewkey query parameter.
        r'pornhub\.com/view_video\.php\?viewkey=([a-zA-Z0-9]+)',
    ]
    for pattern in patterns:
        match = re.search(pattern, url)
        if match:
            return match.group(1)
    return ""


def sanitize_path(path: str) -> str:
    # Remove parent-directory references, then collapse doubled slashes.
    return path.replace("..", "").replace("//", "/").strip("/")


def get_error_message(status_code: int) -> str:
    errors = {
        400: "Bad Request - Invalid URL or parameters",
        403: "Forbidden - Access denied",
        404: "Not Found - Resource not found",
        500: "Internal Server Error",
        502: "Bad Gateway - Upstream error",
        503: "Service Unavailable",
    }
    return errors.get(status_code, "Unknown error")