Add null PyWorker for queue-driven autoscaling

This PyWorker does not forward requests to any model server. POST /reserve
holds the worker busy until the client disconnects or the duration cap
elapses, so users with their own job queue can drive Vast autoscaling
without exposing inbound model traffic on the instance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rob Ballantyne
2026-05-11 16:48:52 +01:00
parent 9bc9ba11c5
commit 18974873e5
4 changed files with 254 additions and 0 deletions