pyworker/workers/null
Rob Ballantyne 913e3a8782 Simplify null pyworker code and docs
Pass over all three files to drop verbose expository commentary that
duplicated either the code or the README. Net: -284 lines.

README now reads top-to-bottom in roughly the order someone would need
the info: use case → how it works → endpoint params → API → healthcheck
→ deploy → demo. Endpoint params table uses the values actually tested
on alpha (min_load=0, target_util=1, max_queue_time=1,
target_queue_time=0.5, inactivity_timeout=10). Dropped the
"known autoscaler quirk" section now that alpha addresses it; kept the
--session-cost flag as a debugging knob.

worker.py and client.py keep the same behavior but trim long block
comments and multi-line docstrings the code didn't need.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 11:50:03 +01:00

Null PyWorker

Holds Vast Serverless reservations open without forwarding any work to a model. Use it when your real workload (a queue consumer in any language) runs as a separate process on the instance and you just want to drive Vast autoscaling: one POST reserves a worker, one POST releases it.

Use case

You have a job queue on your own infrastructure (Redis, SQS, NATS, etc.) and a consumer that pulls from it, written in anything: Node.js, Go, Python, or a compiled binary. You want one Vast worker per unit of in-flight work, scaling elastically from zero. The null PyWorker is the autoscaling driver; your consumer does the work.

How it works

Reservations use the framework's session API. The SDK's endpoint.session(...) POSTs /session/create to reserve a worker; session.close() POSTs /session/end to release it. max_sessions=1 means each worker holds exactly one reservation — the next reservation either lands on a free worker or triggers a scale-up.
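The placement rule above can be illustrated with a toy model. This is not the Vast autoscaler or the SDK: the ToyPool class and its method names are hypothetical, and the sketch only mirrors the stated rule that, with max_sessions=1, a reservation either takes a free worker or forces a new one.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    """Toy stand-in for an endpoint whose workers each hold max_sessions=1."""
    busy: set = field(default_factory=set)    # worker ids holding a reservation
    free: list = field(default_factory=list)  # idle worker ids
    next_id: int = 0                          # id given to the next new worker

    def reserve(self) -> int:
        """Take a free worker if one exists, otherwise 'scale up' a new one."""
        if self.free:
            worker = self.free.pop()
        else:
            worker = self.next_id
            self.next_id += 1
        self.busy.add(worker)
        return worker

    def release(self, worker: int) -> None:
        """End the reservation; the worker becomes free for the next one."""
        self.busy.remove(worker)
        self.free.append(worker)


pool = ToyPool()
a = pool.reserve()   # no workers yet, so this scales up
b = pool.reserve()   # the first worker is occupied, so this scales up again
pool.release(a)
c = pool.reserve()   # lands on the worker that was just freed
print(a, b, c, pool.next_id)
```

In the real system the scale-up is decided by the autoscaler from reported load, so the timing differs; only the one-reservation-per-worker bookkeeping is modelled here.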

The PyWorker itself does nothing functional:

  • One trivial /ping route to satisfy the framework's benchmark requirement (its max_perf is pinned to 100).
  • An internal /release endpoint on 127.0.0.1:18999 for the local consumer to end the session without needing session_auth.

Endpoint parameters

Tested working configuration:

| Parameter | Value | Why |
| --- | --- | --- |
| target_util | 1.0 | One session = one worker. Default 0.9 rounds up to an extra worker. |
| min_load | 0 | Scale-to-zero floor. |
| max_queue_time | 1 | Stop routing to an occupied worker after ~1s of implied queue. |
| target_queue_time | 0.5 | Trigger scale-up promptly once anything queues. |
| inactivity_timeout | 10 (seconds) | Permit scale-to-zero after 10s idle. |
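
The target_util note can be checked with simple arithmetic. The sketch below assumes a simplified sizing rule, workers = ceil(load / (max_perf * target_util)); that rule is an illustration, not the documented autoscaler formula. The session cost of 100 and max_perf of 100 are the values stated in this README.

```python
import math

MAX_PERF = 100       # per-worker perf pinned by the /ping benchmark
SESSION_COST = 100   # default cost reported at session create


def workers_needed(sessions: int, target_util: float) -> int:
    """Assumed rule: enough workers that utilisation stays at or below target_util."""
    load = sessions * SESSION_COST
    return math.ceil(load / (MAX_PERF * target_util))


# Compare the tested setting (1.0) with the default (0.9).
for sessions in (1, 3):
    print(sessions, workers_needed(sessions, 1.0), workers_needed(sessions, 0.9))
```

Under this assumed rule, target_util=1.0 gives exactly one worker per session, while 0.9 always asks for one more than the number of sessions in these cases.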

API

| Route | Where | Use |
| --- | --- | --- |
| POST /session/create | endpoint, signed | Reserve a worker (endpoint.session(...)) |
| POST /session/end | endpoint, signed | Release (session.close()) |
| POST /release | 127.0.0.1:18999, no auth | Local consumer release, no session_auth needed |

Healthcheck

By default the framework healthchecks a stub at 127.0.0.1:18999/health that returns 200. To healthcheck your queue consumer instead, set BACKEND_HEALTH_URL to the absolute URL of its health endpoint, for example BACKEND_HEALTH_URL=http://127.0.0.1:9090/health. If the consumer then dies, the healthcheck fails and the autoscaler sees the worker as broken.
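
If your consumer has no health endpoint yet, one way to add one is a small HTTP server running inside the consumer process. The following is a minimal sketch using only the Python standard library; the handler and function names are hypothetical, and port 9090 simply matches the example URL above. Because it runs in the consumer process, it stops answering when that process dies.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 and everything else with 404."""

    def do_GET(self):
        if self.path == "/health":
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        # Silence per-request logging so healthchecks do not flood stderr.
        pass


def start_health_server(host="127.0.0.1", port=9090):
    """Serve /health on a daemon thread and return the server object."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call start_health_server() once at consumer startup. A stricter variant could return a non-200 status when the consumer loop has stalled, not only when the process has exited.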

Deploying

  1. Point PYWORKER_REPO at this repo (or your fork).
  2. Set BACKEND=null in the template.
  3. Run your queue consumer alongside the PyWorker. When it's done with a unit of work:
    curl -X POST http://127.0.0.1:18999/release
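
For a consumer written in Python, the curl call in step 3 could be made in-process instead. This is a sketch with hypothetical function names; it assumes the consumer sees the same NULL_CONTROL_PORT environment variable as the PyWorker, and it makes no assumption about the response body, returning only the status code.

```python
import os
import urllib.request


def release_url() -> str:
    """Build the local release URL, honouring NULL_CONTROL_PORT if set."""
    port = os.environ.get("NULL_CONTROL_PORT", "18999")
    return f"http://127.0.0.1:{port}/release"


def release_worker(timeout: float = 5.0) -> int:
    """POST to the local control server and return the HTTP status code.

    urllib raises urllib.error.HTTPError for non-2xx responses and
    urllib.error.URLError if the control server is unreachable.
    """
    req = urllib.request.Request(release_url(), data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Call release_worker() after each unit of work finishes, wrapped in whatever retry or logging policy your consumer already uses.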
    

Client demo

# Single reservation
python -m workers.null.client --endpoint <NAME> --instance alpha

# Staggered three-session trapezoid
python -m workers.null.client --endpoint <NAME> --instance alpha --demo

Flags: --duration (single-reservation mode), --interval and --plateau (demo timing), --session-cost (overrides the cost reported at session create; defaults to 100, matching max_perf), --instance (prod | alpha | candidate | local).

Environment variables

  • BACKEND_HEALTH_URL — absolute URL the framework healthchecks. Stub is used when unset.
  • NULL_CONTROL_PORT — internal control server port. Defaults to 18999.