# Null PyWorker

Holds Vast Serverless reservations open without forwarding any work to a
model. Use it when your real workload (a queue consumer in any language)
runs as a separate process on the instance and you just want to drive
Vast autoscaling: **one POST reserves a worker, one POST releases it.**

## Use case

You have a job queue on your own infrastructure (Redis, SQS, NATS, etc.)
and a consumer (Node.js, Go, Python, a compiled binary, anything) that pulls
from it. You want one Vast worker per unit of in-flight work, scaling
elastically from zero. The null PyWorker is the autoscaling driver; your
consumer does the work.

## How it works

Reservations use the framework's session API. The SDK's
`endpoint.session(...)` POSTs `/session/create` to reserve a worker;
`session.close()` POSTs `/session/end` to release it. `max_sessions=1`
means each worker holds exactly one reservation — the next reservation
either lands on a free worker or triggers a scale-up.
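
The placement rule can be pictured with a toy model. The sketch below is not framework code: `ToyPool` and its methods are hypothetical names, and it only mimics the behaviour described above under `max_sessions=1` (reuse a free worker, otherwise add one). Real scale-up is asynchronous and takes time; the model ignores that.

```python
class ToyPool:
    """Toy stand-in for an endpoint whose workers each allow one session."""

    def __init__(self):
        # busy[i] is True while worker i holds a reservation.
        self.busy = []

    def reserve(self):
        """Return (worker_id, scaled_up) for a new reservation."""
        for worker_id, taken in enumerate(self.busy):
            if not taken:
                # A free worker exists, so the reservation lands on it.
                self.busy[worker_id] = True
                return worker_id, False
        # Every worker is occupied: model a scale-up by adding one.
        self.busy.append(True)
        return len(self.busy) - 1, True

    def release(self, worker_id):
        """End the reservation held by worker_id."""
        self.busy[worker_id] = False
```

In this model, two back-to-back reservations on an empty pool each trigger a scale-up, while a reservation made after a release reuses the freed worker.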

The PyWorker itself does nothing functional:

- One trivial `/ping` route to satisfy the framework's benchmark
  requirement (its `max_perf` is pinned to 100).
- An internal `/release` endpoint on `127.0.0.1:18999` for the local
  consumer to end the session without needing `session_auth`.

## Endpoint parameters

Tested working configuration:

| Parameter | Value | Why |
|---|---|---|
| `target_util` | `1.0` | One session = one worker. Default `0.9` rounds up to an extra worker. |
| `min_load` | `0` | Scale-to-zero floor. |
| `max_queue_time` | `1` (seconds) | Stop routing to an occupied worker after ~1s of implied queue. |
| `target_queue_time` | `0.5` (seconds) | Trigger scale-up promptly once anything queues. |
| `inactivity_timeout` | `10` (seconds) | Permit scale-to-zero after 10s idle. |
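
To see why `target_util` matters here, consider a simplified capacity model. This is an assumption for illustration, not the autoscaler's documented formula: it treats each session as `cost` units of load against a per-worker capacity of `max_perf`, and sizes the fleet as total load divided by usable capacity per worker, rounded up. `workers_needed` is a hypothetical helper.

```python
import math


def workers_needed(sessions, target_util, cost=100, max_perf=100):
    """Assumed model: ceil(total load / (per-worker capacity * target_util))."""
    load = sessions * cost
    return math.ceil(load / (max_perf * target_util))


# Compare target_util=1.0 with the default 0.9 for one and three sessions.
for sessions in (1, 3):
    print(sessions, workers_needed(sessions, 1.0), workers_needed(sessions, 0.9))
# prints:
# 1 1 2
# 3 3 4
```

Under this model, a single session at `cost=100` needs exactly one worker with `target_util=1.0`, but two with `0.9`, which matches the extra worker noted in the table.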

## API

| Route | Where | Use |
|---|---|---|
| `POST /session/create` | endpoint, signed | Reserve a worker (`endpoint.session(...)`) |
| `POST /session/end` | endpoint, signed | Release (`session.close()`) |
| `POST /release` | `127.0.0.1:18999`, no auth | Local consumer release, no `session_auth` needed |

## Healthcheck

By default the framework healthchecks a stub on `127.0.0.1:18999/health` that
returns `200`. Set `BACKEND_HEALTH_URL=http://127.0.0.1:9090/health` (absolute
URL) to point the framework at your queue consumer's health endpoint instead.
If the consumer dies, the autoscaler then sees the worker as broken.
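
If your consumer does not already expose a health endpoint, a minimal sketch using only the Python standard library might look like the following. The names `make_health_server` and `is_healthy` are hypothetical, port `9090` is taken from the example URL above, and returning `503` when unhealthy assumes the framework treats any non-`200` response as a failure.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_health_server(is_healthy, host="127.0.0.1", port=9090):
    """Build a server whose GET /health reflects is_healthy()."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            # 200 while the consumer reports healthy, 503 otherwise
            # (assumption: non-200 is read as a failed healthcheck).
            self.send_response(200 if is_healthy() else 503)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, format, *args):
            # Keep healthcheck polling out of the consumer's logs.
            pass

    return ThreadingHTTPServer((host, port), HealthHandler)
```

A consumer could call `make_health_server(lambda: consumer_thread.is_alive()).serve_forever()` in a background thread, where `consumer_thread` stands for whatever liveness signal your consumer has.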

## Deploying

1. Point `PYWORKER_REPO` at this repo (or your fork).
2. Set `BACKEND=null` in the template.
3. Run your queue consumer alongside the PyWorker. When it finishes a unit
   of work, have it release the reservation:
   ```bash
   curl -X POST http://127.0.0.1:18999/release
   ```
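
A Python consumer can make the same call without shelling out to `curl`. The sketch below uses only the standard library; `control_port` and `release_reservation` are hypothetical helper names, and the port is read from `NULL_CONTROL_PORT` with the documented default of `18999`. The response body of `/release` is not described here, so the helper returns only the HTTP status.

```python
import http.client
import os


def control_port(env=None):
    """Resolve the control server port, defaulting to 18999."""
    env = os.environ if env is None else env
    return int(env.get("NULL_CONTROL_PORT", "18999"))


def release_reservation(timeout=5.0, env=None):
    """POST /release to the local control server and return the HTTP status."""
    conn = http.client.HTTPConnection("127.0.0.1", control_port(env), timeout=timeout)
    try:
        # No body and no auth: the endpoint only listens on loopback.
        conn.request("POST", "/release")
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `release_reservation()` after each completed job is probably enough; treating any non-2xx status as a failure worth logging is a reasonable default.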

## Client demo

```bash
# Single reservation, held for the default 180s
python -m workers.null.client --endpoint <NAME> --instance alpha

# Three concurrent reservations, started 30s apart, each held 360s
python -m workers.null.client --endpoint <NAME> --instance alpha --count 3 --hold 360
```

Flags: `--count` (number of concurrent sessions, default 1), `--hold`
(seconds each session is held, default 180), `--interval` (seconds
between starts when `--count > 1`, default 30), `--cost` (cost reported
at session-create, default 100 = `max_perf`), `--instance` (`prod` |
`alpha` | `candidate` | `local`).

## Environment variables

- `BACKEND_HEALTH_URL` — absolute URL the framework healthchecks. Stub
  is used when unset.
- `NULL_CONTROL_PORT` — internal control server port. Defaults to `18999`.