pyworker


Rob Ballantyne 6c2f194b28 Add perf heartbeat to keep null pyworker reporting peak throughput

While a /reserve is held, no requests complete so workload_served stays
at 0 each metrics tick. The autoscaler sees cur_perf=0 against
max_perf=150, concludes the worker can't deliver claimed throughput,
downgrades it, and gets cautious about scaling up — so additional
/reserve requests pile up behind the held one instead of triggering a
new worker.

Add a 1Hz heartbeat coroutine that, while anything is in flight, sets
workload_served back to TARGET_PERF (150) and flags update_pending. The
metrics tick reads 150 and resets to 0; the heartbeat re-pins it before
the next tick. Net effect: the autoscaler sees a saturated worker
delivering at peak rate, which is the signal it needs to scale a new
worker up rather than queue.
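
A minimal sketch of the heartbeat described above, not the code from this commit. FakeMetrics, in_flight, tick(), and the shortened intervals are hypothetical stand-ins chosen for illustration; only TARGET_PERF, workload_served, and update_pending are named in the message.

```python
import asyncio
from typing import List

TARGET_PERF = 150.0  # peak throughput value named in the commit message


class FakeMetrics:
    """Hypothetical stand-in for the backend's metrics state."""

    def __init__(self) -> None:
        self.workload_served = 0.0
        self.update_pending = False
        self.in_flight = 0  # hypothetical count of held /reserve requests

    def tick(self) -> float:
        """Simulate a metrics tick: report the current value, then reset it."""
        reported = self.workload_served
        self.workload_served = 0.0
        self.update_pending = False
        return reported


async def perf_heartbeat(metrics: FakeMetrics, interval: float = 1.0) -> None:
    """Re-pin workload_served to TARGET_PERF while anything is in flight."""
    while True:
        if metrics.in_flight > 0:
            metrics.workload_served = TARGET_PERF
            metrics.update_pending = True
        await asyncio.sleep(interval)


async def demo() -> List[float]:
    """Run a few simulated ticks, first with a held request, then idle."""
    metrics = FakeMetrics()
    metrics.in_flight = 1  # one /reserve is being held
    task = asyncio.create_task(perf_heartbeat(metrics, interval=0.01))
    reports = []
    for _ in range(3):
        await asyncio.sleep(0.05)  # the tick is slower than the heartbeat
        reports.append(metrics.tick())
    metrics.in_flight = 0  # the held request is released
    await asyncio.sleep(0.05)
    metrics.tick()  # drain any value pinned just before the release
    await asyncio.sleep(0.05)
    reports.append(metrics.tick())  # idle worker: nothing re-pins the value
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return reports


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The heartbeat interval must be shorter than the metrics tick interval, otherwise a tick can read the value before it has been re-pinned.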

The heartbeat needs the backend instance, which is only created inside
Worker(...) — stash a reference in a module-level dict between Worker()
and .run() so the lifecycle coroutine can reach it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-12 10:35:18 +01:00

ace

Use PyWorker SDK (#67)

2025-12-15 19:33:03 -08:00

comfyui-json

Use PyWorker SDK (#67)

2025-12-15 19:33:03 -08:00

null

Add perf heartbeat to keep null pyworker reporting peak throughput

2026-05-12 10:35:18 +01:00

openai

Lowered concurrency of vLLM and TGI benchmarks

2025-12-17 11:55:33 -08:00

tgi

Increase TGI benchmark tokens to 500

2026-04-30 14:04:39 -07:00

wan

Use PyWorker SDK (#67)

2025-12-15 19:33:03 -08:00