7 Commits

Author SHA1 Message Date
Rob Ballantyne cecf0236fa comfyui-json: watch api-wrapper.log for readiness
Switch MODEL_LOG_FILE from /var/log/portal/comfyui.log to
/var/log/portal/api-wrapper.log and MODEL_LOAD_LOG_MSG to "Uvicorn
running on". A live test instance showed the previous setup firing
the benchmark on ComfyUI's "To see the GUI go to:" line, which races
api-wrapper.sh: that script runs convert-workflows.sh (which itself
waits for ComfyUI to be ready and then spends several seconds
converting workflows) before launching uvicorn. The benchmark hit a
closed port on :18288, and because the SDK's __call_backend does not
retry on connection refused, the worker was locked into a permanent
error state.

Watching the api-wrapper log instead means the benchmark only fires
after uvicorn is bound and the pyworker_benchmark.json symlink is
already in place — no SDK changes required.

Trim MODEL_ERROR_LOG_MSGS down to "Application startup failed". The
old patterns were ComfyUI-specific (they won't appear in
api-wrapper.log) and dangerous: ModelError is fatal, so matching
"Value not in list:" on an api-wrapper-style log would let one
malformed client request kill the worker. CUDA OOM is similarly
off-limits: a substring match cannot distinguish it from a too-greedy
client request, and the benchmark-failure path already catches
model-load OOM at boot. Leave MODEL_INFO_LOG_MSGS empty, since the
prior ComfyUI download pattern can never match this log file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:46:17 +01:00
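
The substring-based log matching described in cecf0236fa can be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: `classify_log_line` and `LogEvent` are hypothetical names, treating the `*_MSGS` settings as lists is an assumption, and checking error patterns before the readiness pattern is also an assumption. Only the pattern strings come from the commit message.

```python
from enum import Enum
from typing import Optional, Sequence

# Pattern strings as described in the commit message.
MODEL_LOAD_LOG_MSG = "Uvicorn running on"
MODEL_ERROR_LOG_MSGS = ["Application startup failed"]
MODEL_INFO_LOG_MSGS: list = []  # assumption: an empty list means "no info patterns"


class LogEvent(Enum):
    """Hypothetical event kinds a log watcher might emit."""
    READY = "ready"
    ERROR = "error"
    INFO = "info"


def classify_log_line(
    line: str,
    load_msg: str = MODEL_LOAD_LOG_MSG,
    error_msgs: Sequence[str] = tuple(MODEL_ERROR_LOG_MSGS),
    info_msgs: Sequence[str] = tuple(MODEL_INFO_LOG_MSGS),
) -> Optional[LogEvent]:
    """Classify one log line by plain substring match (hypothetical helper)."""
    # Assumption: fatal patterns are checked first, so they win over readiness.
    if any(msg in line for msg in error_msgs):
        return LogEvent.ERROR
    if load_msg in line:
        return LogEvent.READY
    if any(msg in line for msg in info_msgs):
        return LogEvent.INFO
    return None  # line matches nothing and is ignored
```

With these patterns, ComfyUI's "To see the GUI go to:" line and a client-triggered "Value not in list:" line both classify as `None`, so neither starts the benchmark nor marks the worker as failed.
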
Rob Ballantyne 09917a9c88 Revert "Wait briefly for the well-known benchmark symlink"
This reverts commit 9d7371ddba.
2026-05-07 12:03:19 +01:00
Rob Ballantyne 9d7371ddba Wait briefly for the well-known benchmark symlink
The pyworker and convert-workflows.sh both unblock when ComfyUI is
ready, but conversion takes a few seconds longer — without a wait, the
first benchmark loses the race and silently drops to the SD1.5 fallback.

Wait up to BENCHMARK_WAIT_TIMEOUT (default 30s) for the symlink before
giving up. The wait fires only when we're actually about to use the
well-known tier (the env var and misc/ paths short-circuit before it),
only once per process, and is skipped entirely off the base image
(where the parent directory is absent), so non-base-image deployments
don't pay the timeout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:59:30 +01:00
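
The bounded wait described in 9d7371ddba (later reverted by 09917a9c88) might look roughly like the sketch below. `wait_for_file` and `poll_interval` are hypothetical names, the well-known path is taken from commit 381a39f201, and the once-per-process guard is omitted for brevity.

```python
import os
import time
from typing import Optional

# Path maintained by convert-workflows.sh on the base image (per 381a39f201).
WELL_KNOWN_BENCHMARK = "/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json"


def wait_for_file(
    path: str = WELL_KNOWN_BENCHMARK,
    timeout: Optional[float] = None,
    poll_interval: float = 0.5,
) -> bool:
    """Poll until `path` resolves to a regular file or the timeout expires."""
    if timeout is None:
        # The default of 30 seconds comes from the commit message.
        timeout = float(os.environ.get("BENCHMARK_WAIT_TIMEOUT", "30"))

    # Off the base image the parent directory is absent: skip the wait.
    if not os.path.isdir(os.path.dirname(path)):
        return False

    deadline = time.monotonic() + timeout
    while True:
        # isfile() follows symlinks and is False for a dangling link.
        if os.path.isfile(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller would fall back to the SD1.5 payloads when this returns `False`.
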
Rob Ballantyne 381a39f201 Add well-known fallback path for benchmark.json
Read /opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json when
neither misc/benchmark.json nor $BENCHMARK_JSON_PATH yields a usable
file. The vast.ai ComfyUI base image's convert-workflows.sh maintains
that path as a symlink to the first provisioned workflow, so on that
image the operator does not need to set BENCHMARK_JSON_PATH at all.

A set-but-broken $BENCHMARK_JSON_PATH now warns and falls through to
the well-known path instead of dropping straight to the SD1.5 fallback,
so a typo in the env var doesn't mask an otherwise-working benchmark.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:54:20 +01:00
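
The lookup order described in 381a39f201 (in-tree file, then $BENCHMARK_JSON_PATH, then the well-known path, then the SD1.5 fallback) can be sketched as below. This is an illustration under assumptions, not the worker's code: `resolve_benchmark_workflow` and `_load_json` are hypothetical names, and "usable" is assumed to mean a readable, non-empty JSON object.

```python
import json
import logging
import os
from typing import Mapping, Optional

log = logging.getLogger(__name__)

WELL_KNOWN_PATH = "/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json"


def _load_json(path: str) -> Optional[dict]:
    """Return the parsed workflow, or None if the file is missing or unusable."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):  # missing, unreadable, or invalid JSON
        return None
    # Assumption: a usable workflow is a non-empty JSON object.
    return data if isinstance(data, dict) and data else None


def resolve_benchmark_workflow(
    in_tree_path: str = "misc/benchmark.json",
    env: Optional[Mapping[str, str]] = None,
    well_known_path: str = WELL_KNOWN_PATH,
) -> Optional[dict]:
    """Try each tier in order; None means "use the SD1.5 fallback"."""
    env = os.environ if env is None else env

    # Tier 1: the in-tree file takes precedence.
    workflow = _load_json(in_tree_path)
    if workflow is not None:
        return workflow

    # Tier 2: the env var; a set-but-broken value warns and falls through.
    env_path = env.get("BENCHMARK_JSON_PATH")
    if env_path:
        workflow = _load_json(env_path)
        if workflow is not None:
            return workflow
        log.warning("BENCHMARK_JSON_PATH=%s is unusable; trying %s",
                    env_path, well_known_path)

    # Tier 3: the well-known symlink on the base image.
    return _load_json(well_known_path)
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.
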
Rob Ballantyne a634ba07a6 Support BENCHMARK_JSON_PATH for provisioning-supplied benchmarks
start_server.sh clones pyworker into /workspace/vast-pyworker after the
provisioning phase has run, so a provisioning script that wants to ship
a custom benchmark workflow cannot write to misc/benchmark.json — that
path doesn't exist yet at provisioning time, and pre-creating it would
make the subsequent clone fail.

Allow provisioning to drop the workflow anywhere (e.g. /workspace) and
point the worker at it via the BENCHMARK_JSON_PATH env var. The in-tree
file still takes precedence (so forks with a baked-in benchmark keep
working unchanged); the env var is consulted only as a second choice,
and a misconfigured path logs a warning rather than silently degrading
to the SD1.5 fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:24:14 +01:00
Rob Ballantyne 2dd4f7fc38 Restore benchmark.json loading in comfyui-json worker
The "Use PyWorker SDK" rewrite (4380d98) replaced the dynamic
ComfyWorkflowData.for_test() benchmark logic with a hardcoded list of 11
SD1.5 Text2Image payloads, dropped misc/benchmark.json.example and
misc/test_prompts.txt, and stopped honouring the BENCHMARK_TEST_*
environment variables. The README's documented behaviour (custom
workflow via benchmark.json, env-var-tuned fallback) had no
implementation behind it.

Restore the original two-tier behaviour against the new SDK by passing
BenchmarkConfig(generator=make_benchmark_payload) instead of a static
dataset, splitting the load logic into a custom-workflow path and a
fallback path, and re-shipping the misc/ assets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:06:34 +01:00
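
The difference between a static dataset and a payload generator, as described in 2dd4f7fc38, can be illustrated with the sketch below. It deliberately does not import the SDK: `build_payload_generator`, the `fallback_prompts` parameter, and the payload keys `"workflow"` and `"prompt"` are hypothetical, and how the resulting callable is handed to `BenchmarkConfig` is as stated in the commit message rather than shown here.

```python
import copy
import random
from typing import Callable, Optional, Sequence


def build_payload_generator(
    custom_workflow: Optional[dict],
    fallback_prompts: Sequence[str],
    rng: Optional[random.Random] = None,
) -> Callable[[], dict]:
    """Return a zero-argument callable that builds one benchmark payload."""
    rng = rng or random.Random()

    def make_benchmark_payload() -> dict:
        # Custom-workflow path: reuse the loaded benchmark.json content.
        if custom_workflow is not None:
            # Deep copy so a caller mutating one payload cannot affect the next.
            return {"workflow": copy.deepcopy(custom_workflow)}
        # Fallback path: pick a prompt for the built-in workflow.
        # Raises IndexError if fallback_prompts is empty.
        return {"prompt": rng.choice(list(fallback_prompts))}

    return make_benchmark_payload
```

Because the payload is produced on each call, the choice between the custom workflow and the fallback is made when the generator is built, not baked into a fixed list of requests.
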
LucasArmandVast 4380d98c01 Use PyWorker SDK (#67)
* Change PyWorker to Worker SDK
* Move /lib to vast-sdk (https://github.com/vast-ai/vast-sdk)
2025-12-15 19:33:03 -08:00