Commit Graph

204 Commits

Author SHA1 Message Date
Rob Ballantyne b52c654f09 comfyui-json: key readiness off api-wrapper's BACKENDS_READY token
Rather than tailing for "Uvicorn running on", which only confirms the
api-wrapper's own HTTP listener is bound, watch for the api-wrapper's
new structured tokens that reflect actual end-to-end reachability:

  MODEL_LOAD_LOG_MSG  = ["BACKENDS_READY"]
  MODEL_ERROR_LOG_MSGS includes:
    - "BACKENDS_READY_TIMEOUT"   (backends never came up)
    - "BACKEND_UNRECOVERABLE"    (CUDA fault latched on a backend)
    - "Application startup failed" (kept; uvicorn's own ASGI failure)

Closes the race observed on a live test where the pyworker fired
benchmark the moment uvicorn bound, every request inside the
api-wrapper hit Cannot-connect-to-host on ComfyUI, and the SDK
counted the resulting fast 502s as a fast worker (perf=200).

Tokens are emitted by ai-dock/comfyui-api-wrapper#11 and onward;
earlier wrapper versions won't emit BACKENDS_READY so warm-up stalls
indefinitely — pin to a wrapper that includes that change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 09:46:45 +01:00
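
The token handling described in b52c654f09 can be sketched as a small line classifier. This is a minimal illustration, not the SDK's actual matcher: classify_log_line and LogEvent are hypothetical names, and substring matching is assumed. Only the token strings come from the commit message. Because "BACKENDS_READY" is a substring of "BACKENDS_READY_TIMEOUT", the sketch checks error tokens first.

```python
from enum import Enum

# Token lists as quoted in the commit message.
MODEL_LOAD_LOG_MSG = ["BACKENDS_READY"]
MODEL_ERROR_LOG_MSGS = [
    "BACKENDS_READY_TIMEOUT",      # backends never came up
    "BACKEND_UNRECOVERABLE",       # CUDA fault latched on a backend
    "Application startup failed",  # uvicorn's own ASGI failure
]


class LogEvent(Enum):
    """Hypothetical result type for this sketch."""
    READY = "ready"
    ERROR = "error"
    NONE = "none"


def classify_log_line(line: str) -> LogEvent:
    """Map one api-wrapper log line to an event (substring matching assumed).

    Error tokens are tested first: a line containing BACKENDS_READY_TIMEOUT
    also contains BACKENDS_READY, so testing readiness first would report a
    timed-out worker as ready.
    """
    if any(token in line for token in MODEL_ERROR_LOG_MSGS):
        return LogEvent.ERROR
    if any(token in line for token in MODEL_LOAD_LOG_MSG):
        return LogEvent.READY
    return LogEvent.NONE
```

With this ordering, a line such as "Uvicorn running on ..." yields LogEvent.NONE, so the benchmark would not fire merely because the listener is bound.
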
Rob Ballantyne a5bcc3de5e comfyui-json: address PR #85 review
Five issues raised by Copilot's review:

1. _resolve_benchmark_path's docstring/README claim that a set-but-
   broken BENCHMARK_JSON_PATH falls through to the well-known tier,
   but the implementation only handled "file missing". A path
   pointing at a directory or holding malformed JSON dropped
   straight to the SD1.5 fallback without consulting tier 3.
   Replaced with a true tiered try-and-load: walk
   (misc, env, well-known), attempt to load each, and fall through
   to the next on any failure (missing, not a regular file,
   unreadable, invalid JSON). The env-var case still surfaces a
   warning so a typo doesn't fail silently.

2. int(os.getenv("BENCHMARK_TEST_WIDTH", ...)) crashed on non-int
   values. Added _env_int helper that warns + returns default on
   ValueError. Empty string also handled.

3. random.choice([]) on an empty test_prompts.txt raised IndexError.
   _load_prompts now warns + uses a built-in _FALLBACK_PROMPT when
   the file is missing or yields no non-blank lines.

4. README already claimed "missing or unreadable" fall-through; the
   refactor in (1) makes the code match. No README change needed.

5. test_prompts.txt restored verbatim from the pre-rewrite tree
   carried real-person and IP-laden prompts (Pope Francis, Iron Man,
   Luke Skywalker, "Disney socialite"). Used automatically during
   warm-up, they're a reputational/safety-filter risk for the worker.
   Replaced with generic equivalents that exercise the same workload
   characteristics (1 elderly figure on motorcycle, 1 armoured hero
   with axe, etc.).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:25:21 +01:00
Rob Ballantyne cecf0236fa comfyui-json: watch api-wrapper.log for readiness
Switch MODEL_LOG_FILE from /var/log/portal/comfyui.log to
/var/log/portal/api-wrapper.log and MODEL_LOAD_LOG_MSG to "Uvicorn
running on". A live test instance showed the previous setup firing
benchmark on ComfyUI's "To see the GUI go to:" line, which races
api-wrapper.sh: that script runs convert-workflows.sh (which itself
waits for ComfyUI ready and then converts workflows for several
seconds) before launching uvicorn. The benchmark hit a closed port
on :18288 and the SDK's __call_backend has no retry on connection
refused, locking the worker into a permanent error state.

Watching the api-wrapper log instead means the benchmark only fires
after uvicorn is bound and the pyworker_benchmark.json symlink is
already in place — no SDK changes required.

Trim MODEL_ERROR_LOG_MSGS down to "Application startup failed". The
old patterns were ComfyUI-specific (won't appear in api-wrapper.log)
and dangerous: ModelError is fatal, so "Value not in list:" matching
on an api-wrapper-style log would let one malformed client request
kill the worker. CUDA OOM is similarly off-limits (indistinguishable
from a too-greedy client request via substring match; the benchmark-
failure path already catches model-load OOM at boot). Empty
MODEL_INFO_LOG_MSGS — the prior ComfyUI download pattern can never
match this log file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 12:46:17 +01:00
Rob Ballantyne 09917a9c88 Revert "Wait briefly for the well-known benchmark symlink"
This reverts commit 9d7371ddba.
2026-05-07 12:03:19 +01:00
Rob Ballantyne 9d7371ddba Wait briefly for the well-known benchmark symlink
The pyworker and convert-workflows.sh both unblock when ComfyUI is
ready, but conversion takes a few seconds longer — without a wait, the
first benchmark loses the race and silently drops to the SD1.5 fallback.

Wait up to BENCHMARK_WAIT_TIMEOUT (default 30s) for the symlink before
giving up. The wait fires only when we're actually about to use the
well-known tier (env var / misc/ paths short-circuit), only once per
process, and is skipped entirely off the base image (parent directory
absent), so non-base-image deployments don't pay the timeout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:59:30 +01:00
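
The bounded wait described in 9d7371ddba (later reverted by 09917a9c88) could be sketched as follows. The path and the BENCHMARK_WAIT_TIMEOUT default are taken from the surrounding commit messages. The function name, the polling interval and the module-level flag are assumptions, and the sketch does not guard against a non-numeric timeout value.

```python
import os
import time
from pathlib import Path

WELL_KNOWN = Path("/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json")

_waited = False  # hypothetical flag: the wait runs at most once per process


def wait_for_benchmark(path: Path = WELL_KNOWN, poll_interval: float = 0.5) -> bool:
    """Return True once path resolves to an existing file, False on give-up.

    Path.exists() follows symlinks, so a dangling symlink counts as absent.
    """
    global _waited
    if path.exists():
        return True
    # Off the base image the parent directory is absent: skip the wait.
    if _waited or not path.parent.is_dir():
        return False
    _waited = True
    timeout = float(os.getenv("BENCHMARK_WAIT_TIMEOUT", "30"))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if path.exists():
            return True
        time.sleep(poll_interval)
    return path.exists()
```

time.monotonic() is used for the deadline so that wall-clock adjustments during boot cannot lengthen or shorten the wait.
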
Rob Ballantyne 381a39f201 Add well-known fallback path for benchmark.json
Read /opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json when
neither misc/benchmark.json nor $BENCHMARK_JSON_PATH yields a usable
file. The vast.ai ComfyUI base image's convert-workflows.sh maintains
that path as a symlink to the first provisioned workflow, so on that
image the operator does not need to set BENCHMARK_JSON_PATH at all.

A set-but-broken $BENCHMARK_JSON_PATH now warns and falls through to
the well-known path instead of dropping straight to the SD1.5 fallback,
so a typo in the env var doesn't mask an otherwise-working benchmark.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:54:20 +01:00
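
The lookup order introduced in 381a39f201 (misc/benchmark.json, then $BENCHMARK_JSON_PATH, then the well-known path) can be sketched as a try-and-load loop. This is an illustration under stated assumptions, not the worker's code: _try_load and load_benchmark_workflow are hypothetical names, and every load failure is treated as a miss, which is the behaviour the later review fix a5bcc3de5e describes.

```python
import json
import logging
import os
from pathlib import Path
from typing import Any, Optional

log = logging.getLogger(__name__)

WELL_KNOWN_PATH = Path("/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json")


def _try_load(path: Path) -> Optional[Any]:
    """Return parsed JSON, or None if the path is missing, not a regular
    file, unreadable, or not valid JSON."""
    try:
        if not path.is_file():
            return None
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # JSONDecodeError subclasses ValueError
        return None


def load_benchmark_workflow(
    misc_path: Path, well_known: Path = WELL_KNOWN_PATH
) -> Optional[Any]:
    """Walk the tiers in order; None means the caller should use the
    SD1.5 fallback."""
    candidates = [("misc", misc_path)]
    env_value = os.getenv("BENCHMARK_JSON_PATH")
    if env_value:
        candidates.append(("env", Path(env_value)))
    candidates.append(("well-known", well_known))

    for tier, path in candidates:
        workflow = _try_load(path)
        if workflow is not None:
            return workflow
        if tier == "env":
            # Surface the problem so a typo does not fail silently.
            log.warning("BENCHMARK_JSON_PATH=%s is not usable; trying next tier", path)
    return None
```

Only the env-var tier logs a warning on failure, since an absent misc/ or well-known file is an expected configuration rather than an operator mistake.
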
Rob Ballantyne a634ba07a6 Support BENCHMARK_JSON_PATH for provisioning-supplied benchmarks
start_server.sh clones pyworker into /workspace/vast-pyworker after the
provisioning phase has run, so a provisioning script that wants to ship
a custom benchmark workflow cannot write to misc/benchmark.json — that
path doesn't exist yet at provisioning time, and pre-creating it would
make the subsequent clone fail.

Allow provisioning to drop the workflow anywhere (e.g. /workspace) and
point the worker at it via the BENCHMARK_JSON_PATH env var. The in-tree
file still takes precedence (so forks with a baked-in benchmark keep
working unchanged); the env var is consulted only as a second choice,
and a misconfigured path logs a warning rather than silently degrading
to the SD1.5 fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:24:14 +01:00
Rob Ballantyne 2dd4f7fc38 Restore benchmark.json loading in comfyui-json worker
The "Use PyWorker SDK" rewrite (4380d98) replaced the dynamic
ComfyWorkflowData.for_test() benchmark logic with a hardcoded list of 11
SD1.5 Text2Image payloads, dropped misc/benchmark.json.example and
misc/test_prompts.txt, and stopped honouring the BENCHMARK_TEST_*
environment variables. The README's documented behaviour (custom
workflow via benchmark.json, env-var-tuned fallback) had no
implementation behind it.

Restore the original two-tier behaviour against the new SDK by passing
BenchmarkConfig(generator=make_benchmark_payload) instead of a static
dataset, splitting the load logic into a custom-workflow path and a
fallback path, and re-shipping the misc/ assets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:06:34 +01:00
Lucas Armand 9bc9ba11c5 Increase TGI benchmark tokens to 500 2026-04-30 14:04:39 -07:00
LucasArmandVast 48fdc65e3d Update to vastai package (#84) 2026-04-14 10:41:31 -07:00
LucasArmandVast 2cd97315cd Add nltk requirement for openai worker (#83)
* Add nltk requirement for openai worker

* pin version
2026-04-13 11:30:06 -07:00
Lucas Armand 83c31e25a9 Add force update detection 2026-03-31 13:46:22 -07:00
Lucas Armand fbe1dca6fa more env_path fixes 2026-03-30 16:28:51 -07:00
Lucas Armand 4c3120dbc5 allow override env_path 2026-03-30 16:25:01 -07:00
Lucas Armand d7d9b915f6 allow break system packages 2026-03-30 16:09:17 -07:00
Lucas Armand 4660b337fb Check for USE_SYSTEM_PYTHON 2026-03-30 14:46:38 -07:00
edgaratvast 7506ecb6b5 directly invoke one stop shop setup executable exported by vastai pip package for deployments (#82) 2026-03-26 10:59:49 -07:00
LucasArmandVast 50633c5003 Update deployments script with retries. (#81) 2026-03-23 14:58:32 -07:00
LucasArmandVast 2e8f18276f Add beta deployments script (#80) 2026-03-23 14:14:06 -07:00
Scott Darden eba9c480eb Merge pull request #79 from vast-ai/update-requirements
Updated requirements to only require vastai-sdk
2026-01-14 12:07:33 -08:00
Lucas Armand aaca1c9645 Updated requirements to only require vastai-sdk 2026-01-14 10:47:07 -08:00
LucasArmandVast f319db6bd5 flag for model log rotate (#78) 2026-01-12 17:03:18 -08:00
LucasArmandVast 4d786b4d17 SDK Versioning Improvements (#77)
* Add SDK_BRANCH
2026-01-02 10:23:07 -08:00
LucasArmandVast bd3e0032a1 Add SDK version checking (#76) 2025-12-17 21:01:52 -08:00
Lucas Armand e02f4bc943 Lowered concurrency of vLLM and TGI benchmarks 2025-12-17 11:55:33 -08:00
Lucas Armand bcb04b9a32 add missing comma 2025-12-17 11:40:40 -08:00
Lucas Armand 9daf171487 Increase queue limits for vLLM and TGI 2025-12-17 11:38:55 -08:00
LucasArmandVast 29f836eb1a Backwards compatible vLLM payload (#75)
* Support old vLLM payloads
2025-12-15 19:58:02 -08:00
LucasArmandVast 4380d98c01 Use PyWorker SDK (#67)
* Change PyWorker to Worker SDK
* Moved /lib to vast-sdk (https://github.com/vast-ai/vast-sdk)
2025-12-15 19:33:03 -08:00
Abiola Akinnubi 2ce741a8b7 Merge pull request #74 from vast-ai/AUTO-912
Mark pyworkers as "Error" if the startup script fails, to avoid a silent failure that waits for the autoscaler.
2025-12-11 17:05:13 -08:00
Abiola Akinnubi 4ecc07032f Mark pyworkers as "Error" if the startup script fails, to avoid a silent failure that waits for the autoscaler. 2025-12-11 12:51:56 -08:00
edgaratvast df61e6e946 correct version pin for aiohttp (#73)
Co-authored-by: Edgar Lin <edgarlin2000@gmail.com>
2025-12-10 19:34:52 -08:00
LucasArmandVast 70f8a8f534 Merge pull request #72 from vast-ai/hotfix-pin-pycares
Hotfix: pin pycares
2025-12-10 20:41:44 -05:00
Lucas Armand 7be8aa6397 pin pycares 2025-12-10 17:38:03 -08:00
Colter-Downing 138fc3ac47 Merge pull request #71 from vast-ai/AUTO-comfyui-updates
Auto comfyui updates
2025-12-04 10:55:12 -08:00
Colter Downing 222ac2a0dd default endpoint name 2025-12-04 10:54:55 -08:00
Colter Downing 40aed9b5f8 adding s3 as an option 2025-12-04 10:52:57 -08:00
Colter Downing d4d36bf86e done with comfy updates 2025-12-03 20:45:55 -08:00
Colter Downing e839cfc6e8 include view in API wrapper 2025-12-03 20:22:45 -08:00
Colter Downing f04138e13b update to be able to get images 2025-12-03 20:16:25 -08:00
Colter-Downing de3aa87c8f Merge pull request #70 from vast-ai/AUTO-tgi-client-edits
update tgi client
2025-12-03 18:40:01 -08:00
Colter Downing 6b5b1341a7 update tgi client 2025-12-03 18:38:42 -08:00
Colter-Downing 8be92c03de Merge pull request #69 from vast-ai/AUTO-874--fix-openai-worker-client
defaults to ENDPOINT_NAME and DEFAULT_MODEL but uses the flag first
2025-12-03 16:59:56 -08:00
Colter Downing adedb8ba90 defaults to ENDPOINT_NAME and DEFAULT_MODEL but uses the flag first if present 2025-12-03 16:57:28 -08:00
LucasArmandVast 2f543c01ad Merge pull request #68 from vast-ai/fix-vllm-concurrency
Increase model wait time for vLLM
2025-12-03 16:13:51 -05:00
Lucas Armand 0bcd2219ea Increase model wait time for vLLM 2025-12-03 12:38:52 -08:00
LucasArmandVast 0339b471c5 Merge pull request #66 from vast-ai/synthesis
PyWorker Error Handling
2025-11-25 16:02:26 -08:00
Lucas Armand e143162438 bump pyworker version 2025-11-25 16:01:23 -08:00
Lucas Armand 7986e51e9e early errors 2025-11-24 15:24:06 -08:00
Lucas Armand 9c6ab78503 Move model log line 2025-11-24 15:22:23 -08:00