comfyui-json: key readiness off api-wrapper's BACKENDS_READY token

Rather than tailing for "Uvicorn running on", which only confirms the api-wrapper's own HTTP listener is bound, watch for the api-wrapper's new structured tokens that reflect actual end-to-end reachability:

    MODEL_LOAD_LOG_MSG = ["BACKENDS_READY"]

MODEL_ERROR_LOG_MSGS includes:
  - "BACKENDS_READY_TIMEOUT" (backends never came up)
  - "BACKEND_UNRECOVERABLE" (CUDA fault latched on a backend)
  - "Application startup failed" (kept; uvicorn's own ASGI failure)

Closes the race observed on a live test where the pyworker fired benchmark the moment uvicorn bound, every request inside the api-wrapper hit Cannot-connect-to-host on ComfyUI, and the SDK counted the resulting fast 502s as a fast worker (perf=200).

Tokens are emitted by ai-dock/comfyui-api-wrapper#11 and onward; earlier wrapper versions won't emit BACKENDS_READY, so warm-up stalls indefinitely. Pin to a wrapper that includes that change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
comfyui-json: address PR #85 review
2026-05-08 09:46:45 +01:00 · 2026-05-07 18:25:21 +01:00 · 2026-05-07 12:46:17 +01:00 · 2026-05-07 12:03:19 +01:00 · 2026-05-07 11:59:30 +01:00 · 2026-05-07 11:54:20 +01:00
6 changed files with 356 additions and 85 deletions
@@ -1,37 +0,0 @@
-// .devcontainer/devcontainer.json
-// Dev container for the Vast.ai serverless Ollama template.
-// Includes Docker-in-Docker so you can build and test images from inside the container.
-{
-  "name": "vast.ai-serverless-ollama",
-  "image": "mcr.microsoft.com/devcontainers/base:trixie",
-  "features": {
-    "ghcr.io/devcontainers/features/python:1": {
-      "installTools": true,
-      "version": "3.12"
-    },
-    "ghcr.io/devcontainers/features/docker-in-docker:3.0.0": {
-      "moby": false,
-      "version": "latest",
-      "installDockerBuildx": true,
-      "dockerDashComposeVersion": "v2"
-    }
-  },
-  "runArgs": ["--privileged"],
-  "containerEnv": {
-    "DOCKER_BUILDKIT": "1"
-  },
-  "postCreateCommand": "python3 -m pip install --user --upgrade pip && python3 -m pip install --user -r requirements.txt pyyaml",
-  "customizations": {
-    "vscode": {
-      "extensions": [
-        "ms-python.python",
-        "ms-azuretools.vscode-docker"
-      ],
-      "settings": {
-        "python.defaultInterpreterPath": "/usr/bin/python3",
-        "terminal.integrated.defaultProfile.linux": "bash",
-        "docker.showStartPage": false
-      }
-    }
-  }
-}
@@ -104,13 +104,17 @@ Images will be saved locally AND uploaded to `s3://{bucket}/comfyui/{filename}`.

 ### Custom Benchmark Workflows

-You can provide a custom ComfyUI workflow for benchmarking by creating `workers/comfyui-json/misc/benchmark.json`. This allows you to test performance using your preferred models and workflow complexity.
+You can provide a custom ComfyUI workflow for benchmarking. This allows you to test performance using your preferred models and workflow complexity.

-**Ways to provide the benchmark file:**
- Fork this repository and add your `benchmark.json` file
- Write the file during worker provisioning (onstart script or setup phase)
+**Ways to provide the benchmark file** (in resolution order — first match wins):

-An example file is provided in the repository. To ensure varied generations, use the placeholder `__RANDOM_INT__` in place of static seed values - it will be replaced with a random integer for each benchmark run.
+1. **Fork this repository** and commit your workflow to `workers/comfyui-json/misc/benchmark.json`.
+2. **Write the file during provisioning** to a path *outside* the pyworker tree (e.g. `/workspace/benchmark.json`) and export `BENCHMARK_JSON_PATH` so the worker can find it. The pyworker repo is cloned by `start_server.sh` *after* provisioning runs, so provisioning cannot write into `misc/` directly — the destination would be clobbered, or the clone would fail.
+3. **Run on the vast.ai ComfyUI base image.** Its `convert-workflows.sh` maintains `/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json` as a symlink to the first provisioned workflow; the worker reads this automatically when neither of the above is set. No env var required.
+
+If `BENCHMARK_JSON_PATH` is set but points at a missing or unreadable file, the worker logs a warning and falls through to the next tier rather than going straight to the SD1.5 fallback.
+
+An example workflow is provided at `workers/comfyui-json/misc/benchmark.json.example`. To ensure varied generations, use the placeholder `__RANDOM_INT__` in place of static seed values — it will be replaced with a random integer for each benchmark run.

 ### Default Benchmark (Fallback)

@@ -120,9 +124,10 @@ The default benchmark uses Stable Diffusion v1.5 with ComfyUI's standard text-to

 | Environment Variable | Default Value | Description |
 | -------------------- | ------------- | ----------- |
-| BENCHMARK_TEST_WIDTH | 512 | Image width (pixels) |
-| BENCHMARK_TEST_HEIGHT | 512 | Image height (pixels) |
-| BENCHMARK_TEST_STEPS | 20 | Number of denoising steps |
+| BENCHMARK_JSON_PATH | (unset) | Path to a custom workflow file outside the pyworker tree. Consulted only when `misc/benchmark.json` is absent or cannot be loaded. If the variable is set but the file is missing or unreadable, the worker logs a warning and falls through to `/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json`. |
+| BENCHMARK_TEST_WIDTH | 512 | Fallback benchmark: image width (pixels) |
+| BENCHMARK_TEST_HEIGHT | 512 | Fallback benchmark: image height (pixels) |
+| BENCHMARK_TEST_STEPS | 20 | Fallback benchmark: number of denoising steps |

 Each benchmark run uses a random prompt from `misc/test_prompts.txt` and a random seed to ensure consistent GPU load patterns.

@@ -0,0 +1,107 @@
+{
+    "3": {
+        "inputs": {
+            "seed": "__RANDOM_INT__",
+            "steps": 20,
+            "cfg": 8,
+            "sampler_name": "euler",
+            "scheduler": "normal",
+            "denoise": 1,
+            "model": [
+            "4",
+            0
+            ],
+            "positive": [
+            "6",
+            0
+            ],
+            "negative": [
+            "7",
+            0
+            ],
+            "latent_image": [
+            "5",
+            0
+            ]
+        },
+        "class_type": "KSampler",
+        "_meta": {
+            "title": "KSampler"
+        }
+    },
+    "4": {
+        "inputs": {
+            "ckpt_name": "v1-5-pruned-emaonly-fp16.safetensors"
+        },
+        "class_type": "CheckpointLoaderSimple",
+        "_meta": {
+            "title": "Load Checkpoint"
+        }
+    },
+    "5": {
+        "inputs": {
+            "width": 512,
+            "height": 512,
+            "batch_size": 1
+        },
+        "class_type": "EmptyLatentImage",
+        "_meta": {
+            "title": "Empty Latent Image"
+        }
+    },
+    "6": {
+        "inputs": {
+            "text": "beautiful scenery nature glass bottle landscape, , purple galaxy bottle,",
+            "clip": [
+            "4",
+            1
+            ]
+        },
+        "class_type": "CLIPTextEncode",
+        "_meta": {
+            "title": "CLIP Text Encode (Prompt)"
+        }
+    },
+    "7": {
+        "inputs": {
+            "text": "text, watermark",
+            "clip": [
+            "4",
+            1
+            ]
+        },
+        "class_type": "CLIPTextEncode",
+        "_meta": {
+            "title": "CLIP Text Encode (Prompt)"
+        }
+    },
+    "8": {
+        "inputs": {
+            "samples": [
+            "3",
+            0
+            ],
+            "vae": [
+            "4",
+            2
+            ]
+        },
+        "class_type": "VAEDecode",
+        "_meta": {
+            "title": "VAE Decode"
+        }
+    },
+    "9": {
+        "inputs": {
+            "filename_prefix": "ComfyUI",
+            "images": [
+            "8",
+            0
+            ]
+        },
+        "class_type": "SaveImage",
+        "_meta": {
+            "title": "Save Image"
+        }
+    }
+}
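
The example workflow above sets `"seed": "__RANDOM_INT__"`. According to the worker's module docstring, the substitution happens server-side in ai-dock/comfyui-api-wrapper, whose implementation is not part of this diff. The sketch below is therefore only an illustration of what such a substitution could look like. The function name `substitute_random_ints` is hypothetical, and it assumes that only string values exactly equal to the placeholder are replaced.

```python
import random
import sys
from typing import Any

PLACEHOLDER = "__RANDOM_INT__"


def substitute_random_ints(node: Any) -> Any:
    """Return a copy of ``node`` with each placeholder string replaced.

    Hypothetical helper, not the api-wrapper's code. Every occurrence
    gets its own freshly drawn integer; the input is left unmodified.
    """
    if isinstance(node, dict):
        return {key: substitute_random_ints(value) for key, value in node.items()}
    if isinstance(node, list):
        return [substitute_random_ints(value) for value in node]
    if isinstance(node, str) and node == PLACEHOLDER:
        return random.randint(0, sys.maxsize)
    return node


if __name__ == "__main__":
    # Fragment of the KSampler node from the example workflow.
    workflow = {
        "3": {
            "inputs": {"seed": "__RANDOM_INT__", "steps": 20, "model": ["4", 0]},
            "class_type": "KSampler",
        }
    }
    resolved = substitute_random_ints(workflow)
    print(resolved["3"]["inputs"]["seed"])
```

Returning a new structure rather than mutating in place means the same loaded workflow can be reused for the next benchmark run and still contain the placeholder.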
@@ -0,0 +1,34 @@
+cartoon character of a person with a hoodie , in style of cytus and deemo, ork, gold chains, realistic anime cat, dripping black goo, lineage revolution style, thug life, cute anthropomorphic bunny, balrog, arknights, aliased, very buff, black and red and yellow paint, painting illustration collage style, character composition in vector with white background
+stardew valley, fine details
+2D Vector Illustration of a child with soccer ball Art for Sublimation, Design Art, Chrome Art, Painting and Stunning Artwork, Highly Detailed Digital Painting, Airbrush Art, Highly Detailed Digital Artwork, Dramatic Artwork, stained antique yellow copper paint, digital airbrush art, detailed by Mark Brooks, Chicano airbrush art, Swagger! snake Culture
+realistic futuristic city-downtown with short buildings, sunset
+seascape by Ray Collins and artgerm, front view of a perfect wave, sunny background, ultra detailed water
+inspired by realflow-cinema4d editor features, create image of a transparent luxury cup with ice fruits and mint, connected with white, yellow and pink cream, Slow - High Speed MO Photography, YouTube Video Screenshot, Abstract Clay, Transparent Cup , molecular gastronomy, wheel, 3D fluid,Simulation rendering, still video, 4k polymer clay futras photography, very surreal, Houdini Fluid Simulation, hyperrealistic CGI and FLUIDS & MULTIPHYSICS SIMULATION effect, with Somali Stain Lurex, Metallic Jacquard, Gold Thread, Mulberry Silk, Toub Saree, Warm background, a fantastic image worthy of an award.
+biker with backpack on his back riding a motorcycle, Style by Ade Santora, Oilpunk, Cover photo, craig mullins style, on the cover of a magazine, Outdoor Magazine, inspired by Alex Petruk APe, image of a male biker, Cover of an award-winning magazine, the man has a backpack, photo for magazine, with a backpack, magazine cover
+generate a collage-style illustration inspired by the Procreate raster graphic editor, photographic illustration with the theme, 2D vector, art for textile sublimation, containing surrealistic cartoon cat wearing a baseball cap and jeans standing in front of a poster, inspired by Sadao Watanabe, Doraemon, Japanese cartoon style, Eichiro Oda, Iconic high detail character, Director: Nakahara Nantenbō, Kastuhiro Otomo, image detailed, by Miyamoto, Hidetaka Miyazaki, Katsuhiro illustration, 8k, masterpiece, Minimize noise and grain in photo quality without lose quality and increase brightness and lighting,Symmetry and Alignment, Avoid asymmetrical shapes and out-of-focus points. Focus and Sharpness: Make sure the image is focused and sharp and encourages the viewer to see it as a work of art printed on fabric.
+fantasy medieval village world inside a glass sphere , high detail, fantasy, realistic, light effect, hyper detail, volumetric lighting, cinematic, macro, depth of field, blur, red light and clouds from the back, highly detailed epic cinematic concept art cg render made in maya, blender and photoshop, octane render, excellent composition, dynamic dramatic cinematic lighting, aesthetic, very inspirational, world inside a glass sphere by james gurney by artgerm with james jean, joe fenton and tristan eaton by ross tran, fine details
+armored hero with a glowing axe, mecha science_fiction, jungle background, dynamic lighting, detailed shading, digital texture painting, masterpiece, studio quality, 6k
+elderly figure in a leather jacket DJing in a smoky nightclub, mixing live on a giant console, dramatic stage lighting, a masterpiece
+elderly figure in a leather jacket on a motorcycle, magazine cover lighting, a masterpiece
+a young pilot ordering a burger and fries from a futuristic space cantina
+I want to generate a group avatar for a Feishu group chat. The role of this group is daily software technical communication. Now the subject technology stacks that members of this group discuss daily include: algorithms, data structures, optimization, functional programming, and the programming languages often discussed are: TypeScript, Java, python, etc. I hope this avatar has a simple aesthetic, this avatar is a single person avatar
+portrait Anime black girl cute-fine-face, pretty face, realistic shaded Perfect face, fine details. Anime. realistic shaded lighting by Ilya Kuvshinov Giuseppe Dangelico Pino and Michael Garmash and Rob Rey, IAMAG premiere, WLOP matte print, cute freckles, masterpiece
+young woman in modern fashion editorial, beige miniskirt and dark brown turtleneck sweater, soft studio lighting, brown hair, grey eyes, fine details, magazine cover style, a masterpiece
+Cute small cat sitting in a movie theater eating chicken wiggs watching a movie ,unreal engine, cozy indoor lighting, artstation, detailed, digital painting,cinematic,character design by mark ryden and pixar and hayao miyazaki, unreal 5, daz, hyperrealistic, octane render
+Cute small dog sitting in a movie theater eating popcorn watching a movie ,unreal engine, cozy indoor lighting, artstation, detailed, digital painting,cinematic,character design by mark ryden and pixar and hayao miyazaki, unreal 5, daz, hyperrealistic, octane render
+fox bracelet made of buckskin with fox features, rich details, fine carvings, studio lighting
+crane buckskin bracelet with crane features, rich details, fine carvings, studio lighting
+london luxurious interior living-room, light walls
+Parisian luxurious interior penthouse bedroom, dark walls, wooden panels
+cute girl, crop-top, blond hair, black glasses, stretching, with background by greg rutkowski makoto shinkai kyoto animation key art feminine mid shot
+houses in front, houses background, straight houses, digital art, smooth, sharp focus, gravity falls style, doraemon style, shinchan style, anime style
+Simplified technical drawing, Leonardo da Vinci, Mechanical Dinosaur Skeleton, Minimalistic annotations, Hand-drawn illustrations, Basic design and engineering, Wonder and curiosity
+High quality 8K painting impressionist style of a Japanese modern city street with a girl on the foreground wearing a traditional wedding dress with a fox mask, staring at the sky, daylight
+a landscape from the Moon with the Earth setting on the horizon, realistic, detailed
+Isometric Atlantis city,great architecture with columns, great details, ornaments,seaweed, blue ambiance, 3D cartoon style, soft light, 45° view
+A hyper realistic avatar of a guy riding on a black honda cbr 650r in leather suit,high detail, high quality,8K,photo realism
+the street of amedieval fantasy town, at dawn, dark, highly detailed
+overwhelmingly beautiful eagle framed with vector flowers, long shiny wavy flowing hair, polished, ultra detailed vector floral illustration mixed with hyper realism, muted pastel colors, vector floral details in background, muted colors, hyper detailed ultra intricate overwhelming realism in detailed complex scene with magical fantasy atmosphere, no signature, no watermark
+a highly detailed matte painting of a man on a hill watching a rocket launch in the distance by studio ghibli, makoto shinkai, by artgerm, by wlop, by greg rutkowski, volumetric lighting, octane render, 4 k resolution, trending on artstation, masterpiece | hyperrealism| highly detailed| insanely detailed| intricate| cinematic lighting| depth of field
+electronik robot and ofice ,unreal engine, cozy indoor lighting, artstation, detailed, digital painting,cinematic,character design by mark ryden and pixar and hayao miyazaki, unreal 5, daz, hyperrealistic, octane render
+exquisitely intricately detailed illustration, of a small world with a lake and a rainbow, inside a closed glass jar.
@@ -1,60 +1,225 @@
+"""ComfyUI worker for the vast.ai PyWorker SDK.
+
+Each worker runs a benchmark on warm-up. The payload is selected as follows:
+
+  1. If ``misc/benchmark.json`` exists in the cloned worker tree, it is
+     used as a custom ComfyUI workflow. Use this if you fork the repo and
+     bake in your workflow.
+  2. Else, if ``$BENCHMARK_JSON_PATH`` is set and points at a readable
+     file, it is used. Use this from a provisioning script — provisioning
+     runs before pyworker is cloned, so it cannot write into ``misc/``,
+     but it can drop the workflow elsewhere (e.g. ``/workspace/``) and
+     export this env var.
+  3. Else, if the well-known path
+     ``/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json`` exists,
+     it is used. The vast.ai ComfyUI base image's ``convert-workflows.sh``
+     maintains this as a symlink to the first provisioned workflow, so on
+     that image no env var is needed.
+  4. Otherwise an SD1.5 Text2Image fallback runs, parameterised by the
+     ``BENCHMARK_TEST_{WIDTH,HEIGHT,STEPS}`` env vars and a random prompt
+     from ``misc/test_prompts.txt``.
+
+``__RANDOM_INT__`` placeholders in custom workflows are substituted
+server-side by ai-dock/comfyui-api-wrapper, so this worker does not handle
+them itself.
+"""
+
+import json
+import logging
+import os
 import random
 import sys
+from pathlib import Path

 from vastai import Worker, WorkerConfig, HandlerConfig, LogActionConfig, BenchmarkConfig

-# ComyUI model configuration
+# ComfyUI model configuration. The model server is ai-dock's
+# comfyui-api-wrapper sitting in front of ComfyUI itself, not ComfyUI's
+# own port (18188). We tail the api-wrapper's log rather than ComfyUI's
+# and key off the api-wrapper's own structured readiness/fault signals:
+#
+#   BACKENDS_READY            — api-wrapper has confirmed every ComfyUI
+#                               backend passes HTTP+WS probes. Until
+#                               this fires, posting to /generate/sync
+#                               can hit "Cannot connect to host" inside
+#                               the api-wrapper, which the SDK can't
+#                               recover from since __call_backend
+#                               doesn't retry connection-refused.
+#   BACKENDS_READY_TIMEOUT    — backends never reachable within
+#                               api-wrapper's deadline. Worker is
+#                               unrecoverable; mark errored.
+#   BACKEND_UNRECOVERABLE     — CUDA fault / illegal memory access on a
+#                               backend's GPU. Same fate.
+#   Application startup failed — uvicorn's own ASGI lifespan failed.
+#
+# These tokens are emitted by ai-dock/comfyui-api-wrapper >= the
+# "feat/backend-readiness-log-signals" change. Older wrappers won't
+# emit BACKENDS_READY, so warm-up will stall — pin the wrapper version
+# accordingly.
 MODEL_SERVER_URL           = 'http://127.0.0.1'
 MODEL_SERVER_PORT          = 18288
-MODEL_LOG_FILE             = '/var/log/portal/comfyui.log'
+MODEL_LOG_FILE             = '/var/log/portal/api-wrapper.log'
 MODEL_HEALTHCHECK_ENDPOINT = "/health"

-# ComyUI-specific log messages
+# Trigger benchmark only after the full stack (api-wrapper + ComfyUI
+# backends) is reachable. See BACKENDS_READY in the comment above.
 MODEL_LOAD_LOG_MSG = [
-    "To see the GUI go to: "
+    "BACKENDS_READY",
 ]

+# LogAction.ModelError is fatal: the SDK calls backend_errored() and
+# locks the worker into a permanent error state. Patterns must
+# therefore only match conditions where the api-wrapper genuinely
+# cannot serve any request — supervisord restarts on uvicorn exit, so
+# a real failure self-heals rather than dragging the worker down.
+#
+# Notably *not* matched here:
+#   - per-request errors (PreprocessWorker failures, ComfyUI workflow
+#     validation, "Value not in list:") — one malformed client payload
+#     would otherwise kill the worker
+#   - "CUDA out of memory" — surfaces both as a misconfigured GPU
+#     (which the benchmark-failure path already catches via
+#     backend_errored) and as a too-greedy client request, which is
+#     indistinguishable from a substring match
+#   - convert-workflows.sh warnings — that script is not load-bearing
+#     for serving
 MODEL_ERROR_LOG_MSGS = [
-    "MetadataIncompleteBuffer",
-    "Value not in list: ",
-    "[ERROR] Provisioning Script failed"
+    "BACKENDS_READY_TIMEOUT",       # backends never reachable
+    "BACKEND_UNRECOVERABLE",        # CUDA fault latched per backend
+    "Application startup failed",   # uvicorn ASGI lifespan startup failed
 ]

-MODEL_INFO_LOG_MSGS = [
-    '"message":"Downloading'
-]
+# LogAction.Info is purely informational (echoes log lines into the vast
+# console). Nothing in api-wrapper.log is currently worth surfacing —
+# model downloads are upstream in provisioning, per-request logs are
+# too noisy.
+MODEL_INFO_LOG_MSGS = []

-benchmark_prompts = [
-    "Cartoon hoodie hero; orc, anime cat, bunny; black goo; buff; vector on white.",
-    "Cozy farming-game scene with fine details.",
-    "2D vector child with soccer ball; airbrush chrome; swagger; antique copper.",
-    "Realistic futuristic downtown of low buildings at sunset.",
-    "Perfect wave front view; sunny seascape; ultra-detailed water; artful feel.",
-    "Clear cup with ice, fruit, mint; creamy swirls; fluid-sim CGI; warm glow.",
-    "Male biker with backpack on motorcycle; oilpunk; award-worthy magazine cover.",
-    "Collage for textile; surreal cartoon cat in cap/jeans before poster; crisp.",
-    "Medieval village inside glass sphere; volumetric light; macro focus.",
-    "Iron Man with glowing axe; mecha sci-fi; jungle scene; dynamic light.",
-    "Pope Francis DJ in leather jacket, mixing on giant console; dramatic.",
-]
+# Benchmark assets shipped alongside this worker. Resolved relative to this
+# file so the worker keeps working regardless of the launch cwd.
+MISC_DIR       = Path(__file__).parent / "misc"
+BENCHMARK_FILE = MISC_DIR / "benchmark.json"
+TEST_PROMPTS   = MISC_DIR / "test_prompts.txt"
+
+# Well-known location maintained by the vast.ai ComfyUI base image.
+# convert-workflows.sh symlinks this to the first provisioned workflow,
+# letting the base image work out-of-the-box without any env var.
+WELLKNOWN_BENCHMARK = Path("/opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json")
+
+log = logging.getLogger(__name__)
+
+# Used when test_prompts.txt is unreadable or empty. Bare and generic
+# on purpose — this is a benchmark seed, not a creative output.
+_FALLBACK_PROMPT = "a still life on a wooden table, soft daylight"


+def _env_int(name: str, default: int) -> int:
+    """Read an integer env var, warning + falling back on bad values."""
+    raw = os.getenv(name)
+    if raw is None or raw == "":
+        return default
+    try:
+        return int(raw)
+    except ValueError:
+        log.warning("ignoring %s=%r (not an int); using default %d", name, raw, default)
+        return default

-benchmark_dataset = [
-    {
+
+def _try_load_workflow(path: Path) -> dict | None:
+    """Load and return a benchmark workflow from ``path``.
+
+    Returns None on any failure (path missing, not a regular file,
+    unreadable, invalid JSON) so the caller can fall through to the
+    next tier rather than dropping straight to the SD1.5 default.
+    """
+    if not path.is_file():
+        return None
+    try:
+        with open(path) as f:
+            return json.load(f)
+    except (ValueError, OSError) as e:  # ValueError covers JSONDecodeError and UnicodeDecodeError
+        log.warning("Failed to load %s: %s; trying next tier", path, e)
+        return None
+
+
+def _custom_workflow_payload() -> dict | None:
+    """Try each benchmark workflow tier in order; return the first one
+    that loads cleanly as a payload, or None if every tier is absent /
+    unreadable. Tiers (in order): in-tree ``misc/benchmark.json``,
+    ``$BENCHMARK_JSON_PATH``, well-known base-image symlink.
+    """
+    env_path = os.getenv("BENCHMARK_JSON_PATH")
+    candidates = [("misc", BENCHMARK_FILE)]
+    if env_path:
+        candidates.append(("env", Path(env_path)))
+    candidates.append(("well-known", WELLKNOWN_BENCHMARK))
+
+    for label, path in candidates:
+        # Surface a warning specifically when the operator pointed
+        # BENCHMARK_JSON_PATH at something we can't use — silent
+        # fall-through there is a footgun (typo => SD1.5 fallback,
+        # operator wonders why custom benchmark didn't take).
+        if not path.is_file():
+            if label == "env":
+                log.warning(
+                    "BENCHMARK_JSON_PATH=%s is not a readable file; trying fallbacks", path
+                )
+            continue
+        workflow = _try_load_workflow(path)
+        if workflow is None:
+            continue
+        log.info("Using custom benchmark workflow from %s (%s)", path, label)
+        return {
+            "input": {
+                "request_id": f"test-{random.randint(1000, 99999)}",
+                "workflow_json": workflow,
+            }
+        }
+    return None
+
+
+def _load_prompts() -> list[str]:
+    """Read misc/test_prompts.txt; defensive against missing/empty file."""
+    try:
+        with open(TEST_PROMPTS) as f:
+            prompts = [line.strip() for line in f if line.strip()]
+    except OSError as e:
+        log.warning("could not read %s: %s; using built-in fallback prompt", TEST_PROMPTS, e)
+        return [_FALLBACK_PROMPT]
+    if not prompts:
+        log.warning("%s is empty; using built-in fallback prompt", TEST_PROMPTS)
+        return [_FALLBACK_PROMPT]
+    return prompts
+
+
+def _default_payload() -> dict:
+    """Build the SD1.5 Text2Image fallback payload."""
+    prompts = _load_prompts()
+    return {
        "input": {
            "request_id": f"test-{random.randint(1000, 99999)}",
            "modifier": "Text2Image",
            "modifications": {
-                "prompt": prompt,
-                "width": 512,
-                "height": 512,
-                "steps": 20,
-                "seed": random.randint(0, sys.maxsize)
+                "prompt": random.choice(prompts),
+                "width":  _env_int("BENCHMARK_TEST_WIDTH",  512),
+                "height": _env_int("BENCHMARK_TEST_HEIGHT", 512),
+                "steps":  _env_int("BENCHMARK_TEST_STEPS",  20),
+                "seed":   random.randint(0, sys.maxsize),
            }
        }
-    } for prompt in benchmark_prompts
-]
+    }
+
+
+def make_benchmark_payload() -> dict:
+    """Build one benchmark request payload.
+
+    Called once per benchmark run by the SDK; using a generator (rather
+    than a static ``dataset=``) lets each run re-pick a prompt and re-roll
+    the seed, and avoids holding multiple copies of a large workflow JSON
+    in memory.
+    """
+    return _custom_workflow_payload() or _default_payload()
+

 worker_config = WorkerConfig(
    model_server_url=MODEL_SERVER_URL,
@@ -67,7 +232,7 @@ worker_config = WorkerConfig(
            allow_parallel_requests=False,
            max_queue_time=10.0,
            benchmark_config=BenchmarkConfig(
-                dataset=benchmark_dataset,
+                generator=make_benchmark_payload,
            )
        )
    ],
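
One property of the token set in this file is worth illustrating: `BACKENDS_READY` is a prefix of `BACKENDS_READY_TIMEOUT`, so under plain substring matching a timeout line contains both patterns. How the SDK's `LogActionConfig` evaluates patterns is not shown in this diff, so the following is not the SDK's matcher. It is a hypothetical classifier, `classify_line`, that assumes substring matching and checks error patterns first so that a timeout line is never read as a readiness signal. The sample log lines are made up for the demonstration.

```python
from typing import Optional

LOAD_MSGS = ["BACKENDS_READY"]
ERROR_MSGS = [
    "BACKENDS_READY_TIMEOUT",
    "BACKEND_UNRECOVERABLE",
    "Application startup failed",
]


def classify_line(line: str) -> Optional[str]:
    """Return "error", "loaded", or None for a single log line.

    Hypothetical helper: assumes substring matching, with error
    patterns taking precedence over load patterns.
    """
    if any(msg in line for msg in ERROR_MSGS):
        return "error"
    if any(msg in line for msg in LOAD_MSGS):
        return "loaded"
    return None


if __name__ == "__main__":
    # Invented sample lines; the real api-wrapper log format may differ.
    samples = [
        "INFO api-wrapper BACKENDS_READY",
        "ERROR api-wrapper BACKENDS_READY_TIMEOUT",
        "INFO: Uvicorn running on http://0.0.0.0:18288",
    ]
    for sample in samples:
        print(classify_line(sample), "<-", sample)
```

If the real matcher evaluates load patterns before error patterns, the overlap between the two tokens would be worth verifying against a wrapper log that contains a timeout.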
@@ -5,17 +5,14 @@ import os
 from vastai import Worker, WorkerConfig, HandlerConfig, LogActionConfig, BenchmarkConfig

 # vLLM model configuration
-MODEL_SERVER_URL           = 'http://127.0.0.1:11434'
-MODEL_SERVER_PORT          = 11434
-MODEL_LOG_FILE             = '/var/log/onstart.log'
+MODEL_SERVER_URL           = 'http://127.0.0.1'
+MODEL_SERVER_PORT          = 18000
+MODEL_LOG_FILE             = '/var/log/portal/vllm.log'
 MODEL_HEALTHCHECK_ENDPOINT = "/health"

 # vLLM-specific log messages
 MODEL_LOAD_LOG_MSG = [
    "Application startup complete.",
-    "llama runner started in",
-    "Server listening on",
-    "msg=\"Listening on",
 ]

 MODEL_ERROR_LOG_MSGS = [
Author	SHA1	Message	Date
Rob Ballantyne	b52c654f09	comfyui-json: key readiness off api-wrapper's BACKENDS_READY token	2026-05-08 09:46:45 +01:00
Rob Ballantyne	a5bcc3de5e	comfyui-json: address PR #85 review Five issues raised by Copilot's review: 1. _resolve_benchmark_path's docstring/README claim that a set-but- broken BENCHMARK_JSON_PATH falls through to the well-known tier, but the implementation only handled "file missing". A path pointing at a directory or holding malformed JSON dropped straight to the SD1.5 fallback without consulting tier 3. Replaced with a true tiered try-and-load: walk (misc, env, well-known), attempt to load each, and fall through to the next on any failure (missing, not a regular file, unreadable, invalid JSON). The env-var case still surfaces a warning so a typo doesn't fail silently. 2. int(os.getenv("BENCHMARK_TEST_WIDTH", ...)) crashed on non-int values. Added _env_int helper that warns + returns default on ValueError. Empty string also handled. 3. random.choice([]) on an empty test_prompts.txt raised IndexError. _load_prompts now warns + uses a built-in _FALLBACK_PROMPT when the file is missing or yields no non-blank lines. 4. README already claimed "missing or unreadable" fall-through; the refactor in (1) makes the code match. No README change needed. 5. test_prompts.txt restored verbatim from the pre-rewrite tree carried real-person and IP-laden prompts (Pope Francis, Iron Man, Luke Skywalker, "Disney socialite"). Used automatically during warm-up they're a reputational/safety-filter risk for the worker. Replaced with generic equivalents that exercise the same workload characteristics (1 elderly figure on motorcycle, 1 armoured hero with axe, etc.). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 18:25:21 +01:00
Rob Ballantyne	cecf0236fa	comfyui-json: watch api-wrapper.log for readiness Switch MODEL_LOG_FILE from /var/log/portal/comfyui.log to /var/log/portal/api-wrapper.log and MODEL_LOAD_LOG_MSG to "Uvicorn running on". A live test instance showed the previous setup firing benchmark on ComfyUI's "To see the GUI go to:" line, which races api-wrapper.sh: that script runs convert-workflows.sh (which itself waits for ComfyUI ready and then converts workflows for several seconds) before launching uvicorn. The benchmark hit a closed port on :18288 and the SDK's __call_backend has no retry on connection refused, locking the worker into a permanent error state. Watching the api-wrapper log instead means the benchmark only fires after uvicorn is bound and the pyworker_benchmark.json symlink is already in place — no SDK changes required. Trim MODEL_ERROR_LOG_MSGS down to "Application startup failed". The old patterns were ComfyUI-specific (won't appear in api-wrapper.log) and dangerous: ModelError is fatal, so "Value not in list:" matching on an api-wrapper-style log would let one malformed client request kill the worker. CUDA OOM is similarly off-limits (indistinguishable from a too-greedy client request via substring match; the benchmark- failure path already catches model-load OOM at boot). Empty MODEL_INFO_LOG_MSGS — the prior ComfyUI download pattern can never match this log file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 12:46:17 +01:00
Rob Ballantyne	09917a9c88	Revert "Wait briefly for the well-known benchmark symlink" This reverts commit `9d7371ddba`.	2026-05-07 12:03:19 +01:00
Rob Ballantyne	9d7371ddba	Wait briefly for the well-known benchmark symlink The pyworker and convert-workflows.sh both unblock when ComfyUI is ready, but conversion takes a few seconds longer — without a wait, the first benchmark loses the race and silently drops to the SD1.5 fallback. Wait up to BENCHMARK_WAIT_TIMEOUT (default 30s) for the symlink before giving up. The wait fires only when we're actually about to use the well-known tier (env var / misc/ paths short-circuit), only once per process, and is skipped entirely off the base image (parent directory absent), so non-base-image deployments don't pay the timeout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 11:59:30 +01:00
Rob Ballantyne	381a39f201	Add well-known fallback path for benchmark.json Read /opt/comfyui-api-wrapper/workflows/pyworker_benchmark.json when neither misc/benchmark.json nor $BENCHMARK_JSON_PATH yields a usable file. The vast.ai ComfyUI base image's convert-workflows.sh maintains that path as a symlink to the first provisioned workflow, so on that image the operator does not need to set BENCHMARK_JSON_PATH at all. A set-but-broken $BENCHMARK_JSON_PATH now warns and falls through to the well-known path instead of dropping straight to the SD1.5 fallback, so a typo in the env var doesn't mask an otherwise-working benchmark. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 11:54:20 +01:00
Rob Ballantyne	a634ba07a6	Support BENCHMARK_JSON_PATH for provisioning-supplied benchmarks start_server.sh clones pyworker into /workspace/vast-pyworker after the provisioning phase has run, so a provisioning script that wants to ship a custom benchmark workflow cannot write to misc/benchmark.json — that path doesn't exist yet at provisioning time, and pre-creating it would make the subsequent clone fail. Allow provisioning to drop the workflow anywhere (e.g. /workspace) and point the worker at it via the BENCHMARK_JSON_PATH env var. The in-tree file still takes precedence (so forks with a baked-in benchmark keep working unchanged); the env var is consulted only as a second choice, and a misconfigured path logs a warning rather than silently degrading to the SD1.5 fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 11:24:14 +01:00
Rob Ballantyne	2dd4f7fc38	Restore benchmark.json loading in comfyui-json worker The "Use PyWorker SDK" rewrite (`4380d98`) replaced the dynamic ComfyWorkflowData.for_test() benchmark logic with a hardcoded list of 11 SD1.5 Text2Image payloads, dropped misc/benchmark.json.example and misc/test_prompts.txt, and stopped honouring the BENCHMARK_TEST_* environment variables. The README's documented behaviour (custom workflow via benchmark.json, env-var-tuned fallback) had no implementation behind it. Restore the original two-tier behaviour against the new SDK by passing BenchmarkConfig(generator=make_benchmark_payload) instead of a static dataset, splitting the load logic into a custom-workflow path and a fallback path, and re-shipping the misc/ assets. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 11:06:34 +01:00
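
The tiered try-and-load behaviour described in the review commit (fall through on a missing path, a directory, or malformed JSON) can be exercised in isolation. The sketch below is a standalone approximation rather than the worker's code: `load_json_or_none` and `first_loadable` are hypothetical names, and the candidate files are created in a temporary directory instead of the real `misc/`, `$BENCHMARK_JSON_PATH`, and well-known locations.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional


def load_json_or_none(path: Path) -> Optional[dict]:
    """Load JSON from ``path``; return None on any failure."""
    if not path.is_file():
        return None  # missing, or not a regular file (e.g. a directory)
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (ValueError, OSError):
        # ValueError covers json.JSONDecodeError and UnicodeDecodeError.
        return None


def first_loadable(candidates: List[Path]) -> Optional[dict]:
    """Return the first candidate that loads cleanly, else None."""
    for path in candidates:
        workflow = load_json_or_none(path)
        if workflow is not None:
            return workflow
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        missing = root / "missing.json"          # never created
        directory = root / "a_directory"         # exists, but is not a file
        directory.mkdir()
        malformed = root / "malformed.json"      # exists, invalid JSON
        malformed.write_text("{not json", encoding="utf-8")
        good = root / "good.json"
        good.write_text(json.dumps({"9": {"class_type": "SaveImage"}}), encoding="utf-8")

        # All three broken tiers are skipped; the last one is returned.
        print(first_loadable([missing, directory, malformed, good]))
```

Treating every failure mode the same way keeps the caller simple: it only needs to distinguish "got a workflow" from "try the next tier".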