Use PyWorker SDK (#67)

* Change PyWorker to Worker SDK
* Moved /lib to vast-sdk (https://github.com/vast-ai/vast-sdk)
2025-12-15 22:33:03 -05:00
parent 2ce741a8b7
commit 4380d98c01
54 changed files with 1622 additions and 4626 deletions
@@ -0,0 +1,168 @@
+# ComfyUI ACE Step PyWorker
+
+This is the PyWorker implementation for running **ACE Step v1 3.5B** text-to-music workflows in ComfyUI. It provides a unified interface for executing complete ComfyUI audio-generation workflows through a proxy-based architecture and returning generated audio assets.
+
+Each request has a static cost of `1000`. ComfyUI does not support concurrent workloads, and there is no provision to run multiple ComfyUI instances per worker node, so each worker handles one request at a time.
+
+## Requirements
+
+This worker requires the following components:
+
+- ComfyUI (https://github.com/comfyanonymous/ComfyUI)
+- ComfyUI API Wrapper (https://github.com/ai-dock/comfyui-api-wrapper)
+- ACE Step v1 3.5B model and required custom nodes
+
+A Docker image is provided with the ACE Step model pre-installed, but any image may be used if the above requirements are met.
+
+## Endpoint
+
+The worker exposes a single synchronous endpoint:
+
+- `/generate/sync`: Processes a complete ComfyUI workflow JSON and generates audio output
+
+## Request Format
+
+The ACE Step worker **only supports custom workflow mode**. Modifier-based workflows are not supported.
+
+```json
+{
+  "input": {
+    "request_id": "uuid-string",
+    "workflow_json": {
+      // Complete ComfyUI ACE Step workflow JSON
+    },
+    "s3": { },
+    "webhook": { }
+  }
+}
+```
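
A minimal sketch of assembling the request envelope shown above in Python. The helper name `build_request` and the choice to generate a UUID when no `request_id` is given are assumptions for illustration; they are not part of the worker. How the body is then sent to `/generate/sync` (base URL, authentication) depends on the deployment and is not covered here.

```python
import json
import uuid


def build_request(workflow, s3=None, webhook=None, request_id=None):
    """Wrap a ComfyUI workflow graph in the {"input": {...}} envelope.

    Hypothetical helper: only the field names come from the README.
    """
    payload = {
        # request_id is optional; generating one client-side is an assumption.
        "request_id": request_id or str(uuid.uuid4()),
        "workflow_json": workflow,
    }
    # Optional sections are included only when the caller supplies them.
    if s3:
        payload["s3"] = s3
    if webhook:
        payload["webhook"] = webhook
    return {"input": payload}


if __name__ == "__main__":
    workflow = {
        "40": {
            "inputs": {"ckpt_name": "ace_step_v1_3.5b.safetensors"},
            "class_type": "CheckpointLoaderSimple",
        }
    }
    print(json.dumps(build_request(workflow), indent=2))
```

Leaving `s3` and `webhook` out of the body when they are not supplied keeps the payload free of empty objects and lets any environment-level configuration apply.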
+
+## Request Fields
+
+### Required Fields
+
+- `input`: Container for all request parameters
+- `input.workflow_json`: Complete ComfyUI workflow graph for ACE Step audio generation
+
+### Optional Fields
+
+- `input.request_id`: Client-defined request identifier
+- `input.s3`: S3-compatible storage configuration
+- `input.webhook`: Webhook configuration for completion notifications
+
+The special string `"__RANDOM_INT__"` may be used in the workflow JSON and will be replaced with a random integer before submission to ComfyUI.
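
The placeholder substitution described above could be sketched as follows. This is not the worker's actual code: the function name, the integer range, exact-match-only replacement, and drawing a separate value for each occurrence are all assumptions made for illustration.

```python
import random

PLACEHOLDER = "__RANDOM_INT__"


def replace_random_ints(node, rng=random):
    """Return a copy of `node` with each placeholder string replaced by an int.

    Hypothetical sketch: walks dicts and lists recursively and leaves every
    other value untouched. The 32-bit range is an assumption.
    """
    if isinstance(node, dict):
        return {key: replace_random_ints(value, rng) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_random_ints(value, rng) for value in node]
    if isinstance(node, str) and node == PLACEHOLDER:
        return rng.randint(0, 2**32 - 1)
    return node


if __name__ == "__main__":
    workflow = {"3": {"inputs": {"seed": "__RANDOM_INT__", "steps": 50}}}
    print(replace_random_ints(workflow))
```

A typical use is a sampler `seed` input, so that resubmitting the same workflow JSON yields a different result each time.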
+
+## S3 Configuration
+
+Generated audio assets can be automatically uploaded to S3-compatible storage. Configuration can be supplied per request or via environment variables. Request-level values take precedence.
+
+### Via Request JSON
+
+```json
+"s3": {
+  "access_key_id": "your-s3-access-key",
+  "secret_access_key": "your-s3-secret-access-key",
+  "endpoint_url": "https://s3.amazonaws.com",
+  "bucket_name": "your-bucket",
+  "region": "us-east-1"
+}
+```
+
+### Via Environment Variables
+
+```bash
+S3_ACCESS_KEY_ID=your-key
+S3_SECRET_ACCESS_KEY=your-secret
+S3_BUCKET_NAME=your-bucket
+S3_ENDPOINT_URL=https://s3.amazonaws.com
+S3_REGION=us-east-1
+```
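
One way to picture "request-level values take precedence" is the sketch below. The field and variable names are taken from the two snippets above; the function name and the per-field fallback (rather than treating the request's `s3` object as an all-or-nothing override) are assumptions, and the worker may resolve the configuration differently.

```python
import os

# Request field -> environment variable, as listed in the snippets above.
S3_FIELDS = {
    "access_key_id": "S3_ACCESS_KEY_ID",
    "secret_access_key": "S3_SECRET_ACCESS_KEY",
    "endpoint_url": "S3_ENDPOINT_URL",
    "bucket_name": "S3_BUCKET_NAME",
    "region": "S3_REGION",
}


def resolve_s3_config(request_s3=None, env=None):
    """Merge S3 settings, preferring request values over environment values.

    Hypothetical helper: fields missing from both sources are omitted.
    """
    env = os.environ if env is None else env
    request_s3 = request_s3 or {}
    resolved = {}
    for field, variable in S3_FIELDS.items():
        value = request_s3.get(field) or env.get(variable)
        if value:
            resolved[field] = value
    return resolved


if __name__ == "__main__":
    env = {"S3_BUCKET_NAME": "env-bucket", "S3_REGION": "us-east-1"}
    print(resolve_s3_config({"bucket_name": "request-bucket"}, env))
```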
+
+## Webhook Configuration
+
+Webhooks are triggered on request completion or failure.
+
+### Via Request JSON
+
+```json
+"webhook": {
+  "url": "https://your-webhook-url",
+  "extra_params": {
+    "custom_field": "value"
+  }
+}
+```
+
+### Via Environment Variables
+
+```bash
+WEBHOOK_URL=https://your-webhook-url
+WEBHOOK_TIMEOUT=30
+```
+
+## Example Request
+
+### ACE Step Text-to-Music Workflow
+
+```json
+{
+  "input": {
+    "workflow_json": {
+      "14": {
+        "inputs": {
+          "tags": "funk, pop, upbeat, 105 BPM",
+          "lyrics": "Turn it up and let it flow",
+          "lyrics_strength": 0.99,
+          "clip": ["40", 1]
+        },
+        "class_type": "TextEncodeAceStepAudio"
+      },
+      "17": {
+        "inputs": {
+          "seconds": 180,
+          "batch_size": 1
+        },
+        "class_type": "EmptyAceStepLatentAudio"
+      },
+      "40": {
+        "inputs": {
+          "ckpt_name": "ace_step_v1_3.5b.safetensors"
+        },
+        "class_type": "CheckpointLoaderSimple"
+      }
+    }
+  }
+}
+```
+
+## Response Format
+
+A successful response includes execution metadata, ComfyUI output details, and generated audio assets.
+
+### Response Fields
+
+- `id`: Unique request identifier
+- `status`: `completed`, `failed`, `processing`, `generating`, or `queued`
+- `message`: Human-readable status message
+- `comfyui_response`: Raw response from ComfyUI, including execution status and progress
+- `output`: Array of generated outputs
+- `timings`: Timing information for the request
+
+### Output Object
+
+Each entry in `output` includes:
+
+- `filename`: Generated file name (e.g., a name ending in `.mp3`)
+- `local_path`: File path on the worker
+- `url`: Pre-signed download URL (if S3 is configured)
+- `type`: Output type (`output`)
+- `subfolder`: Output directory (e.g., `audio`)
+- `node_id`: ComfyUI node that produced the output
+- `output_type`: Output category (e.g., `audio`)
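
A client might read these fields along the lines of the sketch below. The function name and the sample response are hypothetical; only the field names come from the lists above. Falling back to `local_path` when `url` is absent is an assumption about what a client would want when S3 is not configured.

```python
def collect_audio(response):
    """Return (filename, location) pairs from a completed response.

    Hypothetical helper: prefers the pre-signed `url` and falls back to
    `local_path` when no URL is present.
    """
    if response.get("status") != "completed":
        raise RuntimeError(
            f"request not completed: {response.get('status')!r} "
            f"({response.get('message')})"
        )
    results = []
    for item in response.get("output", []):
        location = item.get("url") or item.get("local_path")
        results.append((item.get("filename"), location))
    return results


if __name__ == "__main__":
    # Made-up response used only to exercise the helper.
    sample = {
        "id": "example-id",
        "status": "completed",
        "output": [
            {
                "filename": "track.mp3",
                "local_path": "/tmp/track.mp3",
                "subfolder": "audio",
                "output_type": "audio",
            }
        ],
    }
    print(collect_audio(sample))
```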
+
+## Notes and Limitations
+
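+- Only full ComfyUI workflow JSONs are supported; modifier-based workflows are rejected
+- Each worker processes one request at a time; concurrent requests are not supported
+- The ACE Step model must be installed before the worker can process requests
+- The length of the generated audio and the generation runtime both depend on the workflow configuration (for example, the `seconds` input of `EmptyAceStepLatentAudio`)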