# ComfyUI ACE Step PyWorker

This is the PyWorker implementation for running **ACE Step v1 3.5B** text-to-music workflows in ComfyUI. It provides a unified interface for executing complete ComfyUI audio-generation workflows through a proxy-based architecture and returning generated audio assets.

Each request has a static cost of `100`. ComfyUI does not support concurrent workloads, and there is no provision to run multiple ComfyUI instances per worker node.

## Requirements

This worker requires the following components:

- ComfyUI (https://github.com/comfyanonymous/ComfyUI)
- ComfyUI API Wrapper (https://github.com/ai-dock/comfyui-api-wrapper)
- ACE Step v1 3.5B model and required custom nodes

A Docker image is provided with the ACE Step model pre-installed, but any image may be used if the above requirements are met.

## Endpoint

The worker exposes a single synchronous endpoint:

- `/generate/sync`: Processes a complete ComfyUI workflow JSON and generates audio output

## Request Format

The ACE Step worker **only supports custom workflow mode**. Modifier-based workflows are not supported.

```json
{
  "input": {
    "request_id": "uuid-string",
    "workflow_json": {
      // Complete ComfyUI ACE Step workflow JSON
    },
    "s3": { },
    "webhook": { }
  }
}
```

## Request Fields

### Required Fields

- `input`: Container for all request parameters
- `input.workflow_json`: Complete ComfyUI workflow graph for ACE Step audio generation

### Optional Fields

- `input.request_id`: Client-defined request identifier
- `input.s3`: S3-compatible storage configuration
- `input.webhook`: Webhook configuration for completion notifications

The special string `"__RANDOM_INT__"` may be used in the workflow JSON and will be replaced with a random integer before submission to ComfyUI.

## S3 Configuration

Generated audio assets can be automatically uploaded to S3-compatible storage. Configuration can be supplied per request or via environment variables. Request-level values take precedence.

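The precedence rule can be illustrated with a minimal Python sketch. It is not the worker's actual code: `resolve_s3_config` is a hypothetical helper, and it assumes the override is applied field by field (this README does not say whether a partial `s3` block is merged with the environment or replaces it entirely). The key and variable names are the ones listed in the subsections of this section.

```python
import os

# Request-level keys mapped to the environment variables named in this README.
S3_ENV_VARS = {
    "access_key_id": "S3_ACCESS_KEY_ID",
    "secret_access_key": "S3_SECRET_ACCESS_KEY",
    "endpoint_url": "S3_ENDPOINT_URL",
    "bucket_name": "S3_BUCKET_NAME",
    "region": "S3_REGION",
}


def resolve_s3_config(request_s3=None, env=None):
    """Hypothetical helper: request values win, environment fills the gaps.

    Assumption: merging happens per field, which the README does not confirm.
    """
    env = os.environ if env is None else env
    request_s3 = request_s3 or {}
    resolved = {}
    for key, env_name in S3_ENV_VARS.items():
        # Prefer the per-request value; fall back to the environment variable.
        value = request_s3.get(key) or env.get(env_name)
        if value:
            resolved[key] = value
    return resolved
```

Under these assumptions, a request that only sets `bucket_name` would still pick up the region and credentials from the environment.
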
### Via Request JSON

```json
"s3": {
  "access_key_id": "your-s3-access-key",
  "secret_access_key": "your-s3-secret-access-key",
  "endpoint_url": "https://s3.amazonaws.com",
  "bucket_name": "your-bucket",
  "region": "us-east-1"
}
```

### Via Environment Variables

```bash
S3_ACCESS_KEY_ID=your-key
S3_SECRET_ACCESS_KEY=your-secret
S3_BUCKET_NAME=your-bucket
S3_ENDPOINT_URL=https://s3.amazonaws.com
S3_REGION=us-east-1
```

## Webhook Configuration

Webhooks are triggered on request completion or failure.

### Via Request JSON

```json
"webhook": {
  "url": "https://your-webhook-url",
  "extra_params": {
    "custom_field": "value"
  }
}
```

### Via Environment Variables

```bash
WEBHOOK_URL=https://your-webhook-url
WEBHOOK_TIMEOUT=30
```

## Example Request

### ACE Step Text-to-Music Workflow

```json
{
  "input": {
    "workflow_json": {
      "14": {
        "inputs": {
          "tags": "funk, pop, upbeat, 105 BPM",
          "lyrics": "Turn it up and let it flow",
          "lyrics_strength": 0.99,
          "clip": ["40", 1]
        },
        "class_type": "TextEncodeAceStepAudio"
      },
      "17": {
        "inputs": {
          "seconds": 180,
          "batch_size": 1
        },
        "class_type": "EmptyAceStepLatentAudio"
      },
      "40": {
        "inputs": {
          "ckpt_name": "ace_step_v1_3.5b.safetensors"
        },
        "class_type": "CheckpointLoaderSimple"
      }
    }
  }
}
```

## Response Format

A successful response includes execution metadata, ComfyUI output details, and generated audio assets.

### Response Fields

- `id`: Unique request identifier
- `status`: `completed`, `failed`, `processing`, `generating`, or `queued`
- `message`: Human-readable status message
- `comfyui_response`: Raw response from ComfyUI, including execution status and progress
- `output`: Array of generated outputs
- `timings`: Timing information for the request

### Output Object

Each entry in `output` includes:

- `filename`: Generated file name (e.g., `.mp3`)
- `local_path`: File path on the worker
- `url`: Pre-signed download URL (if S3 is configured)
- `type`: Output type (`output`)
- `subfolder`: Output directory (e.g., `audio`)
- `node_id`: ComfyUI node that produced the output
- `output_type`: Output category (e.g., `audio`)

## Notes and Limitations

- Only full ComfyUI workflow JSONs are supported
- Concurrent requests are not supported per worker
- ACE Step model must be installed before processing requests
- Audio generation duration and runtime depend on workflow configuration
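
## Client Sketch

To tie the request and response formats together, here is a minimal Python client sketch using only the standard library. The helper names (`build_payload`, `audio_outputs`, `generate_sync`) are hypothetical, as are the base URL and the `timeout` default. Authentication and routing depend on the deployment and are not covered. The sketch also assumes the endpoint returns a JSON body with the response fields listed above.

```python
import json
import urllib.request
import uuid


def build_payload(workflow_json, request_id=None, s3=None, webhook=None):
    """Wrap a complete ComfyUI workflow graph in the `input` envelope."""
    body = {
        "request_id": request_id or str(uuid.uuid4()),
        "workflow_json": workflow_json,
    }
    # Optional blocks are only sent when the caller supplies them.
    if s3:
        body["s3"] = s3
    if webhook:
        body["webhook"] = webhook
    return {"input": body}


def audio_outputs(response):
    """Return (filename, location) pairs for audio entries in `output`.

    The location is the pre-signed `url` when present, else `local_path`.
    """
    if response.get("status") != "completed":
        raise RuntimeError(response.get("message", "request did not complete"))
    results = []
    for item in response.get("output", []):
        if item.get("output_type") == "audio":
            location = item.get("url") or item.get("local_path")
            results.append((item.get("filename"), location))
    return results


def generate_sync(base_url, payload, timeout=600):
    """POST the payload to /generate/sync and decode the JSON response.

    `base_url` and the timeout value are assumptions for illustration.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/generate/sync",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A typical call sequence would be `build_payload(workflow)`, then `generate_sync(base_url, payload)`, then `audio_outputs(response)`. Because the worker handles one request at a time, a client built this way should wait for each response before sending the next request to the same worker.
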