Added clients, updated READMEs

2025-12-12 10:41:21 -08:00
parent 6060f8ce0c
commit 4d99c12820
9 changed files with 827 additions and 199 deletions
@@ -0,0 +1,168 @@
+# ComfyUI ACE Step PyWorker
+
+This is the PyWorker implementation for running **ACE Step v1 3.5B** text-to-music workflows in ComfyUI. It provides a unified interface for executing complete ComfyUI audio-generation workflows through a proxy-based architecture and returning generated audio assets.
+
+Each request has a static cost of `100`. ComfyUI does not support concurrent workloads, and there is no provision to run multiple ComfyUI instances per worker node, so each worker processes one request at a time.
+
+## Requirements
+
+This worker requires the following components:
+
+- ComfyUI (https://github.com/comfyanonymous/ComfyUI)
+- ComfyUI API Wrapper (https://github.com/ai-dock/comfyui-api-wrapper)
+- ACE Step v1 3.5B model and required custom nodes
+
+A Docker image is provided with the ACE Step model pre-installed, but any image may be used if the above requirements are met.
+
+## Endpoint
+
+The worker exposes a single synchronous endpoint:
+
+- `/generate/sync`: Processes a complete ComfyUI workflow JSON and generates audio output
+
+## Request Format
+
+The ACE Step worker **only supports custom workflow mode**. Modifier-based workflows are not supported.
+
+```json
+{
+  "input": {
+    "request_id": "uuid-string",
+    "workflow_json": {
+      // Complete ComfyUI ACE Step workflow JSON
+    },
+    "s3": { },
+    "webhook": { }
+  }
+}
+```
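
A minimal sketch of assembling this envelope in Python. `build_request` is a hypothetical helper, not part of the worker or of any SDK; it only mirrors the field names shown above and generates a `request_id` when none is supplied.

```python
import json
import uuid


def build_request(workflow_json, request_id=None, s3=None, webhook=None):
    """Wrap a ComfyUI workflow graph in the request envelope shown above.

    Hypothetical helper for illustration; only the field names come from
    this README.
    """
    if not workflow_json:
        raise ValueError("workflow_json is required")

    body = {
        "request_id": request_id or str(uuid.uuid4()),
        "workflow_json": workflow_json,
    }
    # Optional sections are left out entirely when not provided.
    if s3:
        body["s3"] = s3
    if webhook:
        body["webhook"] = webhook
    return {"input": body}


if __name__ == "__main__":
    demo = build_request(
        {"40": {"class_type": "CheckpointLoaderSimple", "inputs": {}}}
    )
    print(json.dumps(demo, indent=2))
```

Omitting the optional sections, rather than sending empty objects, keeps the payload small and leaves room for the environment-variable configuration to apply.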
+
+## Request Fields
+
+### Required Fields
+
+- `input`: Container for all request parameters
+- `input.workflow_json`: Complete ComfyUI workflow graph for ACE Step audio generation
+
+### Optional Fields
+
+- `input.request_id`: Client-defined request identifier
+- `input.s3`: S3-compatible storage configuration
+- `input.webhook`: Webhook configuration for completion notifications
+
+The special string `"__RANDOM_INT__"` may be used in the workflow JSON and will be replaced with a random integer before submission to ComfyUI.
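
The worker's substitution code is not shown here, so the following is only a sketch of what such a replacement could look like if done on the client side. The function name `replace_random_ints`, the recursive walk, and the 32-bit range are all assumptions made for illustration.

```python
import random

PLACEHOLDER = "__RANDOM_INT__"


def replace_random_ints(node, rng=random):
    """Return a copy of `node` with every placeholder string replaced.

    Assumption: each occurrence gets its own random integer, and the
    range 0..2**32 - 1 is an arbitrary choice for this sketch.
    """
    if isinstance(node, dict):
        return {key: replace_random_ints(value, rng) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_random_ints(item, rng) for item in node]
    if node == PLACEHOLDER:
        return rng.randint(0, 2**32 - 1)
    return node


if __name__ == "__main__":
    workflow = {"52": {"inputs": {"seed": "__RANDOM_INT__", "steps": 65}}}
    print(replace_random_ints(workflow))
```

Passing a seeded `random.Random` instance as `rng` makes the substitution reproducible, which can help when debugging a specific generation.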
+
+## S3 Configuration
+
+Generated audio assets can be automatically uploaded to S3-compatible storage. Configuration can be supplied per request or via environment variables. Request-level values take precedence.
+
+### Via Request JSON
+
+```json
+"s3": {
+  "access_key_id": "your-s3-access-key",
+  "secret_access_key": "your-s3-secret-access-key",
+  "endpoint_url": "https://s3.amazonaws.com",
+  "bucket_name": "your-bucket",
+  "region": "us-east-1"
+}
+```
+
+### Via Environment Variables
+
+```bash
+S3_ACCESS_KEY_ID=your-key
+S3_SECRET_ACCESS_KEY=your-secret
+S3_BUCKET_NAME=your-bucket
+S3_ENDPOINT_URL=https://s3.amazonaws.com
+S3_REGION=us-east-1
+```
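
The precedence rule can be pictured as a simple merge: start from the environment, then let any value supplied in the request override it. This is a sketch under that assumption, not the worker's actual code. `resolve_s3_config` is a hypothetical name, and treating an empty string as "not provided" is a guess.

```python
import os

# Maps request JSON keys to the environment variable names listed above.
S3_ENV_VARS = {
    "access_key_id": "S3_ACCESS_KEY_ID",
    "secret_access_key": "S3_SECRET_ACCESS_KEY",
    "endpoint_url": "S3_ENDPOINT_URL",
    "bucket_name": "S3_BUCKET_NAME",
    "region": "S3_REGION",
}


def resolve_s3_config(request_s3=None, environ=None):
    """Merge S3 settings so that request-level values win over the environment."""
    environ = os.environ if environ is None else environ
    request_s3 = request_s3 or {}
    resolved = {}
    for key, env_name in S3_ENV_VARS.items():
        # Assumption: an empty string in the request counts as "not provided".
        value = request_s3.get(key) or environ.get(env_name)
        if value:
            resolved[key] = value
    return resolved
```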
+
+## Webhook Configuration
+
+Webhooks are triggered on request completion or failure.
+
+### Via Request JSON
+
+```json
+"webhook": {
+  "url": "https://your-webhook-url",
+  "extra_params": {
+    "custom_field": "value"
+  }
+}
+```
+
+### Via Environment Variables
+
+```bash
+WEBHOOK_URL=https://your-webhook-url
+WEBHOOK_TIMEOUT=30
+```
+
+## Example Request
+
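+### ACE Step Text-to-Music Workflow
+
+This example is abbreviated to three nodes for readability. A complete graph also includes sampler, decode, and save nodes.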
+
+```json
+{
+  "input": {
+    "workflow_json": {
+      "14": {
+        "inputs": {
+          "tags": "funk, pop, upbeat, 105 BPM",
+          "lyrics": "Turn it up and let it flow",
+          "lyrics_strength": 0.99,
+          "clip": ["40", 1]
+        },
+        "class_type": "TextEncodeAceStepAudio"
+      },
+      "17": {
+        "inputs": {
+          "seconds": 180,
+          "batch_size": 1
+        },
+        "class_type": "EmptyAceStepLatentAudio"
+      },
+      "40": {
+        "inputs": {
+          "ckpt_name": "ace_step_v1_3.5b.safetensors"
+        },
+        "class_type": "CheckpointLoaderSimple"
+      }
+    }
+  }
+}
+```
+
+## Response Format
+
+A successful response includes execution metadata, ComfyUI output details, and generated audio assets.
+
+### Response Fields
+
+- `id`: Unique request identifier
+- `status`: `completed`, `failed`, `processing`, `generating`, or `queued`
+- `message`: Human-readable status message
+- `comfyui_response`: Raw response from ComfyUI, including execution status and progress
+- `output`: Array of generated outputs
+- `timings`: Timing information for the request
+
+### Output Object
+
+Each entry in `output` includes:
+
+- `filename`: Generated file name (e.g., a file ending in `.mp3`)
+- `local_path`: File path on the worker
+- `url`: Pre-signed download URL (if S3 is configured)
+- `type`: Output type (`output`)
+- `subfolder`: Output directory (e.g., `audio`)
+- `node_id`: ComfyUI node that produced the output
+- `output_type`: Output category (e.g., `audio`)
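
As an illustration of consuming these fields, the sketch below picks one location per output entry, preferring `url` and falling back to `local_path`. `collect_audio_locations` is a hypothetical helper, and the sample response is made up for the example; only the field names come from the lists above.

```python
def collect_audio_locations(response):
    """Return one location per output entry: its `url` if present, else `local_path`."""
    status = response.get("status")
    if status != "completed":
        # `message` is the human-readable status text described above.
        raise RuntimeError(
            f"request not completed: {status!r} ({response.get('message')})"
        )

    locations = []
    for item in response.get("output", []):
        location = item.get("url") or item.get("local_path")
        if location:
            locations.append(location)
    return locations


if __name__ == "__main__":
    # Made-up sample shaped like the documented response.
    sample = {
        "id": "example-id",
        "status": "completed",
        "output": [
            {
                "filename": "example.mp3",
                "local_path": "/tmp/example.mp3",
                "type": "output",
                "subfolder": "audio",
                "output_type": "audio",
            }
        ],
    }
    print(collect_audio_locations(sample))
```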
+
+## Notes and Limitations
+
+- Only full ComfyUI workflow JSONs are supported
+- Concurrent requests are not supported per worker
+- The ACE Step model must be installed before processing requests
+- Output audio length and generation time depend on the workflow configuration
@@ -0,0 +1,149 @@
+import asyncio
+
+from vastai import Serverless
+
+
+async def main():
+    async with Serverless() as client:
+        endpoint = await client.get_endpoint(name="my-ace-endpoint")
+
+        # ACE Step workflow in ComfyUI API-format JSON
+        workflow = {
+          "14": {
+            "inputs": {
+              "tags": "funk, pop, soul, rock, melodic, guitar, drums, bass, keyboard, percussion, 105 BPM, energetic, upbeat, groovy, vibrant, dynamic",
+              "lyrics": "[verse]\nNeon lights they flicker bright\nCity hums in dead of night\nRhythms pulse through concrete veins\nLost in echoes of refrains\n\n[verse]\nBassline groovin in my chest\nHeartbeats match the citys zest\nElectric whispers fill the air\nSynthesized dreams everywhere\n\n[chorus]\nTurn it up and let it flow\nFeel the fire let it grow\nIn this rhythm we belong\nHear the night sing out our song",
+              "lyrics_strength": 0.99,
+              "clip": ["40", 1]
+            },
+            "class_type": "TextEncodeAceStepAudio",
+            "_meta": {
+              "title": "TextEncodeAceStepAudio"
+            }
+          },
+          "17": {
+            "inputs": {
+              "seconds": 180,
+              "batch_size": 1
+            },
+            "class_type": "EmptyAceStepLatentAudio",
+            "_meta": {
+              "title": "EmptyAceStepLatentAudio"
+            }
+          },
+          "18": {
+            "inputs": {
+              "samples": ["52", 0],
+              "vae": ["40", 2]
+            },
+            "class_type": "VAEDecodeAudio",
+            "_meta": {
+              "title": "VAE Decode Audio"
+            }
+          },
+          "40": {
+            "inputs": {
+              "ckpt_name": "ace_step_v1_3.5b.safetensors"
+            },
+            "class_type": "CheckpointLoaderSimple",
+            "_meta": {
+              "title": "Load Checkpoint"
+            }
+          },
+          "44": {
+            "inputs": {
+              "conditioning": ["14", 0]
+            },
+            "class_type": "ConditioningZeroOut",
+            "_meta": {
+              "title": "ConditioningZeroOut"
+            }
+          },
+          "49": {
+            "inputs": {
+              "model": ["51", 0],
+              "operation": ["50", 0]
+            },
+            "class_type": "LatentApplyOperationCFG",
+            "_meta": {
+              "title": "LatentApplyOperationCFG"
+            }
+          },
+          "50": {
+            "inputs": {
+              "multiplier": 1.15
+            },
+            "class_type": "LatentOperationTonemapReinhard",
+            "_meta": {
+              "title": "LatentOperationTonemapReinhard"
+            }
+          },
+          "51": {
+            "inputs": {
+              "shift": 6,
+              "model": ["40", 0]
+            },
+            "class_type": "ModelSamplingSD3",
+            "_meta": {
+              "title": "ModelSamplingSD3"
+            }
+          },
+          "52": {
+            "inputs": {
+              "seed": "__RANDOM_INT__",
+              "steps": 65,
+              "cfg": 4,
+              "sampler_name": "er_sde",
+              "scheduler": "linear_quadratic",
+              "denoise": 1,
+              "model": ["49", 0],
+              "positive": ["14", 0],
+              "negative": ["44", 0],
+              "latent_image": ["17", 0]
+            },
+            "class_type": "KSampler",
+            "_meta": {
+              "title": "KSampler"
+            }
+          },
+          "59": {
+            "inputs": {
+              "filename_prefix": "audio/ComfyUI",
+              "quality": "V0",
+              "audioUI": "",
+              "audio": ["18", 0]
+            },
+            "class_type": "SaveAudioMP3",
+            "_meta": {
+              "title": "Save Audio (MP3)"
+            }
+          }
+        }
+
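+        # request_id, s3, and webhook are optional fields; the empty strings
+        # below are placeholders to fill in.
+        payload = {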
+          "input": {
+            "request_id": "",
+            "workflow_json": workflow,
+            "s3": {
+              "access_key_id": "",
+              "secret_access_key": "",
+              "endpoint_url": "",
+              "bucket_name": "",
+              "region": ""
+            },
+            "webhook": {
+              "url": "",
+              "extra_params": {
+                "user_id": "12345",
+                "project_id": "abc-def"
+              }
+            }
+          }
+        }
+
+        response = await endpoint.request("/generate/sync", payload)
+
+        # Response contains status, output, and any errors
+        print(response["response"])
+
+
+if __name__ == "__main__":
+    asyncio.run(main())