From 05419889914c99cd0253a2d16a1f6e236216512a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 25 Mar 2025 21:25:05 +0000 Subject: [PATCH 01/12] build(deps): bump transformers in the pip group across 1 directory Bumps the pip group with 1 update in the / directory: [transformers](https://github.com/huggingface/transformers). Updates `transformers` from 4.43.2 to 4.48.0 - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.43.2...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production dependency-group: pip ... Signed-off-by: dependabot[bot] --- requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/requirements.txt b/requirements.txt index 0f398b8..bceebdf 100644 --- a/requirements.txt +++ b/requirements.txt @@ -42,7 +42,7 @@ tiktoken==0.7.0 token-count==0.2.1 tokenizers==0.19.1 tqdm==4.66.4 -transformers==4.43.2 +transformers==4.48.0 typing_extensions==4.12.2 urllib3==2.2.2 Werkzeug==3.0.3 From 22c19f23b633be9a34a2ebaf7c084355efa80052 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 25 Mar 2025 21:25:15 +0000 Subject: [PATCH 02/12] build(deps): bump aiohttp in the pip group across 1 directory Bumps the pip group with 1 update in the / directory: [aiohttp](https://github.com/aio-libs/aiohttp). Updates `aiohttp` from 3.10.0 to 3.11.0b0 - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/v3.11.0b0/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.0...v3.11.0b0) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production dependency-group: pip ... 
Signed-off-by: dependabot[bot] --- requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/requirements.txt b/requirements.txt index 0f398b8..8a01103 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,6 +1,6 @@ aiofiles==24.1.0 aiohappyeyeballs==2.3.4 -aiohttp==3.10.0 +aiohttp==3.11.0b0 aiojobs==1.2.1 aiosignal==1.3.1 anyio==4.4.0 From a153b0bcc843f4feab82c616188ce8972cee3e44 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Tue, 25 Mar 2025 14:44:59 -0700 Subject: [PATCH 03/12] removing some deps --- LICENSE => LICENSE.MIT | 0 requirements.txt | 2 -- 2 files changed, 2 deletions(-) rename LICENSE => LICENSE.MIT (100%) diff --git a/LICENSE b/LICENSE.MIT similarity index 100% rename from LICENSE rename to LICENSE.MIT diff --git a/requirements.txt b/requirements.txt index 0f398b8..446972e 100644 --- a/requirements.txt +++ b/requirements.txt @@ -15,12 +15,10 @@ Flask==3.0.3 frozenlist==1.4.1 fsspec==2024.6.1 gitignore_parser==0.1.11 -gunicorn==22.0.0 hf_transfer==0.1.8 huggingface-hub==0.24.2 idna==3.7 itsdangerous==2.2.0 -Jinja2==3.1.4 joblib==1.4.2 MarkupSafe==2.1.5 multidict==6.0.5 From eed0a6c8bcf64738d2f6d15079e66f43dfdf42dc Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Tue, 25 Mar 2025 14:46:30 -0700 Subject: [PATCH 04/12] removing some deps --- requirements.txt | 1 - 1 file changed, 1 deletion(-) diff --git a/requirements.txt b/requirements.txt index 446972e..c1db702 100644 --- a/requirements.txt +++ b/requirements.txt @@ -43,7 +43,6 @@ tqdm==4.66.4 transformers==4.43.2 typing_extensions==4.12.2 urllib3==2.2.2 -Werkzeug==3.0.3 wheel==0.43.0 yarl==1.9.4 zstandard==0.22.0 From 74aa9b6b6772888a35706fd719cb08c686f36ad7 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Wed, 26 Mar 2025 14:34:39 -0700 Subject: [PATCH 05/12] style --- README.md | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 7a4d10a..7da46ff 100644 --- 
a/README.md +++ b/README.md @@ -7,8 +7,86 @@ same instance. Additionally, it monitors performance metrics and estimates curre such as the number of tokens processed for LLMs or image resolution and steps for image generation models, reporting these metrics to the autoscaler. +## Project Structure + +* `lib/`: Contains the core PyWorker framework code (server logic, data types, metrics). +* `workers/`: Contains specific implementations (PyWorkers) for different model servers. Each subdirectory represents a worker for a particular model type. + +## Getting Started + +1. **Install Dependencies:** + ```bash + pip install -r requirements.txt + ``` + You may also need `pyright` for type checking: + ```bash + sudo npm install -g pyright + # or use your preferred method to install pyright + ``` + +2. **Configure Environment:** Set any necessary environment variables (e.g., `MODEL_LOG` path, API keys if needed by your worker). + +3. **Run the Server:** Use the provided script. You'll need to specify which worker to run. + ```bash + # Example for hello_world worker (assuming MODEL_LOG is set) + ./start_server.sh workers.hello_world.server + ``` + Replace `workers.hello_world.server` with the path to the `server.py` module of the worker you want to run. + ## How to Use -If you want to use autoscaler, you just need to use one of Vast's autoscaler templates. If you'd like to -implement PyWorker for a template that is not marked as autoscaler compatible on Vast, refer to -`workers/hello_world/README.md` +### Using Existing Workers + +If you are using a Vast.ai template that includes PyWorker integration (marked as autoscaler compatible), it should work out of the box. The template will typically start the appropriate PyWorker server automatically. 
Here are a few such templates: + +* **TGI (Text Generation Inference):** [Vast.ai Template](https://cloud.vast.ai?ref_id=140778&template_id=72d8dcb41ea3a58e06c741e2c725bc00) +* **ComfyUI:** [Vast.ai Template](https://cloud.vast.ai?ref_id=140778&template_id=ad72c8bf7cf695c3c9ddf0eaf6da0447) + +Currently available workers: +* `hello_world`: A simple example worker for a basic LLM server. +* `comfyui`: A worker for the ComfyUI image generation backend. +* `tgi`: A worker for the Text Generation Inference backend. + +### Implementing a New Worker + +To integrate PyWorker with a model server not already supported, you need to create a new worker implementation under the `workers/` directory. Follow these general steps: + +1. **Create Worker Directory:** Add a new directory under `workers/` (e.g., `workers/my_model/`). +2. **Define Data Types (`data_types.py`):** + * Create a class inheriting from `lib.data_types.ApiPayload`. + * Implement methods like `for_test`, `generate_payload_json`, `count_workload`, and `from_json_msg` to handle request data, testing, and workload calculation specific to your model's API. +3. **Implement Endpoint Handlers (`server.py`):** + * For each model API endpoint you want PyWorker to proxy, create a class inheriting from `lib.data_types.EndpointHandler`. + * Implement methods like `endpoint`, `payload_cls`, `generate_payload_json`, `make_benchmark_payload` (for one handler), and `generate_client_response`. + * Instantiate `lib.backend.Backend` with your model server details, log file path, benchmark handler, and log actions. + * Define `aiohttp` routes, mapping paths to your handlers using `backend.create_handler()`. + * Use `lib.server.start_server` to run the application. +4. **Add `__init__.py`:** Create an empty `__init__.py` file in your worker directory. +5. **(Optional) Add Load Testing (`test_load.py`):** Create a script using `lib.test_harness.run` to test your worker against a Vast.ai endpoint group. +6.
**(Optional) Add Client Example (`client.py`):** Provide a script demonstrating how to call your worker's endpoints. + +**For a detailed walkthrough, refer to the `hello_world` example:** [workers/hello_world/README.md](workers/hello_world/README.md) + +**For more complex examples, see:** +* [ComfyUI Worker](workers/comfyui/README.md) +* [TGI Worker](workers/tgi/README.md) + +**Type Hinting:** It is strongly recommended to use strict type hinting throughout your implementation. Use `pyright` to check for type errors. + +## Testing Your Worker + +If you implement a `test_load.py` script for your worker, you can use it to load test a Vast.ai endpoint group running your instance image. + +```bash +# Example for hello_world worker +python3 -m workers.hello_world.test_load -n 1000 -rps 0.5 -k "$API_KEY" -e "$ENDPOINT_GROUP_NAME" +``` + +Replace `workers.hello_world.test_load` with the path to your worker's test script and provide your Vast.ai API Key (`-k`) and the target Endpoint Group Name (`-e`). Adjust the number of requests (`-n`) and requests per second (`-rps`) as needed. + +## Community & Support + +Join the conversation and get help: + +* **Vast.ai Discord:** [https://discord.gg/Pa9M29FFye](https://discord.gg/Pa9M29FFye) +* **Vast.ai Subreddit:** [https://reddit.com/r/vastai/](https://reddit.com/r/vastai/) From 4e12955dd85f3cf1aa3cd04e347e9c9f90caaa56 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Wed, 26 Mar 2025 14:54:15 -0700 Subject: [PATCH 06/12] updating the readme --- LICENSE => LICENSE.MIT | 0 README.md | 81 ++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 78 insertions(+), 3 deletions(-) rename LICENSE => LICENSE.MIT (100%) diff --git a/LICENSE b/LICENSE.MIT similarity index 100% rename from LICENSE rename to LICENSE.MIT diff --git a/README.md b/README.md index 7a4d10a..117600d 100644 --- a/README.md +++ b/README.md @@ -7,8 +7,83 @@ same instance. 
Additionally, it monitors performance metrics and estimates curre such as the number of tokens processed for LLMs or image resolution and steps for image generation models, reporting these metrics to the autoscaler. +## Project Structure + +* `lib/`: Contains the core PyWorker framework code (server logic, data types, metrics). +* `workers/`: Contains specific implementations (PyWorkers) for different model servers. Each subdirectory represents a worker for a particular model type. + +## Getting Started + +1. **Install Dependencies:** + ```bash + pip install -r requirements.txt + ``` + You may also need `pyright` for type checking: + ```bash + sudo npm install -g pyright + # or use your preferred method to install pyright + ``` + +2. **Configure Environment:** Set any necessary environment variables (e.g., `MODEL_LOG` path, API keys if needed by your worker). + +3. **Run the Server:** Use the provided script. You'll need to specify which worker to run. + ```bash + # Example for hello_world worker (assuming MODEL_LOG is set) + ./start_server.sh workers.hello_world.server + ``` + Replace `workers.hello_world.server` with the path to the `server.py` module of the worker you want to run. + ## How to Use -If you want to use autoscaler, you just need to use one of Vast's autoscaler templates. If you'd like to -implement PyWorker for a template that is not marked as autoscaler compatible on Vast, refer to -`workers/hello_world/README.md` +### Using Existing Workers + +If you are using a Vast.ai template that includes PyWorker integration (marked as autoscaler compatible), it should work out of the box. The template will typically start the appropriate PyWorker server automatically. 
Here are a few such templates: + +* **TGI (Text Generation Inference):** [Vast.ai Template](https://cloud.vast.ai?ref_id=140778&template_id=72d8dcb41ea3a58e06c741e2c725bc00) +* **ComfyUI:** [Vast.ai Template](https://cloud.vast.ai?ref_id=140778&template_id=ad72c8bf7cf695c3c9ddf0eaf6da0447) + +Currently available workers: +* `hello_world`: A simple example worker for a basic LLM server. +* `comfyui`: A worker for the ComfyUI image generation backend. +* `tgi`: A worker for the Text Generation Inference backend. + +### Implementing a New Worker + +To integrate PyWorker with a model server not already supported, you need to create a new worker implementation under the `workers/` directory. Follow these general steps: + +1. **Create Worker Directory:** Add a new directory under `workers/` (e.g., `workers/my_model/`). +2. **Define Data Types (`data_types.py`):** + * Create a class inheriting from `lib.data_types.ApiPayload`. + * Implement methods like `for_test`, `generate_payload_json`, `count_workload`, and `from_json_msg` to handle request data, testing, and workload calculation specific to your model's API. +3. **Implement Endpoint Handlers (`server.py`):** + * For each model API endpoint you want PyWorker to proxy, create a class inheriting from `lib.data_types.EndpointHandler`. + * Implement methods like `endpoint`, `payload_cls`, `generate_payload_json`, `make_benchmark_payload` (for one handler), and `generate_client_response`. + * Instantiate `lib.backend.Backend` with your model server details, log file path, benchmark handler, and log actions. + * Define `aiohttp` routes, mapping paths to your handlers using `backend.create_handler()`. + * Use `lib.server.start_server` to run the application. +4. **Add `__init__.py`:** Create an empty `__init__.py` file in your worker directory. +5. **(Optional) Add Load Testing (`test_load.py`):** Create a script using `lib.test_harness.run` to test your worker against a Vast.ai endpoint group. +6.
**(Optional) Add Client Example (`client.py`):** Provide a script demonstrating how to call your worker's endpoints. + +**For a detailed walkthrough, refer to the `hello_world` example:** [workers/hello_world/README.md](workers/hello_world/README.md) + + +**Type Hinting:** It is strongly recommended to use strict type hinting throughout your implementation. Use `pyright` to check for type errors. + +## Testing Your Worker + +If you implement a `test_load.py` script for your worker, you can use it to load test a Vast.ai endpoint group running your instance image. + +```bash +# Example for hello_world worker +python3 -m workers.hello_world.test_load -n 1000 -rps 0.5 -k "$API_KEY" -e "$ENDPOINT_GROUP_NAME" +``` + +Replace `workers.hello_world.test_load` with the path to your worker's test script and provide your Vast.ai API Key (`-k`) and the target Endpoint Group Name (`-e`). Adjust the number of requests (`-n`) and requests per second (`-rps`) as needed. + +## Community & Support + +Join the conversation and get help: + +* **Vast.ai Discord:** [https://discord.gg/Pa9M29FFye](https://discord.gg/Pa9M29FFye) +* **Vast.ai Subreddit:** [https://reddit.com/r/vastai/](https://reddit.com/r/vastai/) From 728005d28c14f7f39b360564c2d44bcf7d063106 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Wed, 26 Mar 2025 18:55:35 -0700 Subject: [PATCH 07/12] readme updates --- README.md | 3 --- workers/hello_world/README.md | 40 +++++++++++++++++++++++++++-------- 2 files changed, 31 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index 7da46ff..117600d 100644 --- a/README.md +++ b/README.md @@ -67,9 +67,6 @@ To integrate PyWorker with a model server not already supported, you need to cre **For a detailed walkthrough, refer to the `hello_world` example:** [workers/hello_world/README.md](workers/hello_world/README.md) -**For more complex examples, see:** -* [ComfyUI Worker](workers/comfyui/README.md) -* [TGI Worker](workers/tgi/README.md) **Type Hinting:** It is 
strongly recommended to use strict type hinting throughout your implementation. Use `pyright` to check for type errors. diff --git a/workers/hello_world/README.md b/workers/hello_world/README.md index ca23d2d..595b94f 100644 --- a/workers/hello_world/README.md +++ b/workers/hello_world/README.md @@ -2,7 +2,7 @@ ## Hello_world example -There is a hello_world PyWorker implantation under `workers/hello_world`. This PyWorker is +There is a hello_world PyWorker implementation under `workers/hello_world`. This PyWorker is created for an LLM model server that runs on port 5001 has two API endpoints: 1. `/generate`: generates an full response to the prompt and sends a JSON response @@ -40,10 +40,17 @@ This will allow your IDE or VSCode with `pyright` plugin to find any type errors You can also install `pyright` with `sudo npm install -g pyright` and run `pyright` in the root of the project to find any type errors. -#### data_Types.py +#### data_types.py: Contains data types representing model API endpoints + +This file defines the structure of the data your model server expects (its API contract) and, critically, how PyWorker *interprets* that data for autoscaling purposes. You define Python data classes that mirror the JSON payloads your model's API uses. + +These classes **must** inherit from `lib.data_types.ApiPayload`. This inheritance is not just for structure; it's how PyWorker knows how to: + +* **Parse Incoming Requests:** Convert JSON from clients into usable Python objects. +* **Calculate Workload:** Determine the computational cost of a request. +* **Generate Test Data:** Create realistic inputs for benchmarking. +* **Format Requests for the Model Server:** Prepare data for the underlying model. -data classes representing the model API are defined here. They must inherit from -`lib.data_types.ApiPayload`. 
`ApiPayload` is an abstract class and you need to define several functions for it: ```python import dataclasses @@ -105,12 +112,27 @@ class InputData(ApiPayload): ``` -#### server.py +#### server.py: Creating Your Model's API Endpoints -For every model API endpoint you want to use, you must implement an `EndpointHandler`. This class handles incoming -requests, processes them, sends them to the model API server, and finally returns an HTTP response. -`EndpointHandler` has several abstract functions that must be implemented. Here, we implement two, one -for `/generate`, and one for `/generate_stream`: +This section guides you through creating the core of your custom model API: the `EndpointHandler`. Think of `EndpointHandler` as the bridge between incoming requests from users and your underlying model. It's the key to making your model accessible and scalable. + +**Why use an `EndpointHandler`?** + +* **Organized Request Handling:** It provides a structured way to handle different types of requests (like generating text, generating images, or performing other model-specific tasks). +* **Scalability:** By separating request handling from the model itself, you can easily scale your API to handle many concurrent users. +* **Flexibility:** You can customize how requests are processed, validated, and transformed before being sent to your model. +* **Standard Interface:** It provides a consistent interface for interacting with your model, regardless of the underlying implementation. + +For every model API endpoint you want to expose (e.g., `/generate`, `/generate_stream`), you'll implement an `EndpointHandler`. This class is responsible for the tasks listed below. +The `EndpointHandler` carries them out through several key methods: + +* **Receiving and validating incoming requests (`get_data_from_request`):** This method ensures the request contains the necessary data (authentication and payload) and is in the correct format. It's the entry point for all requests.
+* **Defining the endpoint (`endpoint`):** This method specifies the URL endpoint on the model API server where requests will be sent (e.g., `/generate`). +* **Specifying the payload type (`payload_cls`):** This method indicates the specific `ApiPayload` class used for this endpoint, defining the structure of the request data. +* **Creating benchmark payloads (`make_benchmark_payload`):** This method creates payloads specifically for benchmarking the model's performance. +* **Handling the model's response (`generate_client_response`):** This method takes the response from the model API server and transforms it into the format expected by the client making the request to your PyWorker. This allows you to customize the output as needed. + +The `EndpointHandler` class has several abstract functions that you *must* implement to define the behavior of your specific endpoints. Here, we'll implement two common endpoints: `/generate` (for synchronous requests) and `/generate_stream` (for streaming responses): ```python From e1ed9a8e62e3f79aa5c8d5f3ad456b809520a369 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Wed, 26 Mar 2025 18:58:09 -0700 Subject: [PATCH 08/12] updating the readme --- workers/hello_world/README.md | 40 +++++++++++++++++++++++++++-------- 1 file changed, 31 insertions(+), 9 deletions(-) diff --git a/workers/hello_world/README.md b/workers/hello_world/README.md index ca23d2d..595b94f 100644 --- a/workers/hello_world/README.md +++ b/workers/hello_world/README.md @@ -2,7 +2,7 @@ ## Hello_world example -There is a hello_world PyWorker implantation under `workers/hello_world`. This PyWorker is +There is a hello_world PyWorker implementation under `workers/hello_world`. This PyWorker is created for an LLM model server that runs on port 5001 has two API endpoints: 1. 
`/generate`: generates an full response to the prompt and sends a JSON response @@ -40,10 +40,17 @@ This will allow your IDE or VSCode with `pyright` plugin to find any type errors You can also install `pyright` with `sudo npm install -g pyright` and run `pyright` in the root of the project to find any type errors. -#### data_Types.py +#### data_types.py: Contains data types representing model API endpoints + +This file defines the structure of the data your model server expects (its API contract) and, critically, how PyWorker *interprets* that data for autoscaling purposes. You define Python data classes that mirror the JSON payloads your model's API uses. + +These classes **must** inherit from `lib.data_types.ApiPayload`. This inheritance is not just for structure; it's how PyWorker knows how to: + +* **Parse Incoming Requests:** Convert JSON from clients into usable Python objects. +* **Calculate Workload:** Determine the computational cost of a request. +* **Generate Test Data:** Create realistic inputs for benchmarking. +* **Format Requests for the Model Server:** Prepare data for the underlying model. -data classes representing the model API are defined here. They must inherit from -`lib.data_types.ApiPayload`. `ApiPayload` is an abstract class and you need to define several functions for it: ```python import dataclasses @@ -105,12 +112,27 @@ class InputData(ApiPayload): ``` -#### server.py +#### server.py: Creating Your Model's API Endpoints -For every model API endpoint you want to use, you must implement an `EndpointHandler`. This class handles incoming -requests, processes them, sends them to the model API server, and finally returns an HTTP response. -`EndpointHandler` has several abstract functions that must be implemented. Here, we implement two, one -for `/generate`, and one for `/generate_stream`: +This section guides you through creating the core of your custom model API: the `EndpointHandler`. 
Think of `EndpointHandler` as the bridge between incoming requests from users and your underlying model. It's the key to making your model accessible and scalable. + +**Why use an `EndpointHandler`?** + +* **Organized Request Handling:** It provides a structured way to handle different types of requests (like generating text, generating images, or performing other model-specific tasks). +* **Scalability:** By separating request handling from the model itself, you can easily scale your API to handle many concurrent users. +* **Flexibility:** You can customize how requests are processed, validated, and transformed before being sent to your model. +* **Standard Interface:** It provides a consistent interface for interacting with your model, regardless of the underlying implementation. + +For every model API endpoint you want to expose (e.g., `/generate`, `/generate_stream`), you'll implement an `EndpointHandler`. This class is responsible for the tasks listed below. +The `EndpointHandler` carries them out through several key methods: + +* **Receiving and validating incoming requests (`get_data_from_request`):** This method ensures the request contains the necessary data (authentication and payload) and is in the correct format. It's the entry point for all requests. +* **Defining the endpoint (`endpoint`):** This method specifies the URL endpoint on the model API server where requests will be sent (e.g., `/generate`). +* **Specifying the payload type (`payload_cls`):** This method indicates the specific `ApiPayload` class used for this endpoint, defining the structure of the request data. +* **Creating benchmark payloads (`make_benchmark_payload`):** This method creates payloads specifically for benchmarking the model's performance. +* **Handling the model's response (`generate_client_response`):** This method takes the response from the model API server and transforms it into the format expected by the client making the request to your PyWorker.
This allows you to customize the output as needed. + +The `EndpointHandler` class has several abstract functions that you *must* implement to define the behavior of your specific endpoints. Here, we'll implement two common endpoints: `/generate` (for synchronous requests) and `/generate_stream` (for streaming responses): ```python From f21607d2d4edb64a3a3427789e08e9727f8c4349 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Thu, 27 Mar 2025 11:04:41 -0700 Subject: [PATCH 09/12] pushing forward nltk --- requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/requirements.txt b/requirements.txt index c1db702..7634132 100644 --- a/requirements.txt +++ b/requirements.txt @@ -22,7 +22,7 @@ itsdangerous==2.2.0 joblib==1.4.2 MarkupSafe==2.1.5 multidict==6.0.5 -nltk==3.8.1 +nltk==3.9.0 Nuitka==2.3.11 numpy==2.0.0 ordered-set==4.1.0 From 25ed16033c12c1e846042fc4ad951ca556ed0000 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Thu, 27 Mar 2025 11:06:14 -0700 Subject: [PATCH 10/12] client needs to be added, even if it's empty, to match the docs --- workers/hello_world/client.py | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 workers/hello_world/client.py diff --git a/workers/hello_world/client.py b/workers/hello_world/client.py new file mode 100644 index 0000000..e69de29 From 7018ef9249181dd2a05246d9a9a0ca18619183b0 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Thu, 27 Mar 2025 11:18:25 -0700 Subject: [PATCH 11/12] requirements update --- requirements.txt | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/requirements.txt b/requirements.txt index 1f47d64..3b18bf1 100644 --- a/requirements.txt +++ b/requirements.txt @@ -22,7 +22,7 @@ itsdangerous==2.2.0 joblib==1.4.2 MarkupSafe==2.1.5 multidict==6.0.5 -nltk==3.8.1 +nltk==3.9.1 Nuitka==2.3.11 numpy==2.0.0 ordered-set==4.1.0 @@ -40,9 +40,8 @@ tiktoken==0.7.0 token-count==0.2.1 tokenizers==0.19.1 tqdm==4.66.4 -transformers==4.48.0 +transformers==4.* 
typing_extensions==4.12.2 urllib3==2.2.2 wheel==0.43.0 -yarl==1.9.4 zstandard==0.22.0 From 72572ea39d0025a0b9d331c1fc5a3c4d22355e09 Mon Sep 17 00:00:00 2001 From: Chris McKenzie Date: Thu, 27 Mar 2025 11:18:32 -0700 Subject: [PATCH 12/12] readme update --- workers/hello_world/README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/workers/hello_world/README.md b/workers/hello_world/README.md index 595b94f..d414f02 100644 --- a/workers/hello_world/README.md +++ b/workers/hello_world/README.md @@ -40,7 +40,7 @@ This will allow your IDE or VSCode with `pyright` plugin to find any type errors You can also install `pyright` with `sudo npm install -g pyright` and run `pyright` in the root of the project to find any type errors. -#### data_types.py: Contains data types representing model API endpoints +### data_types.py: Contains data types representing model API endpoints This file defines the structure of the data your model server expects (its API contract) and, critically, how PyWorker *interprets* that data for autoscaling purposes. You define Python data classes that mirror the JSON payloads your model's API uses. @@ -112,7 +112,7 @@ class InputData(ApiPayload): ``` -#### server.py: Creating Your Model's API Endpoints +### server.py: Creating Your Model's API Endpoints This section guides you through creating the core of your custom model API: the `EndpointHandler`. Think of `EndpointHandler` as the bridge between incoming requests from users and your underlying model. It's the key to making your model accessible and scalable. @@ -300,7 +300,7 @@ if __name__ == "__main__": start_server(backend, routes) ``` -#### test_load.py +### test_load.py Here you can create a script that allows you test an endpoint group running instances with this PyWorker