Rob Ballantyne
3668d948be
Simplify null pyworker README intro to serverless terminology
...
Drop the "autoscaler provisions a worker if none is free" phrasing in
favor of the simpler "request comes in and you get a worker; release and
it scales back down."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 17:02:41 +01:00
Rob Ballantyne
254ccdf181
Add /release control endpoint to null pyworker
...
The held /reserve now waits on an asyncio.Event and resolves when the local
queue consumer POSTs /release on the internal control port (127.0.0.1:18999
by default). This produces a 200 success in metrics instead of the 499
cancellation you got from disconnecting the client. The duration cap stays
as a safety net for stuck consumers.
The internal aiohttp server is now started unconditionally and always hosts
/release; the stub /health route is added only when BACKEND_HEALTH_URL is unset.
NULL_STUB_HEALTH_PORT is renamed to NULL_CONTROL_PORT to reflect the
broader role.
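
A minimal sketch of the hold-and-release pattern described above, using only asyncio. This is not the worker's actual code: the names hold_reservation, demo, and max_duration are hypothetical, and the real handlers run inside an aiohttp server rather than as bare coroutines. It only illustrates how a held request can wait on an asyncio.Event and fall back to a duration cap.

```python
import asyncio
from typing import Tuple


async def hold_reservation(released: asyncio.Event, max_duration: float) -> str:
    """Wait until `released` is set, or give up once the duration cap elapses.

    Hypothetical helper: the return values are illustrative labels, not the
    HTTP status codes the real worker reports.
    """
    try:
        await asyncio.wait_for(released.wait(), timeout=max_duration)
        return "released"
    except asyncio.TimeoutError:
        # Safety net for a consumer that never signals release.
        return "duration_cap"


async def demo() -> Tuple[str, str]:
    # Case 1: the consumer signals release while the reservation is held.
    released = asyncio.Event()
    holder = asyncio.create_task(hold_reservation(released, max_duration=5.0))
    await asyncio.sleep(0)  # let the holder start waiting
    released.set()          # roughly what a /release handler would do
    first = await holder

    # Case 2: nobody releases, so the cap ends the hold.
    second = await hold_reservation(asyncio.Event(), max_duration=0.05)
    return first, second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the event is the only shared state between the held request and the release signal, which is why the release side needs nothing more than a call to set().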
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 16:59:46 +01:00
Rob Ballantyne
89761b378a
Wire null pyworker healthcheck to a stub (and optional user URL)
...
Adds an in-process aiohttp stub on 127.0.0.1:18999/health so the framework's
periodic healthcheck has something live to talk to. Operators can override
with BACKEND_HEALTH_URL to point at their queue consumer's /health
endpoint, so the autoscaler marks the worker errored if the consumer dies.
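
The override rule above could be sketched roughly as follows. The function name resolve_health_url and its return shape are hypothetical, and treating an empty BACKEND_HEALTH_URL as unset is an assumption; only the variable name, the loopback address, and the default port 18999 come from the commit message.

```python
from typing import Mapping, Tuple


def resolve_health_url(env: Mapping[str, str], stub_port: int = 18999) -> Tuple[str, bool]:
    """Return (health_url, use_stub) for the periodic healthcheck.

    Hypothetical helper. Assumption: an empty or whitespace-only
    BACKEND_HEALTH_URL counts as unset.
    """
    override = env.get("BACKEND_HEALTH_URL", "").strip()
    if override:
        # The operator points the healthcheck at their own consumer.
        return override, False
    # Otherwise fall back to the in-process stub on the loopback interface.
    return f"http://127.0.0.1:{stub_port}/health", True


if __name__ == "__main__":
    print(resolve_health_url({}))
    print(resolve_health_url({"BACKEND_HEALTH_URL": "http://127.0.0.1:9000/health"}))
```

Passing the environment in as a mapping keeps the sketch easy to test; a real worker would presumably read os.environ once at startup.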
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 16:53:26 +01:00
Rob Ballantyne
18974873e5
Add null pyworker for queue-driven autoscaling
...
A PyWorker that does not forward to any model server. POST /reserve holds
the worker busy until the client disconnects (or the duration cap elapses),
so users with their own job queue can drive Vast autoscaling without
exposing inbound model traffic on the instance.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 16:48:52 +01:00
Lucas Armand
9bc9ba11c5
Increase TGI benchmark tokens to 500
2026-04-30 14:04:39 -07:00
LucasArmandVast
48fdc65e3d
Update to vastai package (#84)
2026-04-14 10:41:31 -07:00
LucasArmandVast
2cd97315cd
Add nltk requirement for openai worker (#83)
...
* Add nltk requirement for openai worker
* pin version
2026-04-13 11:30:06 -07:00
Lucas Armand
83c31e25a9
Add force update detection
2026-03-31 13:46:22 -07:00
Lucas Armand
fbe1dca6fa
more env_path fixes
2026-03-30 16:28:51 -07:00
Lucas Armand
4c3120dbc5
allow override env_path
2026-03-30 16:25:01 -07:00
Lucas Armand
d7d9b915f6
allow break system packages
2026-03-30 16:09:17 -07:00
Lucas Armand
4660b337fb
Check for USE_SYSTEM_PYTHON
2026-03-30 14:46:38 -07:00
edgaratvast
7506ecb6b5
directly invoke the one-stop-shop setup executable exported by the vastai pip package for deployments (#82)
2026-03-26 10:59:49 -07:00
LucasArmandVast
50633c5003
Update deployments script with retries. (#81)
2026-03-23 14:58:32 -07:00
LucasArmandVast
2e8f18276f
Add beta deployments script (#80)
2026-03-23 14:14:06 -07:00
Scott Darden
eba9c480eb
Merge pull request #79 from vast-ai/update-requirements
...
Updated requirements to only require vastai-sdk
2026-01-14 12:07:33 -08:00
Lucas Armand
aaca1c9645
Updated requirements to only require vastai-sdk
2026-01-14 10:47:07 -08:00
LucasArmandVast
f319db6bd5
flag for model log rotate (#78)
2026-01-12 17:03:18 -08:00
LucasArmandVast
4d786b4d17
SDK Versioning Improvements (#77)
...
* Add SDK_BRANCH
2026-01-02 10:23:07 -08:00
LucasArmandVast
bd3e0032a1
Add SDK version checking (#76)
2025-12-17 21:01:52 -08:00
Lucas Armand
e02f4bc943
Lowered concurrency of vLLM and TGI benchmarks
2025-12-17 11:55:33 -08:00
Lucas Armand
bcb04b9a32
add missing comma
2025-12-17 11:40:40 -08:00
Lucas Armand
9daf171487
Increase queue limits for vLLM and TGI
2025-12-17 11:38:55 -08:00
LucasArmandVast
29f836eb1a
Backwards compatible vLLM payload (#75)
...
* Support old vLLM payloads
2025-12-15 19:58:02 -08:00
LucasArmandVast
4380d98c01
Use PyWorker SDK (#67)
...
* Change PyWorker to Worker SDK
* Moved /lib to vast-sdk (https://github.com/vast-ai/vast-sdk)
2025-12-15 19:33:03 -08:00
Abiola Akinnubi
2ce741a8b7
Merge pull request #74 from vast-ai/AUTO-912
...
Mark pyworkers as "Error" if the startup script fails, to avoid a silent failure that waits for the autoscaler.
2025-12-11 17:05:13 -08:00
Abiola Akinnubi
4ecc07032f
Mark pyworkers as "Error" if the startup script fails, to avoid a silent failure that waits for the autoscaler.
2025-12-11 12:51:56 -08:00
edgaratvast
df61e6e946
correct version pin for aiohttp (#73)
...
Co-authored-by: Edgar Lin <edgarlin2000@gmail.com>
2025-12-10 19:34:52 -08:00
LucasArmandVast
70f8a8f534
Merge pull request #72 from vast-ai/hotfix-pin-pycares
...
Hotfix: pin pycares
2025-12-10 20:41:44 -05:00
Lucas Armand
7be8aa6397
pin pycares
2025-12-10 17:38:03 -08:00
Colter-Downing
138fc3ac47
Merge pull request #71 from vast-ai/AUTO-comfyui-updates
...
Auto comfyui updates
2025-12-04 10:55:12 -08:00
Colter Downing
222ac2a0dd
default endpoint name
2025-12-04 10:54:55 -08:00
Colter Downing
40aed9b5f8
adding s3 as an option
2025-12-04 10:52:57 -08:00
Colter Downing
d4d36bf86e
done with comfy updates
2025-12-03 20:45:55 -08:00
Colter Downing
e839cfc6e8
include view in API wrapper
2025-12-03 20:22:45 -08:00
Colter Downing
f04138e13b
update to be able to get images
2025-12-03 20:16:25 -08:00
Colter-Downing
de3aa87c8f
Merge pull request #70 from vast-ai/AUTO-tgi-client-edits
...
update tgi client
2025-12-03 18:40:01 -08:00
Colter Downing
6b5b1341a7
update tgi client
2025-12-03 18:38:42 -08:00
Colter-Downing
8be92c03de
Merge pull request #69 from vast-ai/AUTO-874--fix-openai-worker-client
...
defaults to ENDPOINT_NAME and DEFAULT_MODEL but uses the flag first
2025-12-03 16:59:56 -08:00
Colter Downing
adedb8ba90
defaults to ENDPOINT_NAME and DEFAULT_MODEL but uses the flag first if present
2025-12-03 16:57:28 -08:00
LucasArmandVast
2f543c01ad
Merge pull request #68 from vast-ai/fix-vllm-concurrency
...
Increase model wait time for vLLM
2025-12-03 16:13:51 -05:00
Lucas Armand
0bcd2219ea
Increase model wait time for vLLM
2025-12-03 12:38:52 -08:00
LucasArmandVast
0339b471c5
Merge pull request #66 from vast-ai/synthesis
...
PyWorker Error Handling
2025-11-25 16:02:26 -08:00
Lucas Armand
e143162438
bump pyworker version
2025-11-25 16:01:23 -08:00
Lucas Armand
7986e51e9e
early errors
2025-11-24 15:24:06 -08:00
Lucas Armand
9c6ab78503
Move model log line
2025-11-24 15:22:23 -08:00
Lucas Armand
45e0c7d9ca
Move model log rotate to top
2025-11-24 15:02:33 -08:00
LucasArmandVast
7a792fd176
Merge pull request #64 from vast-ai/add-llama-log
...
add llama log
2025-11-21 10:24:27 -08:00
Lucas Armand
e0449cb3c7
add llama log
2025-11-21 10:22:16 -08:00
Lucas Armand
a4339bd3f1
hotfix: add f
2025-11-12 16:10:55 -08:00