pyworker

Abiola Akinnubi b1ca68c349 Added endpoint flexibility along with existing log. extended the log support

Switched Endpoint back to vast-ai, Added endpoint flexibility along with existing log. extended the log support

Modify the endpoint return type as optional and check via pyright to ensure there are not compilation/type errors

2025-05-30 14:40:42 -07:00

File           Last updated                 Last commit
__init__.py    2024-09-04 11:19:30 -07:00   initial commit
client.py      2024-09-04 11:19:30 -07:00   initial commit
data_types.py  2024-09-04 11:19:30 -07:00   initial commit
README.md      2024-09-04 11:19:30 -07:00   initial commit
server.py      2025-05-30 14:40:42 -07:00   Added endpoint flexibility along with existing log. extended the log support
test_load.py   2024-09-12 11:27:48 -07:00   Merge pull request #1 from Nader-gator/main

README.md

This is the base PyWorker for TGI (Text Generation Inference). It is intended as a starting point for PyWorkers that serve various LLMs, and it exposes two primary endpoints:

generate: Generates the LLM's response to a given prompt in a single request.
generate_stream: Streams the LLM's response token by token.

Both endpoints use the following API payload format:

{
  "inputs": "PROMPT",
  "parameters": {
    "max_new_tokens": 250
  }
}
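
As a rough illustration, the payload above can be assembled in Python before it is sent to either endpoint. The helper name `build_payload` is hypothetical and is not part of this repository; only the `inputs`, `parameters`, and `max_new_tokens` fields come from the format shown above.

```python
import json


def build_payload(prompt: str, max_new_tokens: int = 250) -> dict:
    """Build a request body in the format shown above.

    Hypothetical helper for illustration; not part of this repository.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be a positive integer")
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }


# Serialize the payload to JSON, ready to be used as a request body.
body = json.dumps(build_payload("PROMPT"), indent=2)
print(body)
```

The same dictionary would serve for both generate and generate_stream, since the two endpoints share one payload format.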

Note that performance depends on the max_new_tokens parameter rather than on the prompt size. For example, if an instance is benchmarked at 100 tokens per second, a request with max_new_tokens = 200 takes approximately 2 seconds to complete.
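
The estimate in the note above is a simple division, sketched here in Python. The function name `estimate_seconds` is hypothetical, and the calculation assumes the request generates all max_new_tokens tokens at the benchmarked rate; a response that stops early would finish sooner.

```python
def estimate_seconds(max_new_tokens: int, tokens_per_second: float) -> float:
    """Approximate completion time for one request.

    Hypothetical helper: assumes the full max_new_tokens budget is
    generated at the benchmarked throughput.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return max_new_tokens / tokens_per_second


# The example from the note: 200 tokens at 100 tokens per second.
print(estimate_seconds(200, 100))  # 2.0
```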