Managed Inference Job
Guides for creating Managed Inference Jobs and connecting clients to them.
Step-by-step guides for common tasks with a Managed Inference Job. These guides assume CosmicAC is already running. If it is not, follow the installation guide first.
Guides
Create a Managed Inference Job (web interface)
Create a Managed Inference Job in the web interface.
Create an API key (web interface)
Create an API key to authenticate requests to an endpoint.
Connect to a Managed Inference endpoint (vLLM)
Point a client at the endpoint and send requests.
Create a Managed Inference Job (Parakeet)
Create a speech-to-text Parakeet job from the CLI.
Transcribe audio with a Parakeet endpoint
Upload an audio file to a Parakeet endpoint and get a transcription.
Check Managed Inference endpoint health
Report the status, uptime, and latency of every endpoint.
Create a model master
Set default serving parameters for a model.