Job configuration reference

You set these fields when you create a job, either in the web UI or with `cosmicac jobs create`. The job type determines which parameters apply. For the create flow, see Create a GPU Container Job and Create a Managed Inference Job.

Common fields

These fields apply to every job type.

Field	Required	Description
Job type	Yes	The kind of job to create: GPU Container or Managed Inference.
Job name	Yes	A name to identify the job.
Tags	Yes	One or more labels for the job. The CLI accepts a comma-separated list.
Location	Yes	Location code for where the job runs (`us` or `IN`).
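As a rough illustration, the common fields can be checked before a job is submitted. This is a minimal sketch: the function names and dictionary keys (`parse_tags`, `validate_common_fields`, `job_type`, and so on) are hypothetical and are not part of the `cosmicac` CLI or any platform API. The accepted values are copied from the table above.

```python
# Hypothetical pre-submit check for the common fields.
# None of these names come from the cosmicac CLI; they are illustrative only.

JOB_TYPES = {"GPU Container", "Managed Inference"}
LOCATIONS = {"us", "IN"}  # location codes copied exactly as written in the table


def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated tag string, trimming whitespace and dropping blanks."""
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


def validate_common_fields(fields: dict) -> list[str]:
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []
    if fields.get("job_type") not in JOB_TYPES:
        problems.append("job_type must be GPU Container or Managed Inference")
    if not str(fields.get("job_name", "")).strip():
        problems.append("job_name is required")
    if not parse_tags(fields.get("tags", "")):
        problems.append("at least one tag is required")
    if fields.get("location") not in LOCATIONS:
        problems.append("location must be us or IN")
    return problems
```

For example, `validate_common_fields({"job_type": "GPU Container", "job_name": "train-1", "tags": "team-a, exp", "location": "us"})` returns an empty list.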

GPU configuration

These fields select the job's hardware.

Field	Required	Description
GPU	Yes	GPU model (`GH100_H100_SXM5_80GB`).
GPU count	Yes	Number of GPUs.
CUDA / driver	Yes	CUDA and GPU driver version. Only `CUDA 12.9` is supported.

GPU Container parameters

These fields apply to a GPU Container Job.

Field	Required	Description
Base OS image	Yes	Base OS image for the container. Only `Ubuntu22.04/CUDA12.9` is supported.
Disk (GB)	Yes	Root disk size in GB. One of `250`, `500`, or `1000`.
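The GPU configuration and GPU Container parameters can be modelled together as one small record. The sketch below is illustrative only: the class name `GpuContainerSpec`, its attribute names, and its default values are assumptions, and the lower bound of one GPU is also assumed. The supported values come from the tables above.

```python
from dataclasses import dataclass

# Values taken from the tables above.
SUPPORTED_CUDA = "CUDA 12.9"
SUPPORTED_BASE_IMAGE = "Ubuntu22.04/CUDA12.9"
ALLOWED_DISK_GB = (250, 500, 1000)


@dataclass(frozen=True)
class GpuContainerSpec:
    """Hypothetical container for GPU Container job settings (not a platform API)."""

    gpu: str = "GH100_H100_SXM5_80GB"
    gpu_count: int = 1            # default is an assumption
    cuda: str = SUPPORTED_CUDA
    base_image: str = SUPPORTED_BASE_IMAGE
    disk_gb: int = 250            # default is an assumption

    def __post_init__(self) -> None:
        # Assumed lower bound: a job needs at least one GPU.
        if self.gpu_count < 1:
            raise ValueError("gpu_count must be at least 1")
        if self.cuda != SUPPORTED_CUDA:
            raise ValueError(f"only {SUPPORTED_CUDA} is supported")
        if self.base_image != SUPPORTED_BASE_IMAGE:
            raise ValueError(f"only {SUPPORTED_BASE_IMAGE} is supported")
        if self.disk_gb not in ALLOWED_DISK_GB:
            raise ValueError(f"disk_gb must be one of {ALLOWED_DISK_GB}")
```

Constructing `GpuContainerSpec(gpu_count=2, disk_gb=500)` succeeds, while `GpuContainerSpec(disk_gb=300)` raises `ValueError`.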

Managed Inference (vLLM) parameters

These fields apply to a vLLM Managed Inference Job.

Field	Required	Description
Model	Yes	Hugging Face model ID to serve (`Qwen/Qwen3-32B`).
Runtime image (CUDA)	Yes	Serving runtime image (`vllm-openai-0.8.5`).
Data type	Yes	Numeric precision the model runs at (`BF16` or `Auto`).
Quantization	Yes	Quantization scheme (`FP8` or `INT8`).
Tensor parallel	Yes	Number of GPUs to split the model across.
GPU memory utilization	Yes	Fraction of GPU memory to use, between `0` and `1`.
Max concurrent sequences	Yes	Maximum number of sequences (requests) processed at once.
Max model length	Yes	Maximum context length the model accepts, in tokens.
Reasoning parser	Yes	Parser for the model's reasoning output.
Video & image input	Yes	Whether the model accepts multimodal input. `true` or `false`.
Root disk size	Yes	VM root disk size in GB. One of `250`, `500`, or `1000`.
Environment variables	No	Environment variables passed to the inference service.
Endpoint name	Yes	Name of the inference endpoint. Must be unique across active inference jobs.
Replicas	Yes	Number of endpoint replicas.
Require Authorization header	Yes	Whether callers must send an authorization header. `true` or `false`.
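The numeric and boolean constraints in the vLLM table can be expressed as a short check. This is a sketch under stated assumptions: the function `check_vllm_params` and its dictionary keys are hypothetical, the count fields are assumed to be positive integers, and the range for GPU memory utilization is assumed to exclude `0` and include `1`. Only the disk sizes, the `0` to `1` range, and the `true`/`false` fields come from the table.

```python
# Hypothetical validation of vLLM Managed Inference parameters.
# Key names are illustrative; they are not cosmicac flags or API fields.

ROOT_DISK_GB = {250, 500, 1000}
COUNT_KEYS = ("tensor_parallel", "max_concurrent_sequences",
              "max_model_length", "replicas")
BOOL_KEYS = ("video_image_input", "require_authorization")


def check_vllm_params(params: dict) -> list[str]:
    """Return a list of error messages; empty means every check passed."""
    errors = []

    util = params.get("gpu_memory_utilization")
    # Assumption: 0 is excluded and 1 is allowed.
    if isinstance(util, bool) or not isinstance(util, (int, float)) or not 0 < util <= 1:
        errors.append("gpu_memory_utilization must be a fraction between 0 and 1")

    for key in COUNT_KEYS:
        value = params.get(key)
        # Assumption: these counts must be positive integers.
        if type(value) is not int or value < 1:
            errors.append(f"{key} must be a positive integer")

    if params.get("root_disk_gb") not in ROOT_DISK_GB:
        errors.append("root_disk_gb must be 250, 500, or 1000")

    for key in BOOL_KEYS:
        if not isinstance(params.get(key), bool):
            errors.append(f"{key} must be true or false")

    return errors
```

A caller would fix every reported message before submitting the job; the check does not verify that the model, runtime image, or reasoning parser exist.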

Managed Inference (Parakeet) parameters

These fields apply to a Parakeet Managed Inference Job.

Field	Required	Description
Model	Yes	Parakeet model to serve, `nvidia/parakeet-tdt-0.6b-v3`.
Endpoint name	Yes	Name of the transcription endpoint.
Chunk duration	Yes	Audio chunk length in seconds (`60`).
Chunk overlap	Yes	Overlap between chunks in seconds (`10`).
Max file size (MB)	Yes	Maximum upload size in MB (`2048`).
Require Authorization header	Yes	Whether callers must send an authorization header. `true` or `false`.
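To show how chunk duration and chunk overlap interact, the sketch below computes the time windows that a recording would be split into. It assumes that each new chunk starts `duration - overlap` seconds after the previous one and that the last chunk is truncated at the end of the audio; the service's actual chunking behavior is not documented here, and the function name `chunk_windows` is hypothetical.

```python
def chunk_windows(total_s: float, chunk_s: float = 60, overlap_s: float = 10):
    """Return (start, end) windows in seconds covering total_s of audio.

    Assumption: consecutive chunks start (chunk_s - overlap_s) seconds apart,
    and the final chunk is cut off at the end of the recording.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")

    step = chunk_s - overlap_s
    windows = []
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break  # the recording is fully covered
        start += step
    return windows


# A 150-second file with 60-second chunks and 10 seconds of overlap.
print(chunk_windows(150))
```

Under these assumptions a 150-second file yields three windows, starting at 0, 50, and 100 seconds, with each neighboring pair sharing 10 seconds of audio.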