Job configuration reference
Fields you set when you create a GPU Container Job or a Managed Inference Job.
Set these fields in the web UI or with `cosmicac jobs create`. The job type determines which parameters apply. For the create flow, see Create a GPU Container Job and Create a Managed Inference Job.
Common fields
These fields apply to every job type.
| Field | Required | Description |
|---|---|---|
| Job type | Yes | The kind of job to create: GPU Container or Managed Inference. |
| Job name | Yes | A name to identify the job. |
| Tags | Yes | One or more labels for the job. The CLI accepts a comma-separated list. |
| Location | Yes | Location code for where the job runs (us or IN). |
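
The Tags field takes a comma-separated list on the CLI. The sketch below shows one way to normalize such a string before passing it on. `parse_tags` is a hypothetical helper, not part of `cosmicac`, and trimming whitespace around each label is an assumption; this reference does not say how the CLI treats spaces.

```python
def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated tag string into a clean list of labels.

    Hypothetical helper for illustration only; not part of the cosmicac CLI.
    """
    # Assumption: whitespace around each label is not significant.
    tags = [part.strip() for part in raw.split(",")]
    # Drop empty entries left behind by stray or trailing commas.
    tags = [tag for tag in tags if tag]
    if not tags:
        # Tags is a required field, so at least one label must remain.
        raise ValueError("at least one tag is required")
    return tags


print(parse_tags("training, llm ,nightly"))  # ['training', 'llm', 'nightly']
```
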
GPU configuration
These fields select the job's hardware.
| Field | Required | Description |
|---|---|---|
| GPU | Yes | GPU model (GH100_H100_SXM5_80GB). |
| GPU count | Yes | Number of GPUs. |
| CUDA / driver | Yes | GPU driver version. Only CUDA 12.9 is supported. |
GPU Container parameters
These fields apply to a GPU Container Job.
| Field | Required | Description |
|---|---|---|
| Base OS image | Yes | Base OS image for the container. Only Ubuntu22.04/CUDA12.9 is supported. |
| Disk (GB) | Yes | Root disk size in GB. One of 250, 500, or 1000. |
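
The constraints in the GPU and GPU Container tables can be checked before a job is submitted. The following is a minimal sketch, not the platform's own validation. `GpuContainerConfig` and `validate` are hypothetical names, and the rule that GPU count must be at least 1 is an assumption.

```python
from dataclasses import dataclass

# Disk sizes listed in the GPU Container parameters table.
ALLOWED_DISK_GB = (250, 500, 1000)


@dataclass
class GpuContainerConfig:
    """Hypothetical container for the fields described above."""
    gpu: str
    gpu_count: int
    disk_gb: int


def validate(cfg: GpuContainerConfig) -> list[str]:
    """Return a list of human-readable problems; empty means the config looks valid."""
    errors = []
    if not cfg.gpu:
        errors.append("GPU model is required")
    if cfg.gpu_count < 1:
        # Assumption: a job needs at least one GPU.
        errors.append("GPU count must be at least 1")
    if cfg.disk_gb not in ALLOWED_DISK_GB:
        errors.append(f"Disk (GB) must be one of {ALLOWED_DISK_GB}")
    return errors


cfg = GpuContainerConfig(gpu="GH100_H100_SXM5_80GB", gpu_count=2, disk_gb=500)
print(validate(cfg))  # []
```
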
Managed Inference (vLLM) parameters
These fields apply to a vLLM Managed Inference Job.
| Field | Required | Description |
|---|---|---|
| Model | Yes | Hugging Face model ID to serve (Qwen/Qwen3-32B). |
| Runtime image (CUDA) | Yes | Serving runtime image (vllm-openai-0.8.5). |
| Data type | Yes | Numeric precision the model runs at (BF16 or Auto). |
| Quantization | Yes | Quantization scheme (FP8 or INT8). |
| Tensor parallel | Yes | Number of GPUs to split the model across. |
| GPU memory utilization | Yes | Fraction of GPU memory to use, between 0 and 1. |
| Max concurrent sequences | Yes | Maximum number of sequences (requests) processed at the same time. |
| Max model length | Yes | Maximum model context length, in tokens. |
| Reasoning parser | Yes | Parser for the model's reasoning output. |
| Video & image input | Yes | Whether the model accepts multimodal input. Set to true or false. |
| Root disk size | Yes | VM root disk size in GB. One of 250, 500, or 1000. |
| Environment variables | No | Environment variables passed to the inference service. |
| Endpoint name | Yes | Name of the inference endpoint. Must be unique across active inference jobs. |
| Replicas | Yes | Number of endpoint replicas. |
| Require Authorization header | Yes | Whether callers must send an authorization header. Set to true or false. |
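
Some of the vLLM fields interact with the GPU configuration. The sketch below checks two of them: GPU memory utilization and tensor parallelism. `check_vllm_settings` is a hypothetical helper. Treating the utilization range as excluding 0 and including 1, and requiring tensor parallel to be no larger than GPU count, are assumptions rather than documented platform rules.

```python
def check_vllm_settings(
    gpu_count: int,
    tensor_parallel: int,
    gpu_memory_utilization: float,
) -> list[str]:
    """Return a list of problems with the given settings; empty means none found."""
    problems = []
    # Assumption: "between 0 and 1" means 0 < value <= 1.
    if not 0 < gpu_memory_utilization <= 1:
        problems.append("GPU memory utilization must be between 0 and 1")
    if tensor_parallel < 1:
        problems.append("Tensor parallel must be at least 1")
    elif tensor_parallel > gpu_count:
        # Assumption: the model cannot be split across more GPUs than the job has.
        problems.append("Tensor parallel cannot exceed GPU count")
    return problems


print(check_vllm_settings(gpu_count=4, tensor_parallel=4, gpu_memory_utilization=0.9))  # []
print(check_vllm_settings(gpu_count=2, tensor_parallel=4, gpu_memory_utilization=1.5))
```

The second call reports both problems: the utilization is out of range and the tensor parallel value is larger than the GPU count.
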
Managed Inference (Parakeet) parameters
These fields apply to a Parakeet Managed Inference Job.
| Field | Required | Description |
|---|---|---|
| Model | Yes | Parakeet model to serve (nvidia/parakeet-tdt-0.6b-v3). |
| Endpoint name | Yes | Name of the transcription endpoint. |
| Chunk duration | Yes | Audio chunk length in seconds (60). |
| Chunk overlap | Yes | Overlap between chunks in seconds (10). |
| Max file size (MB) | Yes | Maximum upload size in MB (2048). |
| Require Authorization header | Yes | Whether callers must send an authorization header. Set to true or false. |
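
To see how chunk duration and chunk overlap relate, the sketch below computes the time windows a file would be split into. It assumes that each chunk starts (duration minus overlap) seconds after the previous one, which is a common chunking scheme. The service's actual chunking is not described here, and `chunk_windows` is a hypothetical helper.

```python
def chunk_windows(
    total_seconds: float,
    chunk: float = 60.0,
    overlap: float = 10.0,
) -> list[tuple[float, float]]:
    """Return (start, end) windows in seconds that cover the whole audio file."""
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("chunk must be positive and overlap must be in [0, chunk)")
    total_seconds = float(total_seconds)
    step = chunk - overlap  # distance between consecutive chunk starts
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break  # the last window already reaches the end of the file
        start += step
    return windows


print(chunk_windows(150))
# [(0.0, 60.0), (50.0, 110.0), (100.0, 150.0)]
```

With the values shown in the table (60 and 10), a 150-second file yields three windows under this assumption, and each pair of neighbouring windows shares 10 seconds of audio.
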