Job configuration reference

Fields you set when you create a GPU Container Job or a Managed Inference Job.

You set these fields when you create a job, in the web UI or with cosmicac jobs create. The job type determines which parameters apply. For the create flow, see Create a GPU Container Job and Create a Managed Inference Job.

Common fields

These fields apply to every job type.

| Field | Required | Description |
| --- | --- | --- |
| Job type | Yes | The kind of job to create, GPU Container or Managed Inference. |
| Job name | Yes | A name to identify the job. |
| Tags | Yes | One or more labels for the job. The CLI accepts a comma-separated list. |
| Location | Yes | Location code for where the job runs (us or IN). |
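
To illustrate the comma-separated tag format, the following Python sketch splits a tag string into individual labels. The helper name `parse_tags` is hypothetical, and trimming whitespace and dropping empty entries are assumptions rather than documented CLI behavior.

```python
def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated tag string into a list of labels.

    Assumption: surrounding whitespace is ignored and empty entries are dropped.
    """
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


# Example tag values are illustrative only.
tags = parse_tags("training, llm,experiment-7")
if not tags:
    # Tags are a required field, so an empty list is rejected here.
    raise ValueError("at least one tag is required")
```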

GPU configuration

These fields select the job's hardware.

| Field | Required | Description |
| --- | --- | --- |
| GPU | Yes | GPU model (GH100_H100_SXM5_80GB). |
| GPU count | Yes | Number of GPUs. |
| CUDA / driver | Yes | GPU driver version. Only CUDA 12.9 is supported. |

GPU Container parameters

These fields apply to a GPU Container Job.

| Field | Required | Description |
| --- | --- | --- |
| Base OS image | Yes | Base OS image for the container. Only Ubuntu22.04/CUDA12.9 is supported. |
| Disk (GB) | Yes | Root disk size in GB. One of 250, 500, or 1000. |
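
The constraints on a GPU Container Job can be checked before submitting it. The sketch below is one way to do that in Python. The class, function, and attribute names are hypothetical and do not come from the CosmicAC API or CLI. Only the allowed values (the base image string and the disk sizes) are taken from this reference, and the minimum GPU count of 1 is an assumption.

```python
from dataclasses import dataclass

# Allowed values as listed in this reference.
ALLOWED_DISK_GB = (250, 500, 1000)
SUPPORTED_BASE_IMAGE = "Ubuntu22.04/CUDA12.9"


@dataclass
class GpuContainerParams:
    # Hypothetical attribute names; the real API or CLI may name them differently.
    base_os_image: str
    disk_gb: int
    gpu_count: int


def validate_container(params: GpuContainerParams) -> list[str]:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if params.base_os_image != SUPPORTED_BASE_IMAGE:
        problems.append(f"base OS image must be {SUPPORTED_BASE_IMAGE}")
    if params.disk_gb not in ALLOWED_DISK_GB:
        problems.append(f"disk must be one of {ALLOWED_DISK_GB} GB")
    if params.gpu_count < 1:
        # Assumption: a job needs at least one GPU.
        problems.append("GPU count must be at least 1")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid field in a single pass.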

Managed Inference (vLLM) parameters

These fields apply to a vLLM Managed Inference Job.

| Field | Required | Description |
| --- | --- | --- |
| Model | Yes | Hugging Face model ID to serve (Qwen/Qwen3-32B). |
| Runtime image (CUDA) | Yes | Serving runtime image (vllm-openai-0.8.5). |
| Data type | Yes | Numeric precision the model runs at (BF16 or Auto). |
| Quantisation | Yes | Quantization scheme (FP8 or INT8). |
| Tensor parallel | Yes | Number of GPUs to split the model across. |
| GPU memory utilization | Yes | Fraction of GPU memory to use, between 0 and 1. |
| Max concurrent sequences | Yes | Maximum requests handled at once. |
| Max model length | Yes | Maximum model context length. |
| Reasoning parser | Yes | Parser for the model's reasoning output. |
| Video & image input | Yes | Whether the model accepts multimodal input. true or false. |
| Root disk size | Yes | VM root disk size in GB. One of 250, 500, or 1000. |
| Environment variables | No | Environment variables passed to the inference service. |
| Endpoint name | Yes | Name of the inference endpoint. Must be unique across active inference jobs. |
| Replicas | Yes | Number of endpoint replicas. |
| Require Authorization header | Yes | Whether callers must send an authorization header. true or false. |
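
Several of the vLLM parameters are numeric and constrain one another. The Python sketch below shows one way to sanity-check them before creating a job. The dictionary keys and the function name are hypothetical, not CosmicAC API names. The checks that tensor parallel must not exceed the GPU count, that a utilization of exactly 0 is rejected, and that counts must be at least 1 are assumptions. The numeric values in the example are illustrative only.

```python
ALLOWED_ROOT_DISK_GB = (250, 500, 1000)  # sizes listed in this reference


def validate_vllm_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []

    util = params["gpu_memory_utilization"]
    if not 0 < util <= 1:
        # Assumption: 0 is rejected because the service needs some GPU memory.
        problems.append("GPU memory utilization must be greater than 0 and at most 1")

    tp = params["tensor_parallel"]
    if tp < 1:
        problems.append("tensor parallel must be at least 1")
    elif tp > params["gpu_count"]:
        # Assumption: the model cannot be split across more GPUs than the job has.
        problems.append("tensor parallel must not exceed GPU count")

    if params["root_disk_gb"] not in ALLOWED_ROOT_DISK_GB:
        problems.append(f"root disk size must be one of {ALLOWED_ROOT_DISK_GB} GB")

    # Assumption: these counts and lengths must be positive.
    for key in ("max_concurrent_sequences", "max_model_length", "replicas"):
        if params[key] < 1:
            problems.append(f"{key} must be at least 1")

    return problems


# Hypothetical example; only the model ID comes from this reference.
example = {
    "model": "Qwen/Qwen3-32B",
    "gpu_count": 2,
    "tensor_parallel": 2,
    "gpu_memory_utilization": 0.9,
    "max_concurrent_sequences": 64,
    "max_model_length": 32768,
    "root_disk_gb": 500,
    "replicas": 1,
}
print(validate_vllm_params(example))  # prints []
```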

Managed Inference (Parakeet) parameters

These fields apply to a Parakeet Managed Inference Job.

| Field | Required | Description |
| --- | --- | --- |
| Model | Yes | Parakeet model to serve, nvidia/parakeet-tdt-0.6b-v3. |
| Endpoint name | Yes | Name of the transcription endpoint. |
| Chunk duration | Yes | Audio chunk length in seconds (60). |
| Chunk overlap | Yes | Overlap between chunks in seconds (10). |
| Max file size (MB) | Yes | Maximum upload size in MB (2048). |
| Require Authorization header | Yes | Whether callers must send an authorization header. true or false. |
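
To show how chunk duration and chunk overlap interact, the Python sketch below computes the audio windows for a file of a given length. It assumes each window starts `chunk - overlap` seconds after the previous one, which is a common chunking scheme but is not confirmed as the service's actual behavior. The function name `chunk_windows` is hypothetical.

```python
def chunk_windows(total_seconds: float, chunk: float = 60, overlap: float = 10):
    """Return (start, end) windows in seconds that cover the whole audio.

    Assumption: consecutive windows share `overlap` seconds of audio.
    """
    if not 0 <= overlap < chunk:
        raise ValueError("overlap must be non-negative and smaller than the chunk duration")

    step = chunk - overlap  # distance between consecutive window starts
    total = float(total_seconds)
    windows = []
    start = 0.0
    while start < total:
        end = min(start + chunk, total)
        windows.append((start, end))
        if end >= total:
            break  # the last window reached the end of the audio
        start += step
    return windows


# A 150-second file with 60-second chunks and a 10-second overlap.
print(chunk_windows(150))
# [(0.0, 60.0), (50.0, 110.0), (100.0, 150.0)]
```

Under this assumption, a larger overlap produces more windows for the same file, because each window advances by a shorter step.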
