The Easiest Way to Manage Cloud GPU Infrastructure

Develop, train, and deploy AI models on your cloud or ours. No code changes needed.

Reproducible Dev Environments

Provision a cloud GPU machine with your environment, code, and SSH keys ready to go.

Connect to it through SSH or your favorite IDE, or open it inside a Jupyter notebook.

~/project/dev.yaml
resources:
  accelerators: A100:1
workdir: ~/my_project
setup:
  pip install -r requirements.txt
>_
komo machine launch ~/project/dev.yaml --name dev
~/project/train.yaml
resources:
  accelerators: A100:8
workdir: ~/my_project
setup:
  pip install -r requirements.txt
run:
  python train.py --ngpus 8
>_
komo job launch ~/project/train.yaml

Serverless Jobs

Launch batch jobs for tasks such as training, fine-tuning, and data processing. Easily scale to multiple nodes for distributed jobs.

Once your job completes, the cloud instances will automatically be terminated. Never pay for idle instances ever again.

Infinitely Scalable Models

Deploy AI models behind a safe and secure endpoint. With built-in load balancing and autoscaling, your models scale with traffic, ensuring you only pay for the compute you need.

Use the serving framework of your choice (vLLM, Triton, etc.) for maximum flexibility.

~/project/serve.yaml
resources:
  accelerators: A100:1
  ports: 8000
workdir: ~/my_project
envs:
  HF_TOKEN: MY_TOKEN
setup:
  pip install -r requirements.txt
run:
  python -u -m vllm.entrypoints.openai.api_server \
    --port 8000 \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    ---trust-remote-code \
    --gpu-memory-utilization 0.95 \
    --max-num-seqs 64
service:
  pip install -r requirements.txt
>_
komo service launch ~/project/serve.yaml --name llama3

Multi-Cloud Execution

Use the Komodo Cloud or bring your own cloud account to consume your cloud credits and keep your data within your own private VPC.

Connect multiple cloud accounts for maximum GPU availability at ultra-competitive prices. And with Kubernetes support, you can seamlessly overflow from your on-prem cluster to the cloud.