Reproducible Dev Environments

Provision a cloud GPU machine with your environment, code, and SSH keys ready to go.
‍
Connect to it through SSH or your favorite IDE, or open it inside a Jupyter notebook.

Click here to get started!

~/project/dev.yaml


          resources:

            accelerators: A100:1

          workdir: ~/my_project

          setup:

            pip install -r requirements.txt

>_


          komo machine launch ~/project/dev.yaml --name dev

~/project/train.yaml


          resources:

            accelerators: A100:8

          workdir: ~/my_project

          setup:

            pip install -r requirements.txt

          run:

            python train.py --ngpus 8

>_


          komo job launch ~/project/train.yaml

Serverless Jobs

Launch batch jobs for tasks such as training, fine-tuning, and data processing. Easily scale to multiple nodes for distributed jobs.
‍
Once your job completes, the cloud instances will automatically be terminated. Never pay for idle instances ever again.

Launch your first job now!

Infinitely Scalable Models

Deploy AI models behind a safe and secure endpoint. With built-in load balancing and autoscaling, your models scale with traffic, ensuring you only pay for the compute you need.
‍
Use the serving framework of your choice (vLLM, Triton, etc.) for maximum flexibility.

Deploy your first model in minutes!

~/project/serve.yaml


          resources:

            accelerators: A100:1

            ports: 8000

          workdir: ~/my_project

          envs:

            HF_TOKEN: MY_TOKEN

          setup:

            pip install -r requirements.txt

          run:

            python -u -m vllm.entrypoints.openai.api_server \

              --port 8000 \

              --model meta-llama/Meta-Llama-3-8B-Instruct \

              ---trust-remote-code \

              --gpu-memory-utilization 0.95 \

              --max-num-seqs 64

          service:

            pip install -r requirements.txt

>_


          komo service launch ~/project/serve.yaml --name llama3

Multi-Cloud Execution

Use the Komodo Cloud or bring your own cloud account to consume your cloud credits and keep your data within your own private VPC.
‍
Connect multiple cloud accounts for maximum GPU availability at ultra-competitive prices. And with Kubernetes support, you can seamlessly overflow from your on-prem cluster to the cloud.

Komodo AI

The Easiest Way to Manage Cloud GPU Infrastructure

Reproducible Dev Environments

Serverless Jobs

Infinitely Scalable Models

Multi-Cloud Execution