Buying Compute

Topping up your account

  1. Coming soon

Access via chat interface:

  1. Click Buy > AI Chat when logged in. Inference is billed for the currently loaded model.

Access via OpenAI-compatible endpoint:

  1. Click Buy > API Key when logged in.
  2. Copy the API key and use it wherever you would use an OpenAI API key.
  3. Change the base URL to

Some codebases append the /v1 in code themselves, so you may need to try the base URL without the trailing /v1.
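
A minimal sketch of that configuration with the openai Python package (pre-1.0 style, matching the example further down). The hostname below is a placeholder, not a real endpoint; substitute the base URL shown on your API Key page:

import openai

openai.api_key = "YOUR_ACCESS_TOKEN"

# Placeholder hostname: use the base URL from the Buy > API Key page.
# Most clients expect the /v1 suffix to be part of the base URL:
openai.api_base = "https://your-endpoint.example/v1"

# If your library appends /v1 itself, configure it without the suffix instead:
# openai.api_base = "https://your-endpoint.example"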

Supported interfaces

  • OpenAI-compatible Fine-Tuning, Chat Inference, and Embeddings are currently supported.

Model names

  • Use Hugging Face model names, or your own fine-tuned model name (for now).
  • Feel free to use a publicly available URL to any GGUF file as an inference model name instead (our workers will download and run it).
  • Use fastembed:BAAI/bge-base-en-v1.5 (or any other fastembed model) for very fast /v1/embeddings endpoint use (see the sketch after this list).
  • Use repo/model:filter format for any Hugging Face model.
    • A worker will load your model.
    • First-time usage will be slow while the model propagates to different workers.
    • The more you use it, the faster it will get (it will gain priority over time).
  • For example: TheBloke/zephyr-7B-beta-GGUF:Q5_K_M works fine, and is comparable, in many ways, to GPT-3.
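
As a quick illustration of the embeddings model name above, here is a minimal sketch using the openai Python package (pre-1.0 style, matching the chat example below). The access token and base URL are placeholders to fill in from the Buy > API Key page.

import openai

# Placeholders: your key and the base URL from the Buy > API Key page
openai.api_key = "YOUR_ACCESS_TOKEN"
openai.api_base = ""

# Call the /v1/embeddings endpoint with a fastembed model name
response = openai.Embedding.create(
    model="fastembed:BAAI/bge-base-en-v1.5",
    input=["Hello World", "Bonjour le monde"],
)

# One embedding vector per input string
print(len(response["data"]), len(response["data"][0]["embedding"]))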

Python example code:

import openai

# Set up the OpenAI client (pre-1.0 openai package style)
openai.api_key = "YOUR_ACCESS_TOKEN"
openai.api_base = ""  # base URL from the Buy > API Key page

# Use the model with the chat completion endpoint
response = openai.ChatCompletion.create(
    model="TheBloke/zephyr-7B-beta-GGUF:Q5_K_M",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Translate the following English text to French: 'Hello World'"},
    ],
)

print(response["choices"][0]["message"]["content"])


Node example code:

const OpenAIApi = require('openai');

const openai = new OpenAIApi({
  apiKey: 'YOUR_ACCESS_TOKEN',
  baseURL: ''  // base URL from the Buy > API Key page
});

async function getCompletion() {
  const response = await openai.chat.completions.create({
    model: "TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
    messages: [
      {role: "system", content: "You are a helpful assistant."},
      {role: "user", content: "Translate the following English text to French: 'Hello World'"}
    ]
  });
  console.log(response.choices[0].message.content);
}

getCompletion();