Buying Compute

Topping up your account

  1. Coming soon

Access via chat interface:

  1. Click Buy > AI Chat when logged in. Inference is billed for the currently loaded model.

Access via OpenAI-compatible endpoint:

  1. Click Buy > API Key when logged in
  2. Copy the API key and use it wherever you would use an OpenAI API key
  3. Change the base URL to https://queenbee.gputopia.ai/v1

Some client libraries append the /v1 themselves, so you may need to use just https://queenbee.gputopia.ai as the base URL.
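
For example, with the OpenAI Python SDK (v1+) the full /v1 base URL is expected; a minimal sketch (the API key is a placeholder):

from openai import OpenAI

# Most clients want the full /v1 base URL
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://queenbee.gputopia.ai/v1",
)

# If your library appends /v1 on its own, drop it from the base instead:
# client = OpenAI(api_key="YOUR_API_KEY", base_url="https://queenbee.gputopia.ai")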

Supported interfaces

  • OpenAI-compatible fine-tuning, chat inference, and embeddings are currently supported

Model names

  • Use Hugging Face model names, or your own fine-tuned model name (for now)
  • Feel free to use a publicly available URL to any GGUF file as an inference model name instead (our workers will download and run it)
  • Use fastembed:BAAI/bge-base-en-v1.5 (or any other fastembed model) for very fast /v1/embeddings endpoint use (see the embeddings sketch after this list)
  • Use the repo/model:filter format for any Hugging Face model.
    • A worker will load your model.
    • First-time use will be slow while the model propagates to different workers.
    • The more you use it, the faster it gets (the model gains priority over time).
  • For example, TheBloke/zephyr-7B-beta-GGUF:Q5_K_M works well and is comparable, in many ways, to GPT-3.
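
As a quick illustration of the embeddings endpoint, here is a minimal sketch using the OpenAI Python SDK (v1+) and the fastembed model named above; the API key is a placeholder:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ACCESS_TOKEN",
    base_url="https://queenbee.gputopia.ai/v1",
)

# Embed a couple of strings with the fastembed-backed model
response = client.embeddings.create(
    model="fastembed:BAAI/bge-base-en-v1.5",
    input=["Hello World", "Bonjour le monde"],
)

print(len(response.data), "vectors, dimension", len(response.data[0].embedding))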

Python example code:

from openai import OpenAI

# Set up the OpenAI client, pointed at the GPUtopia endpoint

client = OpenAI(
    api_key="YOUR_ACCESS_TOKEN",
    base_url="https://queenbee.gputopia.ai/v1",
)

# Use the model with the chat completions endpoint

response = client.chat.completions.create(
    model="TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Translate the following English text to French: 'Hello World'"},
    ],
)

print(response.choices[0].message.content.strip())

Node example code:

const OpenAI = require('openai');

const openai = new OpenAI({
  apiKey: 'YOUR_GPUTOPIA_API_KEY',
  baseURL: 'https://queenbee.gputopia.ai/v1'
});

async function getCompletion() {
  const response = await openai.chat.completions.create({
    model: "TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
    messages: [
      {role: "system", content: "You are a helpful assistant."},
      {role: "user", content: "Translate the following English text to French: 'Hello World'"}
    ]
  });

  console.log(response.choices[0].message.content.trim());
}

getCompletion();
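
Fine-tuning goes through the same OpenAI-compatible surface. The sketch below assumes the files and fine-tuning job endpoints mirror OpenAI's; the JSONL path and base model are placeholders, not confirmed names:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ACCESS_TOKEN",
    base_url="https://queenbee.gputopia.ai/v1",
)

# Upload a JSONL training file (placeholder path), assuming an OpenAI-style files endpoint
training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")

# Start a fine-tuning job against a placeholder base model
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="TheBloke/zephyr-7B-beta-GGUF:Q5_K_M",
)

print(job.id, job.status)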