Buying Compute

Topping up your account

  1. Coming soon

Access via chat interface:

  1. Click Buy > AI Chat when logged in. Inference is billed for the currently loaded model.

Access via OpenAI-compatible endpoint:

  1. Click Buy > API Key when logged in
  2. Copy the API key and use it wherever you would use an OpenAI API key
  3. Change the base URL to https://queenbee.gputopia.ai/v1

Some codebases append /v1 in their own code. In that case, use just https://queenbee.gputopia.ai as the base URL.
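The /v1 suffix handling can be sketched as two small helpers. This is a minimal illustration, not part of any GPUtopia library: the function names with_v1 and without_v1 are hypothetical, and the only value taken from this page is the base URL itself.

```python
def with_v1(base_url: str) -> str:
    """Return base_url ending in exactly one '/v1' segment."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed
    return trimmed + "/v1"


def without_v1(base_url: str) -> str:
    """Return base_url with a trailing '/v1' segment removed, if present."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed[: -len("/v1")]
    return trimmed
```

Use with_v1 for clients that expect the full base URL, and without_v1 for codebases that add the suffix themselves.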

Supported interfaces

  • OpenAI-compatible fine-tuning, chat inference, and embeddings are currently supported
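As a rough sketch of the embeddings interface, the snippet below builds a request in the shape the OpenAI embeddings API uses (POST to /embeddings with a JSON body of "model" and "input", and a Bearer token). It assumes the endpoint mirrors that shape, including a response with a "data" list of items that each carry an "embedding". The helper names build_embeddings_request and embed are hypothetical, and the model name is the fastembed example given under Model names.

```python
import json
import urllib.request

BASE_URL = "https://queenbee.gputopia.ai/v1"


def build_embeddings_request(api_key, texts, model="fastembed:BAAI/bge-base-en-v1.5"):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        BASE_URL + "/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(api_key, texts):
    """Send the request and return one vector per input text.

    Assumes an OpenAI-style response body: {"data": [{"embedding": [...]}, ...]}.
    """
    request = build_embeddings_request(api_key, texts)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return [item["embedding"] for item in body["data"]]
```

Calling embed("YOUR_GPUTOPIA_API_KEY", ["Hello World"]) would send the request. Building the request separately makes it easy to inspect the URL and payload before spending any credit.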

Model names

  • Use Hugging Face model names, or your own fine-tuned model name (for now)
  • You can also use a publicly available URL to any GGUF file as the inference model name (our workers will download and run it)
  • Use fastembed:BAAI/bge-base-en-v1.5 (or any other fastembed model) for very fast use of the /v1/embeddings endpoint
  • Use the repo/model:filter format for any Hugging Face model.
    • A worker will load your model.
    • First-time use will be slow while the model propagates to different workers
    • The more you use it, the faster it gets (it gains priority over time)
  • For example, TheBloke/zephyr-7B-beta-GGUF:Q5_K_M works well and is comparable in many ways to GPT-3
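The naming forms listed above can be told apart with a small client-side check. The sketch below is only an illustration: describe_model_name is a hypothetical helper, and how the service itself parses model names is not documented here.

```python
def describe_model_name(name: str) -> dict:
    """Classify a model name into one of the forms listed on this page."""
    # A public URL to a GGUF file; check this first because URLs contain ':'.
    if name.startswith(("http://", "https://")):
        return {"kind": "gguf_url", "url": name}
    # fastembed:<model> names are used with the embeddings endpoint.
    if name.startswith("fastembed:"):
        return {"kind": "fastembed", "model": name[len("fastembed:"):]}
    # repo/model:filter, e.g. TheBloke/zephyr-7B-beta-GGUF:Q5_K_M
    repo, sep, model_filter = name.partition(":")
    if sep:
        return {"kind": "huggingface", "repo": repo, "filter": model_filter}
    # Anything else: a bare Hugging Face name or a fine-tuned model name.
    return {"kind": "plain", "model": name}
```

A check like this can catch a malformed name locally before a request is sent and billed.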

Python example code:

    # Uses the pre-1.0 interface of the openai Python package
    import openai

    # Set up the OpenAI client
    openai.api_key = "YOUR_ACCESS_TOKEN"
    openai.api_base = "https://queenbee.gputopia.ai/v1"

    # Use the model with the chat completion endpoint
    response = openai.ChatCompletion.create(
        model="TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Translate the following English text to French: 'Hello World'"},
        ],
    )

    print(response.choices[0].message["content"].strip())

Node example code:

    const OpenAIApi = require('openai');

    const openai = new OpenAIApi({
      apiKey: 'YOUR_GPUTOPIA_API_KEY',
      baseURL: 'https://queenbee.gputopia.ai/v1'
    });

    async function getCompletion() {
      const response = await openai.chat.completions.create({
        model: "TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
        messages: [
          {role: "system", content: "You are a helpful assistant."},
          {role: "user", content: "Translate the following English text to French: 'Hello World'"}
        ]
      });

      console.log(response.choices[0].message.content.trim());
    }

    getCompletion();