Buying Compute
Under construction
Topping up your account
- Coming soon
Access via chat interface:
- Click Buy > AI Chat when logged in. Inference is billed for the currently loaded model.
Access via OpenAI-compatible endpoint:
- Click Buy > API Key when logged in
- Copy the API key and use it wherever you would use an OpenAI API key
- Change the base URL to:
https://queenbee.gputopia.ai/v1
Some codebases append /v1 in code, so you may need to use just https://queenbee.gputopia.ai instead.
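The /v1 handling can be checked in one place before configuring a client. The following is a minimal sketch using only the Python standard library; the helper name with_v1 and its client_appends_v1 flag are hypothetical and not part of any GPUtopia or OpenAI API.

```python
from urllib.parse import urlsplit, urlunsplit


def with_v1(base_url: str, client_appends_v1: bool = False) -> str:
    """Return the base URL to configure, with or without a trailing /v1.

    client_appends_v1 is a hypothetical flag: set it to True when the
    client library adds "/v1" to every request path on its own.
    """
    parts = urlsplit(base_url.strip())
    path = parts.path.rstrip("/")
    # Drop an existing /v1 suffix so it is never duplicated.
    if path.endswith("/v1"):
        path = path[: -len("/v1")]
    # Add /v1 back only when the client does not do so itself.
    if not client_appends_v1:
        path += "/v1"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(with_v1("https://queenbee.gputopia.ai"))
print(with_v1("https://queenbee.gputopia.ai/v1/", client_appends_v1=True))
```

The first call yields the URL with /v1, the second the bare host, matching the two variants described above.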
Supported interfaces
- OpenAI-compatible fine-tuning, chat inference, and embeddings are currently supported.
Model names
- Use Hugging Face model names, or your own fine-tuned model name (for now).
- You can also use a publicly available URL to any GGUF file as the inference model name (our workers will download and run it).
- Use fastembed:BAAI/bge-base-en-v1.5 (or any other fastembed model) for very fast embeddings via the /v1/embeddings endpoint.
- Use the repo/model:filter format for any Hugging Face model.
- A worker will load your model.
- First-time usage will be slow while the model propagates to different workers.
- The more you use a model, the faster it gets (it gains priority over time).
- For example, TheBloke/zephyr-7B-beta-GGUF:Q5_K_M works fine and is comparable, in many ways, to GPT-3.
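The naming forms listed above can be told apart on the client side before a request is sent. This is only an illustrative sketch: the ModelRef type, the parse_model_name helper, and the kind labels are hypothetical, and the service may parse names differently. A fine-tuned model name without a colon would fall into the last branch with no filter.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    kind: str               # hypothetical labels: "url", "fastembed", "huggingface"
    name: str               # URL, fastembed model name, or repo/model
    filter: Optional[str]   # the part after ":" in repo/model:filter, if any


def parse_model_name(model: str) -> ModelRef:
    """Split a model name into the forms described in the list above."""
    # A public URL to a GGUF file is passed through unchanged.
    if model.startswith(("http://", "https://")):
        return ModelRef("url", model, None)
    # fastembed models carry a "fastembed:" prefix.
    if model.startswith("fastembed:"):
        return ModelRef("fastembed", model[len("fastembed:"):], None)
    # Everything else is treated as repo/model with an optional :filter.
    name, sep, flt = model.partition(":")
    return ModelRef("huggingface", name, flt if sep else None)


print(parse_model_name("TheBloke/zephyr-7B-beta-GGUF:Q5_K_M"))
print(parse_model_name("fastembed:BAAI/bge-base-en-v1.5"))
```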
Python example code:
    import openai

    # Set up the OpenAI client (this example uses the pre-1.0 interface of the openai package)
    openai.api_key = "YOUR_ACCESS_TOKEN"
    openai.api_base = "https://queenbee.gputopia.ai/v1"

    # Use the model with the chat completion endpoint
    response = openai.ChatCompletion.create(
        model="TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Translate the following English text to French: 'Hello World'"},
        ],
    )
    print(response.choices[0].message["content"].strip())
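For the /v1/embeddings endpoint, a request can be sketched with the standard library alone. This assumes the endpoint accepts the usual OpenAI-style JSON body ("model" and "input") and a Bearer token header; the helper name build_embeddings_request is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://queenbee.gputopia.ai/v1"


def build_embeddings_request(
    api_key: str,
    texts: list[str],
    model: str = "fastembed:BAAI/bge-base-en-v1.5",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the embeddings endpoint."""
    # Assumption: the body follows the OpenAI embeddings request shape.
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_embeddings_request("YOUR_ACCESS_TOKEN", ["Hello World"])
print(req.get_method(), req.full_url)
```

To send it, pass req to urllib.request.urlopen and decode the JSON body. If the response follows OpenAI's format, the vectors would be under data[i]["embedding"], but that is an assumption to verify against a real response.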
Node example code:
    const OpenAIApi = require('openai');

    const openai = new OpenAIApi({
      apiKey: 'YOUR_GPUTOPIA_API_KEY',
      baseURL: 'https://queenbee.gputopia.ai/v1'
    });

    async function getCompletion() {
      const response = await openai.chat.completions.create({
        model: "TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
        messages: [
          {role: "system", content: "You are a helpful assistant."},
          {role: "user", content: "Translate the following English text to French: 'Hello World'"}
        ]
      });
      console.log(response.choices[0].message.content.trim());
    }

    getCompletion();