Buying Compute
Under construction
Topping up your account
- Coming soon
Access via chat interface:
- Click Buy > AI Chat when logged in. Inference is billed for the currently loaded model.
Access via OpenAI-compatible endpoint:
- Click Buy > API Key when logged in
- Copy the API key and use it wherever you would use an OpenAI API key
- Change the base URL to https://queenbee.gputopia.ai/v1. Some client libraries append the /v1 themselves, so you may need to use just https://queenbee.gputopia.ai instead; see the request sketch below.
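Because the endpoint follows the standard OpenAI REST shape, you can also call it directly. Here is a minimal sketch using Python's requests library, assuming /v1/chat/completions behaves like OpenAI's route; the API key is a placeholder, and the model name is the zephyr example used further down this page.

import requests

API_KEY = "YOUR_ACCESS_TOKEN"  # placeholder: the key copied from Buy > API Key
BASE_URL = "https://queenbee.gputopia.ai/v1"

# OpenAI-style bearer auth against the GPUtopia endpoint
resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "TheBloke/zephyr-7B-beta-GGUF:Q5_K_M",
        "messages": [{"role": "user", "content": "Say hello"}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])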
Supported interfaces
- OpenAI-compatible fine-tuning, chat inference, and embeddings are currently supported
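Fine-tuning is assumed here to mirror the standard OpenAI flow: upload a JSONL training file, then create a fine-tuning job. The following is a rough sketch using the legacy openai<1.0 Python SDK; the filename and base model name are placeholders, and running this exact flow against queenbee is an assumption based on the OpenAI compatibility noted above.

import openai

openai.api_key = "YOUR_ACCESS_TOKEN"
openai.api_base = "https://queenbee.gputopia.ai/v1"

# Upload a JSONL file of training examples (placeholder filename)
training_file = openai.File.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job against a base model (placeholder model name; format assumed)
job = openai.FineTuningJob.create(
    training_file=training_file.id,
    model="some-org/some-base-model",
)
print(job.id)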
Model names
- Use Hugging Face model names, or your own fine-tuned model name (for now)
- Feel free to use a publicly available URL to any GGUF file as the inference model name instead (our workers will download and run it)
- Use fastembed:BAAI/bge-base-en-v1.5 (or any other fastembed model) for very fast /v1/embeddings calls; see the sketch after this list
- Use the repo/model:filter format for any Hugging Face model.
- A worker will load your model.
- First-time usage will be slow while the model propagates to different workers
- The more you use it, the faster it will get (it gains priority over time)
- For example, TheBloke/zephyr-7B-beta-GGUF:Q5_K_M works fine and is comparable, in many ways, to GPT-3
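As an embeddings example, the fastembed model mentioned above can be called through the standard /v1/embeddings route. A short sketch using the legacy openai<1.0 Python SDK; the input strings are arbitrary.

import openai

openai.api_key = "YOUR_ACCESS_TOKEN"
openai.api_base = "https://queenbee.gputopia.ai/v1"

# Request embeddings from the fastembed-backed model
response = openai.Embedding.create(
    model="fastembed:BAAI/bge-base-en-v1.5",
    input=["Hello world", "GPUtopia embeddings example"],
)

# One embedding vector per input string
vectors = [item["embedding"] for item in response["data"]]
print(len(vectors), len(vectors[0]))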
Python example code (chat completion, using the legacy openai<1.0 Python SDK):

import openai

# Point the legacy OpenAI client at GPUtopia's OpenAI-compatible endpoint
openai.api_key = "YOUR_ACCESS_TOKEN"
openai.api_base = "https://queenbee.gputopia.ai/v1"

# Use the model with the chat completion endpoint
response = openai.ChatCompletion.create(
    model="TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Translate the following English text to French: 'Hello World'"},
    ],
)

print(response.choices[0].message["content"].strip())
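If you are on openai>=1.0, the same call looks roughly like this; only the client construction differs (a sketch, assuming the endpoint behaves identically):

from openai import OpenAI

# openai>=1.0 style: api_key and base_url are passed to the client
client = OpenAI(
    api_key="YOUR_ACCESS_TOKEN",
    base_url="https://queenbee.gputopia.ai/v1",
)

response = client.chat.completions.create(
    model="TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Translate the following English text to French: 'Hello World'"},
    ],
)

print(response.choices[0].message.content.strip())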
Node example code (chat completion, using the openai npm package v4+):

const OpenAI = require('openai');

// Point the client at GPUtopia's OpenAI-compatible endpoint
const openai = new OpenAI({
  apiKey: 'YOUR_GPUTOPIA_API_KEY',
  baseURL: 'https://queenbee.gputopia.ai/v1'
});

async function getCompletion() {
  const response = await openai.chat.completions.create({
    model: "TheBloke/vicuna-7B-v1.5-GGUF:Q4_K_M",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Translate the following English text to French: 'Hello World'" }
    ]
  });
  console.log(response.choices[0].message.content.trim());
}

getCompletion();