Choose a model that fits your 12GB VRAM. Models download from Ollama's registry automatically.
Or type a custom model name:
Running Models (in VRAM)
Installed Models
Chat Proven Methods (Admin)
Toggle admin/user visibility and review each model's provider, buy/sell costs, and credits from the variable-driven catalog.
Columns: Default, Model, Service, Buy $, Sell $, Credits, User, Admin
🔧 Advanced
Ollama inference parameters and system diagnostics.
Ollama Parameters
Fine-tune individual inference parameters. Changes apply to all future requests.
System Info
GPU Layers
num_gpu = 99 → all layers on GPU, never CPU offload
99
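As a rough sketch of how the num_gpu value above could reach Ollama: the JSON body for Ollama's /api/generate endpoint accepts an options object, and num_gpu (the number of layers to place on the GPU) is one of its keys. The helper name build_generate_payload, the model name, and the prompt are hypothetical and are not taken from this application.

```python
import json


def build_generate_payload(model: str, prompt: str, num_gpu: int = 99) -> dict:
    """Build a request body for Ollama's POST /api/generate (hypothetical helper)."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # A value above the model's layer count asks for every layer on the GPU.
        "options": {"num_gpu": num_gpu},
    }


# Example model name and prompt are placeholders.
payload = build_generate_payload("llama3", "Hello")
print(json.dumps(payload, indent=2))
```

The body would be sent to the local Ollama server (by default http://127.0.0.1:11434/api/generate); the sketch only builds it so that it runs without a server.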
Model Storage
~/.ollama/models (default Ollama location)
Loaded Models
—
🔌 AI Backends
Configure local and cloud AI providers. Only enabled backends are used by smart routing.
Smart Routing
Automatically pick the best model from enabled backends
Cloud Tools / Extensions
☁️ Claude Code CLI (status: ✓ CLI found / ⚠ CLI not found)
Route tasks or individual swarm nodes to Anthropic's Claude Code CLI instead of Ollama.
Requires the claude CLI to be installed and authenticated via claude /login.
⚙️ Setup Required
1. Install Node.js 18+ from nodejs.org (needed by the CLI)
2. Install Claude Code CLI: npm install -g @anthropic-ai/claude-code
3. Authenticate: claude /login
4. Verify: claude --version
ℹ️ Always install via npm to get the latest stable release. Run npm update -g @anthropic-ai/claude-code to update later.
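The "CLI found / CLI not found" check could be approximated as follows. This is a minimal sketch that assumes the check simply looks for claude on PATH and runs the claude --version command from the Verify step; the function name cli_version is hypothetical, and the application's real detection logic may differ.

```python
import shutil
import subprocess
from typing import Optional


def cli_version(binary: str = "claude") -> Optional[str]:
    """Return the output of `<binary> --version`, or None if unavailable."""
    path = shutil.which(binary)  # resolves the executable on PATH, if any
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


version = cli_version()
print("CLI found: " + version if version else "CLI not found")
```

Note that this only confirms the binary is installed; it does not tell you whether claude /login has been completed.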
🔐 Authentication needed
The CLI was found but is not authenticated. Click below to open a terminal window, follow the instructions there, and the window will close when authentication finishes.
Per-node framework: When Claude Code is enabled, individual swarm nodes can be set to use it
via the Swarm Pipeline tab — each model setting gets a "Claude Code" option.
🤖 GitHub Copilot SDK (status: ✓ Authenticated / ⚠ Not authenticated / SDK not installed)
Access GitHub Copilot's model fleet (GPT-5, Claude, Gemini, Grok and more) via the copilot Python SDK.
Requires the SDK installed in .venv and authenticated via copilot login.
⚙️ SDK Installation Required
Make sure you're in the project venv: .venv\Scripts\activate
The SDK was found but is not authenticated. Click below to open a terminal window; a browser will launch from there to authorize your GitHub account, and the window closes automatically afterward. Requires an active GitHub Copilot subscription.
Tiers explained:
Nemo (gpt-5-mini, 0× multiplier, free).
Standard (gpt-5.1, 1× premium).
Deep (claude-opus-4.6, 3× premium).
The tier picker also appears in the chat toolbar when GitHub Copilot mode is active.
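The tier list can be read as a small lookup table. The sketch below uses the tier names, model identifiers, and multipliers given above; the Tier class, the premium_units helper, and the assumption that the multiplier scales the number of premium requests consumed are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    model: str
    multiplier: int  # premium multiplier from the tier list


TIERS = {
    "nemo": Tier("gpt-5-mini", 0),
    "standard": Tier("gpt-5.1", 1),
    "deep": Tier("claude-opus-4.6", 3),
}


def premium_units(tier_name: str, requests: int) -> int:
    """Premium usage for a number of requests (assumed: multiplier x requests)."""
    return TIERS[tier_name].multiplier * requests


for name in TIERS:
    print(name, TIERS[name].model, premium_units(name, 4))
```

Under that assumption, four requests cost 0 units on Nemo, 4 on Standard, and 12 on Deep.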
⚡ RotorQuant KV & llama.cpp (status: ✓ Server online / Server offline)
Serve KV-cache-compressed models via llama-server on port 8095.
Tier 1: q8_0 KV compression (~40K ctx, ~2× savings). Works now with standard llama.cpp.
Tier 2: RotorQuant iso3/planar3 (~200K ctx, ~10×). Requires CUDA 12.4 + fork build.
When enabled, models in the registry with backend: llamacpp are routed to http://127.0.0.1:8095 instead of Ollama.
A subtle ⚡llama.cpp · q8_0 KV badge appears on responses served by llama-server.
Disable to force all models through Ollama (e.g. when llama-server is offline).
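The routing rule described above can be sketched as a single function. The registry entry shape (a dict with a backend key) and the name resolve_base_url are assumptions for illustration; the llama-server address comes from the text, and the Ollama address is Ollama's default.

```python
OLLAMA_URL = "http://127.0.0.1:11434"   # Ollama's default listen address
LLAMACPP_URL = "http://127.0.0.1:8095"  # llama-server address from the text


def resolve_base_url(entry: dict, llamacpp_enabled: bool) -> str:
    """Pick the server for a registry entry (hypothetical entry shape)."""
    if llamacpp_enabled and entry.get("backend") == "llamacpp":
        return LLAMACPP_URL
    # Disabled toggle or any other backend: everything goes through Ollama.
    return OLLAMA_URL


entry = {"name": "example-model", "backend": "llamacpp"}  # placeholder entry
print(resolve_base_url(entry, llamacpp_enabled=True))
print(resolve_base_url(entry, llamacpp_enabled=False))
```

Keeping the toggle as an explicit argument mirrors the "disable to force all models through Ollama" behavior, which is useful when llama-server is offline.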
⚡ Skills
Manage built-in and community skills. The active skill is applied automatically by the CEO router.
✨ Generate New Skill with AI
Describe what the skill should do. The AI will generate a working skill and install it automatically.
🧠 Reasoning Models
Local and cloud reasoning models for Agent and Plan modes. Models exceeding your hardware are shown with ⚠️.
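One plausible way to decide which models get the ⚠️ marker is to compare an estimated memory requirement against available VRAM. This is only a sketch: the 12 GB default matches the VRAM figure mentioned at the top of the page, while the function name, the model names, and the size figures are hypothetical placeholders rather than real measurements.

```python
def label_model(name: str, required_gb: float, vram_gb: float = 12.0) -> str:
    """Prefix the name with a warning marker when the model exceeds VRAM."""
    return f"⚠️ {name}" if required_gb > vram_gb else name


# Placeholder names and sizes, for illustration only.
catalog = {"small-model": 5.0, "large-model": 40.0}
labels = [label_model(name, size) for name, size in catalog.items()]
```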
Show Performance Tips
Display inline tips during chat when using reasoning models
Local Models (Ollama)
Cloud Models (API Key Required)
🧩 Auto-Solve
Autonomous multi-step task solving. The system reasons, calls tools, and iterates until the task is complete.
Enable Auto-Solve
Master switch — when off, auto-solve requests fall back to normal chat
Reasoning Model
Primary model for auto-solve (must support tool calling or CoT)
Max Iterations
Maximum reasoning steps per solve (hard cap: 25)
Thinking Mode
Enable chain-of-thought reasoning (applies when a reasoning model is not used)
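The interaction between the master switch and the iteration limit can be sketched as a bounded loop. The hard cap of 25 and the fallback to normal chat come from the settings above; the function names, the step callback signature, and the default of 10 iterations are assumptions made for this example.

```python
from typing import Callable, Optional

HARD_CAP = 25  # hard cap on reasoning steps, from the setting description


def clamp_iterations(requested: int) -> int:
    """Keep the configured value within 1..HARD_CAP."""
    return max(1, min(requested, HARD_CAP))


def auto_solve(
    task: str,
    step: Callable[[str, int], Optional[str]],
    enabled: bool = True,
    max_iterations: int = 10,
    fallback: Callable[[str], str] = lambda t: f"chat: {t}",
) -> Optional[str]:
    """Iterate until `step` returns a result or the cap is reached."""
    if not enabled:
        return fallback(task)  # master switch off: behave like normal chat
    for i in range(clamp_iterations(max_iterations)):
        result = step(task, i)  # one round of reasoning and tool calls
        if result is not None:
            return result
    return None  # cap reached without completing the task


# Toy step that "finishes" on its third iteration.
print(auto_solve("demo", lambda t, i: "done" if i == 2 else None))
```

Clamping inside the solver rather than in the UI means a misconfigured value can never push the loop past the cap.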
Tool Categories
Enable or disable entire categories of tools the auto-solver can use.
🔧 General Tools
Code execution, web search, delegation, planning, math
🎨 Image & Video
Generate, edit, and animate images/videos via ComfyUI
💾 Memory
Search and save facts in the memory sidecar
🔊 Text-to-Speech
Generate speech and narrate videos
📄 File Access
Read workspace files
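Category toggles like the ones above amount to filtering a tool registry. In this sketch the category keys mirror the five categories listed, but the tool names and the registry layout are hypothetical.

```python
# Hypothetical tool registry: tool name -> category key.
TOOLS = {
    "run_code": "general",
    "web_search": "general",
    "generate_image": "image_video",
    "search_memory": "memory",
    "speak_text": "tts",
    "read_file": "files",
}


def allowed_tools(enabled_categories: set) -> list:
    """Return the tools whose category is currently enabled, sorted by name."""
    return sorted(
        name for name, category in TOOLS.items() if category in enabled_categories
    )


print(allowed_tools({"general", "files"}))
```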
Resource Management
Auto VRAM Management
Automatically load/unload models during auto-solve
💾 Memory Management
View and manage persistent memory across different scopes: global facts, project-specific context, and conversation history.
🧠 Memory Architecture
Global Memory: User facts (name, job, preferences) saved across all chats
Project Memory: Project-specific context (tech stack, custom rules) isolated per project
Current Chat: Messages in this conversation only
Conversation History: Recent messages from other conversations for context
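A minimal sketch of how the four scopes might be combined into one context, assuming global facts always apply, project facts apply only to the active project, and history and current-chat messages are appended afterward. The MemoryStore class, its field names, and the ordering are illustrative assumptions, not the application's actual storage model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MemoryStore:
    global_facts: list = field(default_factory=list)
    project_facts: dict = field(default_factory=dict)  # project name -> facts

    def context_for(
        self, project: Optional[str], chat_messages: list, history: tuple = ()
    ) -> list:
        """Assemble context: global, active project only, history, current chat."""
        project_part = self.project_facts.get(project, []) if project else []
        return [*self.global_facts, *project_part, *history, *chat_messages]


store = MemoryStore(
    global_facts=["name: Sam"],
    project_facts={"site": ["stack: Flask"], "game": ["engine: Godot"]},
)
print(store.context_for("site", ["How do I add a route?"]))
```

Looking up only the active project's entry is what keeps project memory isolated: facts stored for "game" never appear in a "site" conversation.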
💾 Global Memory (0 facts)
📁 Project Memory
No projects yet. Create a project to see its memory.
⚙️ Data Controls
⚡ Swarm Pipeline
Configure THE MACHINE's self-improvement pipeline — models, context budgets, worker counts, timeouts, and apply behavior.