Copilot Workspace

What can I help with?

Local + cloud AI orchestration. The CEO model routes your prompt to the best muscle automatically.

Generate a cyberpunk image with neon lighting
Write a GPU temperature monitor in Python
Explain LoRA vs full fine-tuning
Create a 30s product launch script
💬 Chat
Conversational: AI selects best model
🤖 Agent
Autonomous multi-step solve with tools
📋 Plan
Generates a plan for your approval first
Non-admin model visibility
Local models via Ollama + ComfyUI • Cloud APIs optional • Your data stays on your machine

⚙️ General

Default routing mode, API keys, and system information.

Default Mode
Which routing mode to use for new chats

API Keys


System

Hardware
VRAM Usage

🚀 Performance

Fast mode, benchmarks, and active inference parameters.

⚡ Fast Mode
Shrinks the context window for faster responses; output quality may drop

Performance Targets

Tokens/sec
Target tok/s
16+
Context Window
Max Output
Benchmarks the model currently loaded in VRAM. Mount a model in the Models tab first.
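
As a rough illustration of how a tokens-per-second figure could be derived, the sketch below uses the `eval_count` and `eval_duration` fields that Ollama reports with a completed generation (the duration is in nanoseconds). The helper names and the reading of "16+" as a 16 tok/s threshold are assumptions made for this example, not this app's actual benchmark code.

```python
TARGET_TOK_S = 16.0  # assumption: the "16+" target read as a minimum of 16 tok/s


def tokens_per_second(response: dict) -> float:
    """Decode speed computed from an Ollama generate/chat response dict."""
    eval_count = response.get("eval_count", 0)          # tokens generated
    eval_duration_ns = response.get("eval_duration", 0)  # nanoseconds spent generating
    if eval_duration_ns <= 0:
        return 0.0
    return eval_count / (eval_duration_ns / 1e9)


def meets_target(response: dict, target: float = TARGET_TOK_S) -> bool:
    """True when the measured speed reaches the target."""
    return tokens_per_second(response) >= target


# Hypothetical sample: 320 tokens generated in 16 seconds.
sample = {"eval_count": 320, "eval_duration": 16_000_000_000}
print(tokens_per_second(sample), meets_target(sample))
```

Measuring only the generation phase (rather than total wall-clock time) keeps model load time and prompt processing out of the figure.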

Active Parameters

🤖 Models

Install, manage, and monitor local LLMs via Ollama.

Install a New Model

R&D Dashboard

Choose a model that fits within your 12 GB of VRAM. Models download automatically from Ollama's registry.

Or type a custom model name:

Running Models (in VRAM)


Installed Models


Chat Proven Methods (Admin)

Toggle admin/user visibility and review each model's provider, buy/sell costs, and credits from the variable-driven catalog.

Default Model Service Buy $ Sell $ Credits User Admin

🔧 Advanced

Ollama inference parameters and system diagnostics.

Ollama Parameters

Fine-tune individual inference parameters. Changes apply to all future requests.

System Info

GPU Layers
num_gpu = 99 → all layers run on the GPU, with no CPU offload
99
Model Storage
~/.ollama/models (default Ollama location)
Loaded Models
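
To make the GPU Layers setting above concrete, here is a minimal sketch of a request body that passes `num_gpu` through the `options` object of Ollama's `/api/generate` endpoint. The helper name `build_generate_request` and the model name are placeholders for illustration; the port is Ollama's default, and nothing here is taken from this app's own code.

```python
import json

# Ollama's default local endpoint; adjust if your install listens elsewhere.
OLLAMA_GENERATE_URL = "http://127.0.0.1:11434/api/generate"


def build_generate_request(model: str, prompt: str, num_gpu: int = 99) -> dict:
    """Build a non-streaming generate request with an explicit GPU layer count."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_gpu is the number of layers to place on the GPU; a value larger
        # than the model's layer count means every layer goes to the GPU.
        "options": {"num_gpu": num_gpu},
    }


# "example-model" is a hypothetical name, not one from this document.
payload = build_generate_request("example-model", "Hello")
body = json.dumps(payload).encode("utf-8")  # bytes ready to POST to the URL above
print(payload["options"])
```

The sketch only builds the body and does not send it, so it runs without a live Ollama server.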

🔌 AI Backends

Configure local and cloud AI providers. Only enabled backends are used by smart routing.

Smart Routing
Automatically pick the best model from enabled backends

Cloud Tools / Extensions

☁️ Claude Code CLI
Route tasks or individual swarm nodes to Anthropic's Claude Code CLI instead of Ollama.
Requires claude CLI installed and authenticated via claude /login.
🤖 GitHub Copilot SDK
Access GitHub Copilot's model fleet (GPT-5, Claude, Gemini, Grok and more) via the copilot Python SDK.
Requires the SDK installed in .venv and authenticated via copilot login.
RotorQuant KV & llama.cpp
Serve KV-cache-compressed models via llama-server on port 8095.
Tier 1 — q8_0 KV compression (~40K ctx, ~2× savings) — works now with standard llama.cpp.
Tier 2 — RotorQuant iso3/planar3 (~200K ctx, ~10×) — requires CUDA 12.4 + fork build.
When enabled, models in the registry with backend: llamacpp are routed to http://127.0.0.1:8095 instead of Ollama. A subtle llama.cpp · q8_0 KV badge appears on responses served by llama-server. Disable to force all models through Ollama (e.g. when llama-server is offline).
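
The routing rule described above can be sketched as a small lookup. This is only an illustration under stated assumptions: the `RegistryEntry` type, the `resolve_base_url` function, and the model names are hypothetical, and the Ollama address is its default port rather than a value taken from this app.

```python
from dataclasses import dataclass

OLLAMA_BASE_URL = "http://127.0.0.1:11434"   # Ollama default (assumed here)
LLAMACPP_BASE_URL = "http://127.0.0.1:8095"  # llama-server port named in the text


@dataclass
class RegistryEntry:
    """Hypothetical registry record; only the backend field matters here."""
    name: str
    backend: str = "ollama"


def resolve_base_url(entry: RegistryEntry, llamacpp_enabled: bool) -> str:
    """Pick the server for a model: llama-server only when the toggle is on."""
    if llamacpp_enabled and entry.backend == "llamacpp":
        return LLAMACPP_BASE_URL
    # Toggle off (or any other backend): everything goes through Ollama.
    return OLLAMA_BASE_URL


compressed = RegistryEntry(name="example-kv-model", backend="llamacpp")
regular = RegistryEntry(name="example-model")
print(resolve_base_url(compressed, llamacpp_enabled=True))
print(resolve_base_url(compressed, llamacpp_enabled=False))
```

Keeping the toggle as an explicit argument makes the offline fallback a one-line change rather than a registry edit.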

⚡ Skills

Manage built-in and community skills. The active skill is applied automatically by the CEO router.


🧠 Reasoning Models

Local and cloud reasoning models for Agent and Plan modes. Models that exceed your hardware's capacity are marked with ⚠️.

Show Performance Tips
Display inline tips during chat when using reasoning models

Local Models (Ollama)


Cloud Models (API Key Required)


🧩 Auto-Solve

Autonomous multi-step task solving. The system reasons, calls tools, and iterates until the task is complete.

Enable Auto-Solve
Master switch — when off, auto-solve requests fall back to normal chat
Reasoning Model
Primary model for auto-solve (must support tool calling or CoT)
Max Iterations
Maximum reasoning steps per solve (hard cap: 25)
Thinking Mode
Enable chain-of-thought reasoning (applies when a reasoning model is not in use)
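
One way the master switch and the iteration cap above might interact is sketched below. The class, field, and function names and the default of 10 iterations are assumptions for illustration; only the hard cap of 25 and the fall-back-to-chat behavior come from the settings text.

```python
from dataclasses import dataclass

MAX_ITERATIONS_HARD_CAP = 25  # hard cap stated in the settings text


@dataclass
class AutoSolveSettings:
    """Hypothetical settings record for the auto-solver."""
    enabled: bool = True
    max_iterations: int = 10  # assumed default, not from the source

    def effective_iterations(self) -> int:
        """Clamp the user's value into the range 1..25."""
        return max(1, min(self.max_iterations, MAX_ITERATIONS_HARD_CAP))


def choose_mode(requested: str, settings: AutoSolveSettings) -> str:
    """Fall back to normal chat when the master switch is off."""
    if requested == "auto-solve" and not settings.enabled:
        return "chat"
    return requested


settings = AutoSolveSettings(enabled=False, max_iterations=40)
print(choose_mode("auto-solve", settings), settings.effective_iterations())
```

Clamping at read time, rather than rejecting the value, means a stale or hand-edited setting can never push a solve past the cap.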

Tool Categories

Enable or disable entire categories of tools the auto-solver can use.

🔧 General Tools
Code execution, web search, delegation, planning, math
🎨 Image & Video
Generate, edit, and animate images/videos via ComfyUI
💾 Memory
Search and save facts in the memory sidecar
🔊 Text-to-Speech
Generate speech and narrate videos
📄 File Access
Read workspace files
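
Category-level toggles like the ones above could be applied by filtering a tool table before each solve. The tool names and category keys in this sketch are invented for illustration; the source only names the five categories.

```python
# Hypothetical tool-to-category table; only the category grouping mirrors the text.
TOOL_CATEGORIES = {
    "run_code": "general",
    "web_search": "general",
    "generate_image": "image_video",
    "search_memory": "memory",
    "speak": "tts",
    "read_file": "file_access",
}


def available_tools(enabled_categories: set) -> list:
    """Return the tools whose category is switched on, in a stable order."""
    return sorted(
        name for name, category in TOOL_CATEGORIES.items()
        if category in enabled_categories
    )


print(available_tools({"general", "file_access"}))
```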

Resource Management

Auto VRAM Management
Automatically load/unload models during auto-solve

💾 Memory Management

View and manage persistent memory across different scopes: global facts, project-specific context, and conversation history.

🧠 Memory Architecture

Global Memory: User facts (name, job, preferences) saved across all chats
Project Memory: Project-specific context (tech stack, custom rules) isolated per project
Current Chat: Messages in this conversation only
Conversation History: Recent messages from other conversations for context
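
The scoping described above, with global facts shared everywhere and project facts isolated per project, can be sketched as a small in-memory store. The `MemoryStore` class, its method names, and the sample facts are hypothetical; the real memory sidecar may work quite differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MemoryStore:
    """Hypothetical two-scope fact store: global and per-project."""
    global_facts: List[str] = field(default_factory=list)
    project_facts: Dict[str, List[str]] = field(default_factory=dict)

    def save_global(self, fact: str) -> None:
        self.global_facts.append(fact)

    def save_project(self, project_id: str, fact: str) -> None:
        self.project_facts.setdefault(project_id, []).append(fact)

    def context_for(self, project_id: Optional[str] = None) -> List[str]:
        """Global facts always apply; project facts only for that project."""
        facts = list(self.global_facts)
        if project_id is not None:
            facts.extend(self.project_facts.get(project_id, []))
        return facts


store = MemoryStore()
store.save_global("Prefers concise answers")
store.save_project("site", "Stack: TypeScript")
store.save_project("firmware", "Stack: C")
print(store.context_for("site"))
```

Returning a fresh list from `context_for` keeps callers from accidentally mutating the stored global facts.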

💾 Global Memory (0 facts)


📁 Project Memory

No projects yet. Create a project to see its memory.

⚙️ Data Controls

⚡ Swarm Pipeline

Configure THE MACHINE's self-improvement pipeline — models, context budgets, worker counts, timeouts, and apply behavior.

📖 Help Guide

🖥️ Systems Portal

Live reachability status of E-Labs services. Click Refresh to re-probe all ports.


📋 Saved Workflows

Reusable workflow templates saved from completed projects. Click Launch to create a new project from any template.
