Run bigger local AI models with the computers you already own over WiFi.
RiftIcon links your PC, laptop, and phone into one private neural mesh. Pool your VRAM to run Llama 3 70B and DeepSeek R1 — with built-in AI agents, a Canvas code editor, and one-click model downloads. No cloud APIs, no subscription fees, no data leaving your network.
Break the VRAM Limit
Stop fighting llama.cpp out-of-memory errors. Pool your gaming PC, your Macbook, and
your phone into a single logical GPU.
Run the full Llama 3 70B offline using hardware you already
paid for.
$0 API Costs
Prototyping AI features shouldn't drain your runway. Turn 5 old office laptops into a localized compute cluster. Test multi-agent workflows continuously for exactly $0 per token.
100% Data Privacy
Working under NDA or HIPAA? Don't send client data to OpenAI. RiftIcon runs an air-gapped, fully encrypted local mesh that guarantees your intellectual property never leaves the network.
Node Setup
Download and run the RiftIcon binary on your main PC. Then, load the lightweight RiftNode companion app on any additional devices (Macbook, old gaming rigs, or Android phones).
Device Pooling
The nodes automatically discover each other over your local WiFi network. RiftIcon aggregates all available VRAM and System RAM across devices into a single, logical compute pool.
Model Execution
Select a GGUF model via the local dashboard. The system intelligently slices the tensor layers and distributes the compute payload across the mesh for high-speed, 100% offline inference.
All tests run on local RiftMesh V1.4.4 over standard 5GHz WiFi. Zero cloud APIs. Zero CPU fallback.
| Hardware Mesh | Model (GGUF) | Quant | Speed |
|---|---|---|---|
| 1x RTX 4060 (8GB) | Qwen 2.5 3B | Q4_K_M | 81.5 tok/s |
| 2x RTX 4060s (WiFi) | DeepSeek R1 8B | Q4_K_M | 43.0 tok/s |
| 2x 4060 + 1x Mac M3 | Llama 3 14B | Q5_K_M | 28.2 tok/s |
| 4-Node Mixed Office Mesh | Llama 3 70B | Q4_K_M | 12.4 tok/s |
RiftIcon spins up a local server on port 7117
that perfectly mocks the OpenAI interface. Connect Cursor, OpenWebUI, AnythingLLM, or
LangChain directly to your local hardware mesh with zero code changes.
UI Clients (OpenWebUI, etc.)
Override the Base URL in settings.
http://127.0.0.1:7117/v1
sk-rifticon
Python / LangChain
Use the standard `openai` pip package.
from openai import OpenAI
# Point client to the local mesh
client = OpenAI(
base_url="http://127.0.0.1:7117/v1",
api_key="sk-rifticon"
)
resp = client.chat.completions.create(
model="14B",
messages=[{"role": "user", "content": "Hello mesh!"}]
)
Apple Silicon is the Muscle. Windows is the Brain.
Because Apple's M-Series chips use Unified Memory, RiftIcon treats system RAM as dedicated GPU VRAM. A MacBook Pro with 32GB of RAM effectively becomes a massive 32GB VRAM inference card when connected to your mesh.
Our core inference engine—the complex "Brain" that manages the mesh—runs exclusively on your main Windows PC. To tap into your MacBook's massive memory pool, simply drop the lightweight RiftNode companion app onto your Mac. It instantly beams its compute power and Unified Memory back to your Windows host over WiFi.
🤖 Your AI, Your Rules — Multiple Agents, One Dashboard
RiftIcon isn't just a chatbot — it's an agent platform. Create specialized AI agents with custom system prompts, sandboxed tools, multi-channel routing, and cron triggers. Switch between them instantly from the sidebar.
🎨 Tell Your AI to Build It — Watch It Appear in Real-Time
Canvas gives you a split code editor with live preview, powered entirely by your local mesh. Ask your AI to build a website, script, or app component — and see the result rendered instantly. No copy-pasting, no tab switching.
📦 Browse & Download Models — No Terminal Required
The built-in Model Hub lets you browse, filter, and download GGUF models directly from the dashboard. See recommended models for your VRAM, pick a quantization, and click download — you're running in seconds.
Only 500 Founder's licenses will be available at launch. Waitlist members get priority access.
Your support directly funds the next phase of the ultimate plug-and-play agent stack.
I built RiftIcon because I was tired of paying for cloud AI that censors my outputs and harvests my data. I had three machines collecting dust and realized their combined VRAM could run the models I actually wanted. So I wrote the orchestrator myself — in Rust, from scratch. Now I'm sharing it with you.
Every license directly funds the next phase. You're not buying from a corporation — you're backing an engineer who uses this tool every day.