It's free.
All of it. Forever.
No tiers, no accounts, no per-token fees, no usage caps, no surprise bills. Run unlimited inference on hardware you own.
- ✓ All features, no restrictions
- ✓ Full source code (MIT license)
- ✓ Community support via GitHub
- ✓ Self-managed updates
- ✓ Run on your own hardware
- ✓ No telemetry, no phone-home
What you'd pay otherwise
Equivalent always-on agent usage on cloud APIs runs into thousands of dollars per year. Cashmere runs on electricity.
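How little electricity? A rough back-of-envelope, where both numbers are stated assumptions rather than measurements (a Mac mini averaging about 30W under mixed load, power at $0.15/kWh):

```python
# Back-of-envelope annual electricity cost for an always-on Mac mini.
# Both inputs are assumptions; plug in your own hardware draw and rate.
AVG_POWER_WATTS = 30    # assumed average draw under mixed load
PRICE_PER_KWH = 0.15    # assumed electricity rate, USD

hours_per_year = 24 * 365
kwh_per_year = AVG_POWER_WATTS / 1000 * hours_per_year
annual_cost = kwh_per_year * PRICE_PER_KWH
print(f"{kwh_per_year:.0f} kWh/year, about ${annual_cost:.0f}/year")
# -> 263 kWh/year, about $39/year
```

Even if your machine draws twice that, you're still well under $100 a year.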
Frequently asked questions
Really? Completely free?
Yes. MIT-licensed source code, every feature included, no paid tier, no usage caps, no accounts. We don't sell hardware and we don't sell subscriptions.
What hardware do I need?
Any Mac with Apple Silicon (M1 or later) and at least 16GB of unified memory. A Mac mini is ideal: low power, small footprint, runs headless. You buy the machine yourself from Apple or anywhere else; we don't resell it.
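Not sure what your machine has? Here's a quick check — a sketch that assumes macOS, where `sysctl -n hw.memsize` reports physical memory and `platform.machine()` returns `arm64` on Apple Silicon:

```python
import platform
import subprocess

# Apple Silicon Macs report "arm64"; Intel Macs report "x86_64".
arch = platform.machine()

# hw.memsize is total physical (unified) memory in bytes on macOS.
mem_bytes = int(subprocess.run(
    ["sysctl", "-n", "hw.memsize"],
    capture_output=True, text=True, check=True,
).stdout)

print(f"architecture: {arch}")
print(f"memory: {mem_bytes / 2**30:.0f} GB")
print("meets requirements:", arch == "arm64" and mem_bytes >= 16 * 2**30)
```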
What models does it run?
By default, Cashmere uses Gemma 3 27B for reasoning and nomic-embed-text for embeddings, both served via Ollama. You can swap in any Ollama-compatible model.
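To make "via Ollama" concrete, here's a minimal sketch that talks to the local Ollama HTTP API directly, assuming Ollama is running and the models have been pulled (`ollama pull gemma3:27b`, `ollama pull nomic-embed-text`). Cashmere's own configuration layer may wrap this differently; the model strings are the only thing you'd change to swap models.

```python
import json
import urllib.request

OLLAMA = "http://localhost:11434"  # Ollama's default local address

def post(path: str, payload: dict) -> dict:
    req = urllib.request.Request(
        f"{OLLAMA}{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# One-shot generation against the default reasoning model.
answer = post("/api/generate", {
    "model": "gemma3:27b",   # any model pulled via `ollama pull` works here
    "prompt": "Summarize why local inference avoids per-token fees.",
    "stream": False,
})
print(answer["response"])

# Embeddings from the default embedding model.
emb = post("/api/embeddings", {
    "model": "nomic-embed-text",
    "prompt": "a sentence to embed",
})
print(len(emb["embedding"]), "dimensions")
```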
How does this stay free?
It's a personal tool we built for ourselves and chose to open source. No VC money, no growth metrics, no monetization roadmap. If that ever changes, the existing MIT-licensed code stays MIT-licensed forever — fork it and keep going.
Is there really no cloud component?
Correct. Zero cloud dependencies. The only network traffic is what you explicitly configure — fetching web search results, checking RSS feeds, calling external APIs you wired in yourself. All LLM inference happens locally.
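You can see this for yourself: Ollama listens on `localhost:11434` by default, and its `/api/tags` endpoint lists the models installed on your machine. If the daemon answers there, inference never has to leave the box. A small sketch:

```python
import json
import urllib.request

# The inference endpoint is the local Ollama daemon; /api/tags lists the
# models installed on this machine. Nothing here leaves localhost.
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = json.load(resp)["models"]

for m in models:
    print(m["name"])
```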
How is this different from running ChatGPT locally?
Cashmere isn't a chat interface; it's an autonomous agent. It has a daemon that runs 24/7, a memory system that learns from your data, a knowledge graph, a tool framework, and multi-agent orchestration. The chat interface is just one way to interact with it.
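To make the distinction concrete, here's a toy daemon loop. Every name in it is hypothetical, not Cashmere's actual API; the point is the control flow. A chat UI waits for your prompt, while a daemon loops on its own, consulting memory and invoking tools with no user in the loop.

```python
import time

# Toy illustration only: all names here are hypothetical stand-ins.

class Memory:
    """Stand-in for a persistent memory store / knowledge graph."""
    def __init__(self):
        self.facts: list[str] = []

    def recall(self) -> list[str]:
        return self.facts[-5:]      # recent context for the next step

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

def check_feeds() -> str:           # hypothetical tool
    return "no new items"

TOOLS = {"check_feeds": check_feeds}

def daemon_loop(memory: Memory, interval_s: float = 3600) -> None:
    while True:                     # runs around the clock, unprompted
        context = memory.recall()   # what the agent already knows
        # A real agent would plan its next action from `context`;
        # the toy version always runs one tool.
        result = TOOLS["check_feeds"]()
        memory.remember(f"check_feeds -> {result}")
        time.sleep(interval_s)
```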
Can I support the project?
Star the repo, file good bug reports, contribute code, write a skill, tell a friend. That's the whole ask.