A full cognitive system,
not a chatbot wrapper.

Cashmere combines autonomous operation, deep personalization, and complete privacy into a single system that runs on hardware you own. Free and open source.

Autonomous Operation

24/7 Daemon

A background daemon that never sleeps. It schedules tasks, monitors feeds, conducts research, and prepares reports — all without being prompted. Wake up to completed work every morning.

Built on a priority-weighted task queue with configurable scheduling. Tasks are automatically generated from your goals, calendar, and incoming signals.
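A priority-weighted queue of this kind can be sketched with Python's `heapq`; the class and method names here are illustrative, not Cashmere's actual API:

```python
import heapq
import itertools
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    priority: int  # lower value runs sooner
    seq: int       # tie-breaker preserving insertion order
    name: str = field(compare=False)

class TaskQueue:
    """Minimal priority-weighted task queue (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def push(self, name: str, priority: int) -> None:
        heapq.heappush(self._heap, Task(priority, next(self._seq), name))

    def pop(self) -> str:
        return heapq.heappop(self._heap).name

q = TaskQueue()
q.push("monitor feeds", priority=5)
q.push("morning report", priority=1)
q.push("background research", priority=9)
print(q.pop())  # "morning report": highest-priority task first
```

In a real scheduler, priorities would be computed from goals, calendar events, and signal urgency rather than hard-coded.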

Proactive Research

Cashmere doesn't wait for you to ask. It identifies knowledge gaps, researches topics relevant to your goals, and synthesizes findings into actionable summaries.

Uses multi-step search with source triangulation. Research results are stored in your knowledge graph for future reference.

Self-Evolution

The agent reviews its own outputs, identifies patterns in what works and what doesn't, and adjusts its strategies accordingly. It gets better at helping you over time.

An evolver module periodically audits task outcomes, prompt effectiveness, and user feedback to refine the agent's approach.

Deep Personalization

Conversational Memory

Every conversation is analyzed for facts, preferences, and context. Cashmere builds a persistent memory store that grows with every interaction.

Memory extraction runs automatically after each conversation. Facts are deduplicated, timestamped, and cross-referenced with your knowledge graph.
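Deduplication and timestamping can be sketched by keying each fact on a hash of its normalized content; the names here are hypothetical, not Cashmere's actual memory API:

```python
import hashlib
import time

class MemoryStore:
    """Minimal sketch of post-conversation fact storage (illustrative API)."""

    def __init__(self):
        self._facts = {}  # normalized-content hash -> fact record

    def add(self, fact: str) -> dict:
        # Normalize before hashing so trivial variants deduplicate
        key = hashlib.sha256(fact.strip().lower().encode()).hexdigest()
        if key not in self._facts:
            self._facts[key] = {"fact": fact, "first_seen": time.time()}
        return self._facts[key]

store = MemoryStore()
store.add("User prefers morning reports")
store.add("user prefers morning reports ")  # duplicate, normalized away
```

Cross-referencing against the knowledge graph would then link each stored fact to the entities it mentions.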

Knowledge Graph

Entities, relationships, and concepts from your data are mapped into a graph structure. This enables Cashmere to make connections you might miss.

Built on SQLite with vector embeddings for semantic similarity. The graph grows organically as Cashmere processes your conversations and browsing history.
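Semantic lookup over SQLite-stored embeddings can be sketched with a brute-force cosine scan; toy 2-d vectors and a plain cosine function stand in for real embedding output:

```python
import math
import sqlite3
import struct

def pack(vec):
    """Serialize a float vector to a BLOB for SQLite storage."""
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob):
    return struct.unpack(f"{len(blob) // 4}f", blob)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE entities (name TEXT, embedding BLOB)")
con.executemany(
    "INSERT INTO entities VALUES (?, ?)",
    [("rust", pack([0.9, 0.1])), ("gardening", pack([0.1, 0.9]))],
)

query = [0.8, 0.2]  # embedding of the user's question (toy dimensions)
rows = con.execute("SELECT name, embedding FROM entities").fetchall()
best = max(rows, key=lambda row: cosine(query, unpack(row[1])))
print(best[0])  # rust
```

At scale, an approximate-nearest-neighbor index would replace the full scan, but the storage model is the same.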

Browsing History Integration

A lightweight Chrome extension syncs your browsing patterns. Cashmere uses this to understand your interests, track research threads, and surface relevant content.

Privacy-first design: data stays local. The extension sends URLs and titles to your local Cashmere instance — nothing leaves your network.

Privacy & Control

100% Local Execution

Models run via Ollama on your Apple Silicon Mac. Your prompts, data, and model weights never leave your machine. There is no cloud component.

Supports any Ollama-compatible model. Default configuration uses Gemma 3 27B for reasoning and nomic-embed-text for embeddings.
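Model selection might live in a config file along these lines; the key names are illustrative, not Cashmere's actual schema (only the Ollama endpoint default is standard):

```yaml
# Hypothetical config sketch, not Cashmere's actual schema
models:
  reasoning: gemma3:27b         # swap in any Ollama model tag
  embeddings: nomic-embed-text
ollama:
  host: http://localhost:11434  # Ollama's default local endpoint
```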

Encrypted Credentials

API keys, passwords, and tokens are stored in an encrypted vault. Cashmere can authenticate to your services without exposing credentials in plain text.

AES-256 encryption with a master key derived from your system keychain. TOTP support for 2FA-protected services.
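The TOTP side needs nothing beyond the standard library. This sketch implements RFC 6238 code generation (HMAC-SHA1), independent of how Cashmere stores the secret:

```python
import base64
import hmac
import struct
import time

def totp(secret_b32: str, at=None, digits: int = 6, step: int = 30) -> str:
    """Generate an RFC 6238 TOTP code from a base32-encoded secret."""
    key = base64.b32decode(secret_b32.upper())
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), "sha1").digest()
    # Dynamic truncation: take 4 bytes at an offset chosen by the last nibble
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test vector: ASCII secret "12345678901234567890", T=59s, 8 digits
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", at=59, digits=8))  # 94287082
```

With `at=None` the function uses the current time, which is how a vault would mint a live 2FA code on demand.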

No Telemetry

Zero phone-home behavior. No analytics, no usage tracking, no crash reports. The only network traffic is what you explicitly configure.

Open source means you can verify this claim yourself. Audit the code, run it behind a firewall, monitor the traffic.

Extensibility

Tool Framework

Add custom tools by subclassing a simple Python base class. Cashmere's agent can call your tools during reasoning — filesystem operations, API calls, database queries, anything.

Tools are registered at startup and exposed to the LLM via structured function calling. Built-in tools cover filesystem, search, memory, planning, and more.
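The registration-and-dispatch pattern can be sketched like this; `BaseTool`, the registry, and the example tool are all hypothetical stand-ins for Cashmere's real API:

```python
class BaseTool:
    """Illustrative base class; Cashmere's actual tool API may differ."""
    name: str = ""
    description: str = ""

    def run(self, **kwargs):
        raise NotImplementedError

REGISTRY: dict = {}

def register(tool_cls):
    """Instantiate a tool and register it at startup."""
    tool = tool_cls()
    REGISTRY[tool.name] = tool
    return tool_cls

@register
class WordCount(BaseTool):
    name = "word_count"
    description = "Count the words in a string."

    def run(self, text: str) -> int:
        return len(text.split())

# A structured function call, as the LLM might emit it:
call = {"name": "word_count", "arguments": {"text": "hello brave new world"}}
result = REGISTRY[call["name"]].run(**call["arguments"])
print(result)  # 4
```

The `description` field is what gets surfaced to the model so it knows when each tool applies.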

Worker System

Background workers process tasks from a priority queue. Add custom workers for your specific workflows — data processing, content generation, monitoring.

Workers subclass BaseWorker and declare a task_type. The daemon scheduler assigns tasks automatically based on priority and cooldown intervals.
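The worker contract can be sketched as follows; the class names and the toy dispatcher are illustrative, and Cashmere's real `BaseWorker` interface may differ:

```python
class BaseWorker:
    """Illustrative worker base class."""
    task_type: str = ""

    def handle(self, payload):
        raise NotImplementedError

class ReportWorker(BaseWorker):
    task_type = "report"

    def handle(self, payload):
        return f"compiled report on {payload}"

# A toy dispatcher standing in for the daemon scheduler
workers = {w.task_type: w for w in [ReportWorker()]}

def dispatch(task: dict) -> str:
    return workers[task["type"]].handle(task["payload"])

print(dispatch({"type": "report", "payload": "AI news"}))
```

In the real daemon, dispatch would also consult task priority and per-worker cooldown intervals before handing work out.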

Multi-Agent Architecture

Specialized agents handle different aspects of cognition — routing, retrieval, compilation, response generation. Add your own agents for domain-specific reasoning.

Agents subclass BaseAgent and declare their model, prompt template, and available tools. The router agent delegates to specialists based on query analysis.
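A minimal sketch of the delegation pattern, with a keyword check standing in for LLM-based query analysis (all class names here are illustrative, not Cashmere's actual agents):

```python
class BaseAgent:
    """Illustrative agent base class."""
    model: str = "some-ollama-model"  # hypothetical default

    def respond(self, query: str) -> str:
        raise NotImplementedError

class RetrievalAgent(BaseAgent):
    def respond(self, query: str) -> str:
        return f"retrieved notes for: {query}"

class ResponseAgent(BaseAgent):
    def respond(self, query: str) -> str:
        return f"answer: {query}"

class Router:
    """Keyword routing stands in for real query analysis by the router agent."""

    def __init__(self):
        self.specialists = {"recall": RetrievalAgent(), "default": ResponseAgent()}

    def route(self, query: str) -> str:
        key = "recall" if "remember" in query else "default"
        return self.specialists[key].respond(query)

print(Router().route("what do you remember about my trip?"))
```

Each specialist owning its own model and prompt template keeps domain logic isolated, so adding an agent doesn't touch the router beyond registration.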

Ready to run your own agent?

Free, open source, MIT. Clone the repo and you're running in two minutes.