Agents defined in markdown.
Evolved through memory.
Running from Claude Code to bare metal.

Six open-source experiments mapping a cognitive stack onto local LLMs — from a Claude Code plugin you install in 10 seconds to a Rust OS where the LLM is the CPU.

Apache 2.0 · permanently alpha · 2026

One thesis, five layers of depth. Each project removes a safety net:

  1. Plugin — Claude does everything. You just give it a goal.
  2. Full system — Any LLM interprets markdown agents. Memory consolidation. Self-optimization.
  3. On-device — A 2B model on a phone. No internet. Deterministic safety checks.
  4. Real world — $30 robot. VLM sees, reasons, drives. Failed runs retry in simulation overnight.
  5. Bare metal — The LLM is the CPU. 14-opcode ISA. GBNF grammar = MMU. Runs on a Pi 5.
Start here — try in 30 seconds

skillos_plugin

Claude Code plugin

Install in Claude Code, type /skillos with a goal. It decomposes the task, creates specialized agents as markdown, executes them, and consolidates learnings into memory. The fastest way to explore SkillOS concepts.

$ /plugin install skillos-plugin

$ /skillos "Build a REST API with auth and tests"
[OK] agents          3 created · triad decomposition
[OK] output          projects/API/output/

$ /skillos dream
[OK] strategies      3 new · 2 updated

Companion to /skillos. The /sysctl command audits, scores, evolves, and prunes what /skillos built. Security scans, 7-type failure taxonomy, anti-overfitting gate, lifecycle management. Never modifies files directly — produces proposals for human approval.

$ /sysctl "audit and score all agents in Project_webapp"
[OK] security        9 checks passed · 0 critical
[OK] scores          AgentA: S(0.92) · AgentB: A(0.78)

$ /sysctl "evolve low-scoring agents"
[OK] proposals       2 minimal-surface edits · awaiting approval
Full system — markdown-defined agent OS

skillos

markdown OS

The full SkillOS. Every component — agents, tools, memory, orchestration — defined in .md files interpreted by any LLM at runtime. HWM hierarchical planning, 14 token-compression dialects, dream consolidation, self-optimization loop. Runs on Claude, Gemini, Gemma 4, or any OpenAI-compatible endpoint.

$ skillos execute: "Run the Operation Echo-Q scenario"
[OK] agents          4 created · quantum theorist → architect
[OK] wiki            5 concept pages · LaTeX
[OK] code            quantum_cepstrum.py · echo error 0.003s

$ skillos execute: "Run the Project Aorta scenario"
[OK] output          36KB clinical vision + 37KB math framework
On-device — same concepts on a phone

skillos_mini

Android · on-device

Markdown cartridges guide a 2B local LLM through trade diagnostics. Electrician (IEC 60364), plumber, painter. The LLM picks links and decides which optional checks to run — all safety rules execute as deterministic TypeScript functions. No internet required. Gemma 4 on-device via LiteRT.

chat: "panel has exposed wiring and no RCD"
[OK] route           electricista → diagnosis_cableado.md
[OK] mandatory       electrical.checkWireGauge → FALLA
[OK] adaptive        safety.checkRCD → PELIGRO
[OK] diagnosis       Recableado 4mm² + diferencial 30mA
Real world — embodied cognition for $30

skillos_robot

robot · VLM

A 20cm robot driven by a vision language model. Give it a goal — it sees through the camera, reasons about the scene, and drives to the target. Dual loop: VLM perception at 1–2 Hz, reactive motor control at 10–20 Hz. Failed traces get retried in MuJoCo simulation during dream consolidation.

$ robot navigate "go to the red cube"
[OK] vlm             gemini-3.1-flash-lite · 200ms/frame
[OK] navigation      target reached · 12 steps

$ robot dream
[OK] retried         3 failed traces in MuJoCo
[OK] strategies      "approach from left" → confidence 0.85
Bare metal — the LLM is the CPU

llm_os

LLM as CPU · Rust

An OS where the LLM is the CPU. 14-opcode ISA enforced by GBNF grammar at decode time — invalid sequences are physically impossible to emit. In-process inference via FFI to llama.cpp. KV cache as RAM with deterministic paging. WASM-sandboxed cartridges. Boots from SD card on a Raspberry Pi 5 at 8 Hz.

$ bash scripts/dev.sh --model qwen2.5-3b-q4.gguf
[OK] sampler         in-process FFI · grammar stack loaded
[OK] cartridges      6 mounted · system · io · domestic

$ iod dispatch "plan a weekly menu for a family of 4"
[OK] opcodes         call → result → call → halt
[OK] trace           14 tokens · 3 syscalls · 1.2s total

$ bash image/build.sh --model qwen2.5-3b-q4.gguf
[OK] image           pi5-sdcard.img · 1.8GB