EVOLVING AGENTS LABS

On-device adaptive AI. Four research experiments mapping a biological cognitive stack onto local LLMs.

$ we are permanently alpha. The thesis matters more than backwards compatibility — at this stage.

▸ what we build

We build a biological cognitive stack on top of local LLMs. Three layers (prefrontal cortex, cerebellum, kernel), four repos, one runtime substrate.

PREFRONTAL CORTEX ┄→ skillos      · desktop    · markdown OS
                     skillos_mini · mobile     · trade-app
CEREBELLUM        ┄→ RoClaw       · embodied   · 20cm cube robot
KERNEL            ┄→ llm_os       · LLM-as-CPU · 13-opcode ISA

▸ the four active projects

skillos cortex · desktop

A pure markdown operating system designed to run on multiple AI runtime engines. Compatible with Claude Code, Gemini CLI, and Qwen Code. Define an agent once with a high-end model, execute it forever on a 2 GB local model.

$ skillos execute "Run the RoClaw Dream Consolidation scenario"
[OK] markdown agents       ▸ RoClawNavigationAgent · RoClawSceneAnalysisAgent
[OK] dream consolidation   ▸ traces/sim3d/*.md  →  strategies/*.md
[OK] evolving memory       ▸ persistent · weighted by fidelity
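
The consolidation step logged above (traces in, strategies out, weighted by fidelity) can be pictured as a fold over recorded runs. The following is a minimal Python sketch under stated assumptions: the `Trace` fields, the `consolidate` and `render_strategy` functions, and the scoring rule are all hypothetical and are not taken from the skillos source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """One recorded run. Every field here is an illustrative assumption."""
    strategy: str    # name of the strategy the agent followed
    success: bool    # whether the run reached its goal
    fidelity: float  # weight in (0, 1], e.g. lower for simulation than for hardware


def consolidate(traces: list[Trace]) -> dict[str, float]:
    """Return a fidelity-weighted success rate per strategy."""
    won: dict[str, float] = defaultdict(float)
    total: dict[str, float] = defaultdict(float)
    for t in traces:
        total[t.strategy] += t.fidelity
        if t.success:
            won[t.strategy] += t.fidelity
    # Skip strategies whose total weight is zero to avoid dividing by zero.
    return {name: won[name] / total[name] for name in total if total[name] > 0}


def render_strategy(name: str, score: float) -> str:
    """Render one consolidated strategy as a small markdown document."""
    return f"# {name}\n\nweighted success: {score:.2f}\n"


if __name__ == "__main__":
    traces = [
        Trace("wall-follow", True, 0.5),
        Trace("wall-follow", False, 0.5),
        Trace("direct-path", True, 1.0),
    ]
    for name, score in sorted(consolidate(traces).items()):
        print(render_strategy(name, score))
```

Keeping the output as markdown text matches the project's stated preference for reviewable, diffable files; how skillos actually scores or stores strategies may differ.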

skillos_mini cortex · mobile

The mobile-first, on-device port — a Capacitor + Svelte 5 Android application driving Gemma 4 E2B vision locally via LiteRT. First commercial vertical: a trade-app for Spanish-speaking tradespeople (electricista, plomero, pintor) that captures photos, runs vision diagnosis on-device, generates a branded PDF, and shares it by WhatsApp. No backend.

[OK] cartridges    ▸ trade-electricista · trade-plomero · trade-pintor
[OK] vision        ▸ gemma-4-e2b.litertlm  · on-device · ~200 ms
[OK] tests         ▸ 278 / 278 green  ·  svelte-check 0 errors

RoClaw cerebellum · embodied

A 20 cm cube robot powered by ESP32-S3 + stepper motors, driven by a vision language model running on-device. Five-tier stack: ReflexGuard collision veto at L0, an 8-byte UDP ISA at L1, Qwen3-VL-2B distilled from Gemini Robotics-ER at L2, scene graph + reactive controller at L3, skillos planning at L4. The robot dreams overnight and hot-swaps the new GGUF into Ollama by morning.

$ npm run sim:3d -- --ollama --goal "navigate to the red cube"
[OK] qwen3-vl-2b           ▸ on-device · 200 ms / frame
[OK] scene_graph           ▸ ReactiveController · ReflexGuard shadow
[OK] esp32-s3              ▸ udp :4210 · 8-byte ISA v2 · ack 38ms
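
To make the 8-byte UDP ISA concrete, here is a minimal Python sketch of a fixed-size command frame. The source states only the frame size and the port (4210); the field layout (opcode, sequence number, two signed 16-bit arguments, flags, XOR checksum), the opcode value, and all function names below are assumptions for illustration, not RoClaw's actual wire format.

```python
import socket
import struct

# Assumed layout, little-endian, no padding:
# opcode (u8), seq (u8), arg0 (i16), arg1 (i16), flags (u8), checksum (u8) = 8 bytes
FRAME = struct.Struct("<BBhhBB")
OP_FORWARD = 0x01  # hypothetical opcode value


def xor_checksum(data: bytes) -> int:
    """XOR all bytes together; a simple integrity check chosen for this sketch."""
    check = 0
    for byte in data:
        check ^= byte
    return check


def encode(opcode: int, seq: int, arg0: int = 0, arg1: int = 0, flags: int = 0) -> bytes:
    """Pack one command into exactly 8 bytes."""
    body = struct.pack("<BBhhB", opcode, seq & 0xFF, arg0, arg1, flags)
    return body + bytes([xor_checksum(body)])


def decode(frame: bytes) -> tuple[int, int, int, int, int]:
    """Unpack a frame, rejecting wrong sizes and bad checksums."""
    if len(frame) != FRAME.size:
        raise ValueError(f"expected {FRAME.size} bytes, got {len(frame)}")
    opcode, seq, arg0, arg1, flags, checksum = FRAME.unpack(frame)
    if checksum != xor_checksum(frame[:-1]):
        raise ValueError("checksum mismatch")
    return opcode, seq, arg0, arg1, flags


def send(frame: bytes, host: str, port: int = 4210) -> None:
    """Send one frame as a single UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(frame, (host, port))


if __name__ == "__main__":
    frame = encode(OP_FORWARD, seq=7, arg0=120, arg1=120)
    print(frame.hex(), decode(frame))
```

A fixed-size frame keeps parsing on the microcontroller trivial: one datagram is one command, with no framing or length prefix to handle.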

llm_os kernel · ground-up

Not an agent framework. An operating system architecture where the LLM itself plays the role of the CPU. 13-opcode ISA over GBNF (sampler-enforced: tokens outside the grammar are masked before sampling, so invalid sequences cannot be emitted). KV cache as RAM. Compaction as swap. Logit bias as Ring 0/Ring 3 privileges. 8 Hz on a Pi 5; 200 Hz on cloud.

$ ./scripts/quickstart.sh
[OK] llama.cpp     ▸ qwen-2.5-3b-q4 · 4×NEON · ctx=8192
[OK] grammar       ▸ isa.gbnf · 13 opcodes · 12/12 fixtures green
[OK] iod daemon    ▸ /dispatch /trace /policy
[OK] cartridges    ▸ 6 mounted · system · io · sim · domestic
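
The two enforcement ideas in this card, grammar-constrained sampling and privilege rings expressed as logit bias, can both be sketched as masking logits before a token is chosen. This is a simplified Python illustration, not llm_os code: real GBNF enforcement in llama.cpp operates on tokenizer tokens and grammar state, while this sketch treats each opcode as one token. The opcode names, the `PRIVILEGED` set, and both function names are hypothetical.

```python
import math

# Hypothetical opcode vocabulary; the real ISA has 13 opcodes not listed here.
VOCAB = ["LOAD", "STORE", "CALL", "HALT", "SETPOLICY"]
PRIVILEGED = {"SETPOLICY"}  # assumed to be reserved for Ring 0


def mask_logits(logits: list[float], allowed: set[str], ring: int) -> list[float]:
    """Set the logit of every illegal token to -inf.

    A token is legal if the grammar allows it next (`allowed`) and the
    current ring may issue it (Ring 0 may issue anything).
    """
    if len(logits) != len(VOCAB):
        raise ValueError("one logit per vocabulary entry is required")
    masked = []
    for token, logit in zip(VOCAB, logits):
        legal = token in allowed and (ring == 0 or token not in PRIVILEGED)
        masked.append(logit if legal else -math.inf)
    return masked


def greedy_pick(logits: list[float], allowed: set[str], ring: int = 3) -> str:
    """Pick the highest-scoring legal token; fail if nothing is legal."""
    masked = mask_logits(logits, allowed, ring)
    best = max(range(len(VOCAB)), key=lambda i: masked[i])
    if masked[best] == -math.inf:
        raise ValueError("no legal token in this state")
    return VOCAB[best]


if __name__ == "__main__":
    logits = [0.1, 0.2, 0.3, 0.0, 9.0]  # the model strongly prefers SETPOLICY
    allowed = {"CALL", "HALT", "SETPOLICY"}
    print(greedy_pick(logits, allowed, ring=3))
    print(greedy_pick(logits, allowed, ring=0))
```

Because the mask is applied before selection, a token with a logit of negative infinity can never win, regardless of how strongly the model prefers it.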

▸ how the four fit together

Each layer is independently shippable and swappable. The goal is a self-improving loop: every navigation produces a trace, every trace becomes training data, every overnight cycle ends with a better model swapped into the local runtime.

                ┌──────────────────┐
                │  USER · WhatsApp │
                │  voice · CLI     │
                └────────┬─────────┘
                         │ "Go check the kitchen"
                         ▼
   ┌────────────────────────────────────┐
   │  skillos · skillos_mini            │   ◄  ARES ORANGE  ◄  cortex
   │  prefrontal cortex                 │
   │  plan · learn · dream              │
   └────────────┬───────────────────────┘
                │ HTTP
                ▼
   ┌────────────────────────────────────┐
   │  llm_os                            │   ◄  WHITE-ON-BLACK ◄  kernel
   │  kernel · LLM as CPU               │
   │  ISA · GBNF · iod · KV             │
   └────────────┬───────────────────────┘
                │ <|call|>roclaw.forward
                ▼
   ┌────────────────────────────────────┐
   │  RoClaw                            │   ◄  MCP GREEN  ◄  cerebellum
   │  cerebellum · embodied             │
   │  perceive · move                   │
   └────────────┬───────────────────────┘
                │ UDP :4210
                ▼
            ESP32-S3 · 20cm cube
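
The kernel-to-cerebellum hop in the diagram is labelled `<|call|>roclaw.forward`. One way to read that label is as a call marker followed by a `cartridge.operation` target that a dispatcher resolves against mounted handlers. The Python sketch below shows that reading only; the parsing rule, the registry shape, and the function names are assumptions, and the actual llm_os call syntax may carry arguments or differ entirely.

```python
from typing import Callable

CALL_PREFIX = "<|call|>"  # marker taken from the diagram label

# Hypothetical registry shape: cartridge name -> operation name -> handler
Registry = dict[str, dict[str, Callable[[], str]]]


def parse_call(text: str) -> tuple[str, str]:
    """Split '<|call|>cartridge.op' into (cartridge, op)."""
    if not text.startswith(CALL_PREFIX):
        raise ValueError("not a call")
    target = text[len(CALL_PREFIX):].strip()
    cartridge, sep, op = target.partition(".")
    if not sep or not cartridge or not op:
        raise ValueError(f"malformed call target: {target!r}")
    return cartridge, op


def dispatch(text: str, registry: Registry) -> str:
    """Resolve a call against the registry and run its handler."""
    cartridge, op = parse_call(text)
    try:
        handler = registry[cartridge][op]
    except KeyError:
        raise LookupError(f"no handler for {cartridge}.{op}") from None
    return handler()


if __name__ == "__main__":
    registry: Registry = {"roclaw": {"forward": lambda: "forward command queued"}}
    print(dispatch("<|call|>roclaw.forward", registry))
```

Resolving calls through a registry is one way to keep layers swappable: the caller names a contract, and whatever is mounted under that name handles it.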
        

▸ design principles

  1 · ON-DEVICE FIRST. The cloud is opt-in. No project here defaults to a cloud model. Every primitive runs on Pi 5 / Snapdragon NPU first.
  2 · MARKDOWN AS DATA. Agent prompts, strategies, traces, dreams: all .md files the runtime walks. Reviewable. Diffable. No hidden state.
  3 · DETERMINISTIC SAFETY. Validators are code, not prompts. ReflexGuard, GBNF, IEC 60364: the LLM proposes; deterministic logic enforces.
  4 · SWAPPABLE LAYERS. Cortex doesn't know what cerebellum is running. Each layer is a contract; implementations come and go.
  5 · PERMANENTLY ALPHA. We optimize for thesis clarity over API stability. Breaking changes are how we learn.
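
Principle 3 ("the LLM proposes; deterministic logic enforces") can be illustrated with a small veto function in the spirit of a collision guard. This is a minimal Python sketch, assuming a single clearance reading and a fixed threshold; the `Proposal` type, the `enforce` function, and the 15 cm value are hypothetical and do not describe ReflexGuard's real rules.

```python
from dataclasses import dataclass

MIN_CLEARANCE_CM = 15.0  # hypothetical safety threshold


@dataclass(frozen=True)
class Proposal:
    """An action suggested by the model. Fields are illustrative assumptions."""
    action: str        # e.g. "forward" or "stop"
    speed_cm_s: float  # requested speed


def enforce(proposal: Proposal, clearance_cm: float) -> Proposal:
    """Return the proposal unchanged, or replace it with a stop.

    The check is plain code: it never consults the model, so the same
    inputs always produce the same decision.
    """
    if proposal.action == "forward" and clearance_cm < MIN_CLEARANCE_CM:
        return Proposal(action="stop", speed_cm_s=0.0)
    return proposal


if __name__ == "__main__":
    wanted = Proposal("forward", 10.0)
    print(enforce(wanted, clearance_cm=50.0))  # enough room: passes through
    print(enforce(wanted, clearance_cm=5.0))   # too close: vetoed
```

The validator sits between the model's output and the actuator, so a bad proposal costs a vetoed step rather than a collision.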

▸ status

  skillos        ▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰     stable          v0.X
  skillos_mini   ▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰      feat-complete   v0.1.0
  RoClaw         ▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰        active research
  llm_os         ▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰         v0.5-rc1 → v1.0
        

Each repo has its own ARCHITECTURE.md, USAGE.md, TUTORIAL.md, and NEXT_STEPS.md.