# EAX Router ALPHA
An experimental, intelligent, and pluggable routing layer for Large Language Models, researching dynamic model selection based on cost, latency, or quality priorities.
⚠️ Experimental Research: This project explores intelligent routing concepts and will remain permanently in alpha status.
### 🧠 Intelligent Model Selection

Research into how applications can dynamically choose the optimal LLM for each task based on a priority setting: cost, latency, or quality.

### 🔌 Pluggable Routing Strategies

Experimental framework for different routing approaches (CostOptimized, LatencyOptimized, QualityOptimized), with interfaces for custom strategies; see the interface sketch after this feature list.

### 🔍 Model Fingerprinting

Research into standardized model capability descriptions through `fingerprint.json` files, enabling community-driven benchmarking and routing intelligence.

### 🌐 Provider Agnostic

Experimental design that works across any LLM provider: OpenAI, Anthropic, Google, or open-source models via Ollama and other platforms.

### ⚡ Lightweight Design

Research into minimal-overhead routing that adds intelligence without adding significant latency or complexity to an existing application stack.

### 🔬 Open Research

Community-driven collection and sharing of model performance data, making routing decisions smarter through collaborative benchmarking.
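The strategy interface itself isn't documented in this README, so here is a minimal sketch of what a pluggable strategy could look like. The `RoutingStrategy` protocol, the `ModelFingerprint` dataclass, and the `select()` signature are assumptions for illustration, not the project's actual API; only the strategy names come from the list above.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ModelFingerprint:
    """Hypothetical subset of the fingerprint.json fields shown below."""
    model_id: str
    input_cost: float      # USD per million input tokens
    output_cost: float     # USD per million output tokens
    avg_latency_ms: float  # average latency per 1k tokens


class RoutingStrategy(Protocol):
    """Assumed interface: any object with select() can act as a strategy."""
    def select(self, candidates: list[ModelFingerprint]) -> ModelFingerprint: ...


class CostOptimized:
    """Sketch of a built-in strategy: pick the lowest blended token price."""
    def select(self, candidates: list[ModelFingerprint]) -> ModelFingerprint:
        return min(candidates, key=lambda m: m.input_cost + m.output_cost)
```

A LatencyOptimized or QualityOptimized strategy would differ only in the key it minimizes or maximizes, which is what makes the approach pluggable.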
## Research Hypothesis

Can a smart routing layer enable developers to use the right model for the right job automatically, reducing costs and vendor lock-in? The quick-start below sketches the intended developer experience:
```python
from eax_router import Router, Task

# 1. Initialize the experimental router
router = Router()

# 2. Define your task
task = Task(
    prompt="Summarize this 500-word article into three bullet points.",
    priority="cost",  # or "latency", or "quality"
)

# 3. Intelligent routing!
model_choice = router.route(task)
# >>> Model chosen: 'claude-3-haiku-20240307' (cost-optimized)
```
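Since the routing decision is driven entirely by the `priority` field, comparing the three documented priorities is just a loop. This sketch builds on the quick-start above and assumes `route()` returns a printable model identifier:

```python
# Hypothetical comparison of the three documented priorities.
for priority in ("cost", "latency", "quality"):
    t = Task(
        prompt="Summarize this 500-word article into three bullet points.",
        priority=priority,
    )
    print(priority, "->", router.route(t))
```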
## Fingerprint Standard Research

Exploring an open standard for LLM capability description to enable intelligent routing decisions.
```json
{
  "model_id": "claude-3-haiku-20240307",
  "cost_per_million_tokens": { "input": 0.25, "output": 1.25 },
  "avg_latency_ms_per_1k_tokens": 350,
  "context_window": 200000,
  "performance_by_task": {
    "summarization": { "quality_score": 8.5 },
    "code_generation": { "quality_score": 6.0 }
  }
}
```
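To make the routing concrete, here is a sketch of how a router could consume a directory of such fingerprints. The `load_fingerprints` helper, the file layout, and the scoring rules are assumptions for illustration, not part of the proposed standard; only the field names come from the example above.

```python
import json
from pathlib import Path


def load_fingerprints(directory: str) -> list[dict]:
    """Read every fingerprint-style JSON file in a directory (assumed layout)."""
    return [json.loads(p.read_text()) for p in Path(directory).glob("*.json")]


def rank_models(fingerprints: list[dict], task_type: str, priority: str) -> list[dict]:
    """Order candidate models for a task under a given priority."""
    if priority == "cost":
        # Cheapest blended token price first.
        def key(f): return (f["cost_per_million_tokens"]["input"]
                            + f["cost_per_million_tokens"]["output"])
        reverse = False
    elif priority == "latency":
        # Fastest first.
        def key(f): return f["avg_latency_ms_per_1k_tokens"]
        reverse = False
    else:
        # Highest quality score for this task type first; unknown tasks score 0.
        def key(f): return f["performance_by_task"].get(task_type, {}).get("quality_score", 0.0)
        reverse = True
    return sorted(fingerprints, key=key, reverse=reverse)


# Example: cheapest model first for summarization.
ranked = rank_models(load_fingerprints("./fingerprints"), "summarization", "cost")
print(ranked[0]["model_id"])
```

Because the fingerprint is plain JSON, the same data could back all three built-in strategies as well as community benchmarking tools.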