# EAX Router ALPHA
An experimental, intelligent, and pluggable routing layer for Large Language Models, researching dynamic model selection based on cost, latency, or quality priorities.
⚠️ Experimental Research: This project explores intelligent routing concepts and will remain permanently in alpha status.
### 🧠 Intelligent Model Selection

Research into how applications can dynamically choose the optimal LLM for each task based on a priority setting: cost, latency, or quality.

### 🔌 Pluggable Routing Strategies

Experimental framework for different routing approaches (CostOptimized, LatencyOptimized, QualityOptimized), with interfaces for custom strategies; see the interface sketch after this feature list.

### 🔍 Model Fingerprinting

Research into standardized model capability descriptions through `fingerprint.json` files, enabling community-driven benchmarking and routing intelligence.

### 🌐 Provider Agnostic

Experimental design that works across any LLM provider: OpenAI, Anthropic, Google, or open-source models via Ollama and other platforms.

### ⚡ Lightweight Design

Research into minimal-overhead routing that adds intelligence without adding significant latency or complexity to an existing application stack.

### 🔬 Open Research

Community-driven collection and sharing of model performance data, making routing decisions smarter through collaborative benchmarking.
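The strategy interface itself isn't documented in this README, so here is a minimal sketch of what a pluggable strategy could look like. The `RoutingStrategy` protocol, the `ModelFingerprint` dataclass, and the `select()` signature are assumptions for illustration, not the project's actual API; only the strategy names come from the list above.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ModelFingerprint:
    """Hypothetical subset of the fingerprint.json fields shown below."""
    model_id: str
    input_cost: float      # USD per million input tokens
    output_cost: float     # USD per million output tokens
    avg_latency_ms: float  # average latency per 1k tokens


class RoutingStrategy(Protocol):
    """Assumed interface: any object with select() can act as a strategy."""
    def select(self, candidates: list[ModelFingerprint]) -> ModelFingerprint: ...


class CostOptimized:
    """Sketch of a built-in strategy: pick the lowest blended token price."""
    def select(self, candidates: list[ModelFingerprint]) -> ModelFingerprint:
        return min(candidates, key=lambda m: m.input_cost + m.output_cost)
```

A LatencyOptimized or QualityOptimized strategy would differ only in the key it minimizes or maximizes, which is what makes the approach pluggable.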
## Research Hypothesis

Can a smart routing layer enable developers to use the right model for the right job automatically, reducing costs and vendor lock-in? The quick-start below sketches the intended developer experience:
```python
from eax_router import Router, Task

# 1. Initialize the experimental router
router = Router()

# 2. Define your task
task = Task(
    prompt="Summarize this 500-word article into three bullet points.",
    priority="cost",  # or "latency", or "quality"
)

# 3. Intelligent routing!
model_choice = router.route(task)
# >>> Model chosen: 'claude-3-haiku-20240307' (cost-optimized)
```
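Since the routing decision is driven entirely by the `priority` field, comparing the three documented priorities is just a loop. This sketch builds on the quick-start above and assumes `route()` returns a printable model identifier:

```python
# Hypothetical comparison of the three documented priorities.
for priority in ("cost", "latency", "quality"):
    t = Task(
        prompt="Summarize this 500-word article into three bullet points.",
        priority=priority,
    )
    print(priority, "->", router.route(t))
```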
## Fingerprint Standard Research

Exploring an open standard for LLM capability description to enable intelligent routing decisions.
```json
{
  "model_id": "claude-3-haiku-20240307",
  "cost_per_million_tokens": { "input": 0.25, "output": 1.25 },
  "avg_latency_ms_per_1k_tokens": 350,
  "context_window": 200000,
  "performance_by_task": {
    "summarization": { "quality_score": 8.5 },
    "code_generation": { "quality_score": 6.0 }
  }
}
```
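To make the routing concrete, here is a sketch of how a router could consume a directory of such fingerprints. The `load_fingerprints` helper, the file layout, and the scoring rules are assumptions for illustration, not part of the proposed standard; only the field names come from the example above.

```python
import json
from pathlib import Path


def load_fingerprints(directory: str) -> list[dict]:
    """Read every fingerprint-style JSON file in a directory (assumed layout)."""
    return [json.loads(p.read_text()) for p in Path(directory).glob("*.json")]


def rank_models(fingerprints: list[dict], task_type: str, priority: str) -> list[dict]:
    """Order candidate models for a task under a given priority."""
    if priority == "cost":
        # Cheapest blended token price first.
        def key(f): return (f["cost_per_million_tokens"]["input"]
                            + f["cost_per_million_tokens"]["output"])
        reverse = False
    elif priority == "latency":
        # Fastest first.
        def key(f): return f["avg_latency_ms_per_1k_tokens"]
        reverse = False
    else:
        # Highest quality score for this task type first; unknown tasks score 0.
        def key(f): return f["performance_by_task"].get(task_type, {}).get("quality_score", 0.0)
        reverse = True
    return sorted(fingerprints, key=key, reverse=reverse)


# Example: cheapest model first for summarization.
ranked = rank_models(load_fingerprints("./fingerprints"), "summarization", "cost")
print(ranked[0]["model_id"])
```

Because the fingerprint is plain JSON, the same data could back all three built-in strategies as well as community benchmarking tools.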