Hapax is a production-ready AI infrastructure layer that ensures uninterrupted AI operations through intelligent provider management and automatic failover. Named after the Greek word ἅπαξ (meaning "once"), it embodies our core promise: configure once, then let it seamlessly manage your AI infrastructure.
Organizations face several critical challenges in managing their AI infrastructure. Service disruptions from AI provider outages create direct revenue impacts, while engineering teams dedicate significant resources to managing multiple AI providers. Teams struggle with limited visibility into AI usage across departments, compounded by complex integration requirements spanning different AI providers.
Hapax delivers a robust infrastructure layer through three core capabilities:
The system maintains continuous service through real-time health monitoring with configurable timeouts and check intervals. Automatic failover between providers keeps requests flowing when a provider fails, while a three-state circuit breaker (closed, open, half-open) with configurable thresholds prevents cascading failures. Request deduplication using the singleflight pattern ensures that identical concurrent requests are executed only once.
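To make the three breaker states concrete, here is a minimal Python sketch. It is not Hapax's implementation; the class name, method names, and parameters are hypothetical, and the defaults only loosely mirror the failureThreshold and timeout values in the sample configuration.

```python
import time


class CircuitBreaker:
    """Illustrative three-state breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=5, reset_timeout=10.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before probing
        self.clock = clock                          # injectable so tests can control time
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self):
        """Return True if a request may be sent to the provider."""
        if self.state == "open" and self.clock() - self.opened_at >= self.reset_timeout:
            # Simplification: this sketch does not cap the number of trial requests.
            self.state = "half-open"
        return self.state != "open"

    def record_success(self):
        """A success closes the breaker and clears the failure count."""
        self.failures = 0
        self.state = "closed"

    def record_failure(self):
        """Open on a failed trial request or once the threshold is reached."""
        self.failures += 1
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

A caller would check allow() before each provider request and report the outcome with record_success() or record_failure().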
The architecture prioritizes reliability through high-performance request routing and load balancing. Comprehensive error handling and request validation ensure data integrity, while structured logging with request tracing enables detailed debugging. Configurable timeout and rate limiting mechanisms protect system resources.
Security is foundational, implemented through API key-based authentication and comprehensive request validation and sanitization. The monitoring system provides granular usage tracking per endpoint and detailed request logging for operational visibility.
Hapax provides built-in monitoring capabilities through Prometheus integration, offering comprehensive visibility into your AI infrastructure:
Monitor API usage through versioned endpoints:
# Standard endpoint structure
/v1/completions   # AI completions
/health           # Global system health status
/v1/health        # Versioned API health status
/metrics          # Prometheus metrics
The monitoring system tracks essential metrics including request counts and status by endpoint, request latencies, active request volume, error rates by provider, and circuit breaker states. Health check performance metrics and request deduplication statistics provide deep insights into system efficiency.
Each metric is designed for operational visibility:
- hapax_http_requests_total: tracks request volume by endpoint and status
- hapax_http_request_duration_seconds: measures request latency
- hapax_http_active_requests: shows current load by endpoint
- hapax_errors_total: monitors error rates by type
- circuit_breaker_state: indicates provider health status
- hapax_health_check_duration_seconds: validates provider responsiveness
- hapax_deduplicated_requests_total: confirms request efficiency
- hapax_rate_limit_hits_total: tracks rate limiting by client
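As a rough illustration of consuming these metrics outside Prometheus, the sketch below sums every sample of one metric from text in the Prometheus exposition format, which is what a /metrics endpoint serves. The sample text, its label names (endpoint, status), and its values are made up for the example; only the metric names come from the list above.

```python
def sum_metric(exposition_text, metric_name):
    """Sum all samples of one metric in Prometheus text exposition format."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name = line.split("{", 1)[0]      # metric name precedes the label set
            rest = line.rsplit("}", 1)[1]     # value (and optional timestamp) follow it
        else:
            name, _, rest = line.partition(" ")
        if name == metric_name:
            total += float(rest.split()[0])   # first field after the name is the value
    return total


# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# HELP hapax_http_requests_total Total HTTP requests.
# TYPE hapax_http_requests_total counter
hapax_http_requests_total{endpoint="/v1/completions",status="200"} 40
hapax_http_requests_total{endpoint="/v1/completions",status="500"} 2
hapax_http_active_requests 3
"""

print(sum_metric(SAMPLE, "hapax_http_requests_total"))  # 42.0
```

The match is on the exact metric name, so histogram series such as those ending in _bucket, _sum, or _count would need to be requested by their full names.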
Security is enforced through API key-based authentication, with per-endpoint rate limiting and comprehensive request validation and sanitization.
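This document does not say which rate limiting algorithm Hapax uses. As one common way to implement such a limit, the sketch below shows a token bucket; the class and parameter names are hypothetical.

```python
import time


class TokenBucket:
    """Illustrative token bucket: allows bursts up to `capacity`, refills at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock              # injectable so tests can control time
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Consume one token if available and report whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Per-endpoint or per-client limiting could then be approximated by keeping one bucket per key in a dictionary.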
// Example: Completion Request
{
  "messages": [
    {"role": "system", "content": "You are a customer service assistant."},
    {"role": "user", "content": "I need help with my order #12345"}
  ]
}
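A client could assemble this request as sketched below. The base URL assumes the local Docker setup on port 8080. The Authorization header with a Bearer scheme is an assumption, since this document does not state how the API key is passed. The function name is hypothetical.

```python
import json
import urllib.request


def build_completion_request(base_url, api_key, messages):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the header name and scheme are not specified in this document.
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_completion_request(
    "http://localhost:8080",
    "your_hapax_key",
    [
        {"role": "system", "content": "You are a customer service assistant."},
        {"role": "user", "content": "I need help with my order #12345"},
    ],
)
```

Once a Hapax instance is running, the request can be sent with urllib.request.urlopen(request).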
When your primary provider experiences issues, Hapax:
- Detects the failure through continuous health checks (1-minute intervals by default)
- Activates the circuit breaker after 3 consecutive failures (the threshold is configurable; the sample config.yaml sets failureThreshold: 5)
- Routes traffic to healthy backup providers in preference order
- Maintains detailed metrics for operational visibility
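The routing step in the list above can be sketched as follows. This is a simplified model under stated assumptions, not Hapax's code: the ProviderPool class and its methods are hypothetical, and a provider becomes available again on the next successful health check rather than through a half-open probe.

```python
class ProviderPool:
    """Illustrative failover: pick the first provider, in preference order, that is not failing."""

    def __init__(self, preference, failure_threshold=3):
        self.preference = list(preference)          # highest-priority provider first
        self.failure_threshold = failure_threshold  # consecutive failures before skipping
        self.consecutive_failures = {name: 0 for name in self.preference}

    def record_health_check(self, name, ok):
        """Reset the counter on success, increment it on failure."""
        if ok:
            self.consecutive_failures[name] = 0
        else:
            self.consecutive_failures[name] += 1

    def is_available(self, name):
        return self.consecutive_failures[name] < self.failure_threshold

    def pick(self):
        """Return the preferred available provider, or None if all are failing."""
        for name in self.preference:
            if self.is_available(name):
                return name
        return None


pool = ProviderPool(["ollama", "anthropic", "openai"])
for _ in range(3):
    pool.record_health_check("ollama", ok=False)
print(pool.pick())  # anthropic
```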
Deploy Hapax in minutes with our production-ready container:
docker run -p 8080:8080 \
-e OPENAI_API_KEY=your_key \
-e ANTHROPIC_API_KEY=your_key \
-e CONFIG_PATH=/app/config.yaml \
teilomillet/hapax:latest
A default configuration is provided and can be customized via config.yaml:
server:
  port: 8080
  read_timeout: 30s
  write_timeout: 45s
  max_header_bytes: 2097152 # 2MB
  shutdown_timeout: 30s
  http3: # Optional HTTP/3 support
    enabled: true
    port: 443 # Default HTTPS/QUIC port
    tls_cert_file: "/path/to/cert.pem"
    tls_key_file: "/path/to/key.pem"
    idle_timeout: 30s
    max_bi_streams_concurrent: 100
    max_uni_streams_concurrent: 100
    max_stream_receive_window: 6291456 # 6MB
    max_connection_receive_window: 15728640 # 15MB
circuitBreaker:
  maxRequests: 100
  interval: 30s
  timeout: 10s
  failureThreshold: 5
providerPreference:
  - ollama
  - anthropic
  - openai
Hapax provides comprehensive integration capabilities through multiple components:
The API architecture provides dedicated endpoints for core functionalities:
- /v1/completions handles AI completions.
- /v1/health provides versioned API health monitoring.
- /health offers global system health status.
- /metrics exposes Prometheus metrics for comprehensive monitoring.
The monitoring infrastructure integrates Prometheus metrics across all critical components, enabling detailed tracking of request latencies, circuit breaker states, provider health status, and request deduplication. This comprehensive approach ensures complete operational visibility.
The health monitoring system operates with enterprise-grade configurability. Check intervals default to one minute with adjustable timeouts, while failure thresholds are tuned to prevent false positives. Health monitoring extends from individual providers to Docker container status, with granular per-provider health tracking.
System integrity is maintained through multiple safeguards: request deduplication prevents redundant processing, automatic failover ensures continuous operation, circuit breaker patterns protect against cascade failures, and structured JSON logging with correlation IDs enables thorough debugging.
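Request deduplication with the singleflight pattern means that concurrent callers asking for the same key share one execution and one result. The sketch below illustrates the idea with Python threads. It is not Hapax's implementation, and the SingleFlight class and its do method are hypothetical names.

```python
import threading


class SingleFlight:
    """Illustrative singleflight: coalesce concurrent calls that share a key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._calls = {}  # key -> record of the in-flight call

    def do(self, key, fn):
        """Run fn once per key at a time; concurrent duplicates wait and share the result."""
        with self._lock:
            call = self._calls.get(key)
            leader = call is None
            if leader:
                call = {"done": threading.Event(), "result": None, "error": None}
                self._calls[key] = call
        if leader:
            try:
                call["result"] = fn()
            except Exception as exc:
                call["error"] = exc  # propagate the same error to every waiter
            finally:
                with self._lock:
                    del self._calls[key]  # later calls start a fresh execution
                call["done"].set()
        else:
            call["done"].wait()
        if call["error"] is not None:
            raise call["error"]
        return call["result"]
```

Results are shared only among calls that overlap in time. This is deduplication of in-flight work, not a cache, so a later call with the same key runs fn again.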
The server supports both HTTP/1.1 and HTTP/3 (QUIC) protocols:
- HTTP/1.1 for universal compatibility
- HTTP/3 for improved performance:
  - Reduced latency through 0-RTT connections
  - Better multiplexing with independent streams
  - Improved congestion control
  - Automatic connection migration
  - Built-in TLS 1.3 encryption
Running Hapax requires:
- Docker-compatible environment with network access to AI providers
- 1GB RAM minimum (4GB recommended for production)
- TLS certificates for HTTP/3 support (if enabled)
- Access credentials (API keys) for supported providers: OpenAI, Anthropic, etc.
Comprehensive documentation is available through multiple resources. The Quick Start Guide provides initial setup instructions, while detailed information about the API and security measures can be found in the API Documentation and Security Overview. For operational insights, consult the Monitoring Guide.
Licensed under Apache 2.0. See LICENSE for details.
For detailed technical specifications, visit our Technical Documentation.