Hapax provides a reliable API layer for interacting with LLM providers. This documentation covers the available endpoints, authentication, request/response formats, and error handling.
All API requests require authentication using an API key. Include your API key in the request headers:
X-API-Key: your_api_key_here
The completion API supports text completion, chat completion, and function calling.
Process completion requests with support for simple text input, chat messages, and function calling.
{
  "messages": [
    {
      "role": "user",
      "content": "Hello, how can I help you today?"
    }
  ],
  "input": "Alternative simple text input",
  "function_description": "Optional function description for function calling"
}
Parameters:
- messages (array, optional): Array of message objects with role and content. Required if input is not provided.
  - role (string): Message role ("user", "assistant", "system")
  - content (string): Message content
- input (string, optional): Simple text input for backward compatibility. Required if messages is not provided.
- function_description (string, optional): Description of the function for function calling requests.
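The rule that either messages or input must be present can be checked on the client before a request is sent. The following is a minimal sketch; `build_completion_payload` and `VALID_ROLES` are hypothetical names, not part of any Hapax client library. This documentation does not say what happens when both fields are supplied, so the sketch simply forwards whatever it is given.

```python
from typing import Optional

# Roles listed in the parameter description above.
VALID_ROLES = {"user", "assistant", "system"}


def build_completion_payload(
    messages: Optional[list] = None,
    input_text: Optional[str] = None,
    function_description: Optional[str] = None,
) -> dict:
    """Build a completion request body (hypothetical helper).

    Enforces the documented rule that 'messages' or 'input' must be present.
    """
    if not messages and input_text is None:
        raise ValueError("either 'messages' or 'input' must be provided")

    payload: dict = {}
    if messages:
        for msg in messages:
            if msg.get("role") not in VALID_ROLES:
                raise ValueError(f"invalid role: {msg.get('role')!r}")
            if not isinstance(msg.get("content"), str):
                raise ValueError("message content must be a string")
        payload["messages"] = messages
    if input_text is not None:
        payload["input"] = input_text
    if function_description is not None:
        payload["function_description"] = function_description
    return payload
```

Validating locally avoids a round trip that would otherwise end in a 400 response.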
{
  "completion": "Response text from the LLM",
  "request_id": "unique-request-id"
}
- 400 Bad Request: Invalid request format or missing required fields
- 401 Unauthorized: Invalid or missing API key
- 429 Too Many Requests: Rate limit exceeded
- 500 Internal Server Error: Processing or system error
curl -X POST https://api.hapax.ai/v1/completion \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key_here" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
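For callers that prefer not to shell out to curl, the same request can be sketched with the Python standard library alone. The URL and headers are taken from the curl example; `build_request` and `complete` are hypothetical helper names, and the 30-second default mirrors the client timeout recommended in this documentation.

```python
import json
import urllib.request

# URL copied from the curl example above.
API_URL = "https://api.hapax.ai/v1/completion"


def build_request(api_key: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def complete(api_key: str, messages: list, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    req = build_request(api_key, messages)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `complete("your_api_key_here", [{"role": "user", "content": "What is the capital of France?"}])` would return the decoded response object; non-2xx responses surface as `urllib.error.HTTPError`.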
All error responses follow a consistent format:
{
  "error": {
    "type": "ValidationError",
    "message": "Detailed error message",
    "code": 400,
    "request_id": "unique-request-id",
    "details": {
      "field": "Additional error context"
    }
  }
}
- ValidationError: Request validation failures
- AuthenticationError: Authentication issues
- RateLimitError: Rate limit exceeded
- ProcessingError: LLM or processing failures
- InternalError: Unexpected system errors
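Because every error shares one envelope, a client can map it onto a single exception type. The sketch below assumes the field names shown in the error format above; `HapaxAPIError` and `parse_error` are hypothetical names and are not taken from any official client.

```python
import json
from typing import Optional


class HapaxAPIError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(
        self,
        type_: str,
        message: str,
        code: int,
        request_id: Optional[str] = None,
        details: Optional[dict] = None,
    ):
        super().__init__(f"{type_} ({code}): {message}")
        self.type = type_
        self.message = message
        self.code = code
        self.request_id = request_id
        self.details = details or {}


def parse_error(body: str) -> HapaxAPIError:
    """Turn a JSON error response body into a HapaxAPIError."""
    err = json.loads(body)["error"]
    return HapaxAPIError(
        type_=err["type"],
        message=err["message"],
        code=err["code"],
        request_id=err.get("request_id"),
        details=err.get("details"),
    )
```

Keeping `request_id` on the exception makes it easy to quote in logs or support requests.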
The API implements rate limiting based on your API key. Rate limits are configurable and can be monitored through the provided metrics.
Headers returned with rate limit information:
- X-RateLimit-Limit: Total requests allowed per window
- X-RateLimit-Remaining: Remaining requests in current window
- X-RateLimit-Reset: Time until the rate limit resets (in seconds)
- Request IDs: Include an X-Request-ID header for request tracking
- Retries: Implement exponential backoff for rate limit errors
- Timeouts: Set appropriate client timeouts (recommended: 30s)
- Monitoring: Use the provided metrics endpoints for monitoring
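The retry recommendation can be sketched as a small wrapper around any call that may hit a 429. Everything here is illustrative: `RateLimited`, `backoff_delay`, and `call_with_retries` are hypothetical names, the base delay, cap, and jitter fraction are arbitrary choices, and `reset_after` assumes the X-RateLimit-Reset header carries seconds until reset, as described above.

```python
import random
import time
from typing import Callable, Optional


class RateLimited(Exception):
    """Raised by the caller-supplied function when it receives a 429."""

    def __init__(self, reset_after: Optional[float] = None):
        super().__init__("rate limit exceeded")
        # Assumed to be seconds until reset, read from X-RateLimit-Reset.
        self.reset_after = reset_after


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Delay before retry `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_retries(
    call: Callable[[], object],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    jitter: bool = True,
):
    """Run `call`, retrying with exponential backoff on RateLimited."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited as exc:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error
            delay = backoff_delay(attempt)
            if exc.reset_after is not None:
                # Never retry before the server says the window resets.
                delay = max(delay, exc.reset_after)
            if jitter:
                # Small random spread so clients do not retry in lockstep.
                delay += random.uniform(0, delay * 0.1)
            sleep(delay)
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without actually waiting.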
The API uses URL versioning (e.g., /v1/completion). Breaking changes will be introduced in new API versions, while the current version will be maintained for backward compatibility.
Hapax provides comprehensive health monitoring through global, route-specific, and provider-level health check endpoints.
Global health check endpoint that returns the status of all services and providers.
{
  "status": {
    "global": true
  },
  "services": {
    "/v1/completion": "healthy",
    "/v1/other-route": "healthy"
  },
  "providers": {
    "openai": {
      "healthy": true,
      "last_check": "2024-01-01T12:00:00Z",
      "consecutive_fails": 0,
      "latency_ms": 150,
      "error_count": 0,
      "request_count": 1000
    },
    "anthropic": {
      "healthy": true,
      "last_check": "2024-01-01T12:00:00Z",
      "consecutive_fails": 0,
      "latency_ms": 200,
      "error_count": 0,
      "request_count": 500
    }
  }
}
- status.global (boolean): Overall health status
- services (object): Status of individual routes
  - Keys are route paths
  - Values are either "healthy" or "unhealthy"
- providers (object): Status of LLM providers
  - healthy (boolean): Whether the provider is currently operational
  - last_check (string): ISO 8601 timestamp of the last health check
  - consecutive_fails (integer): Number of consecutive health check failures
  - latency_ms (integer): Last observed latency in milliseconds
  - error_count (integer): Total number of errors since last healthy state
  - request_count (integer): Total number of requests processed
- 200 OK: All services and providers are healthy
- 503 Service Unavailable: One or more services or providers are unhealthy
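A monitoring script usually needs only the unhealthy entries from the global health response. The sketch below reads the response shape shown above; `summarize_health` is a hypothetical helper, and treating a missing field as unhealthy is an assumption made here, not documented behavior.

```python
import json


def summarize_health(body: str) -> dict:
    """Reduce a global health response to its overall flag and unhealthy entries."""
    data = json.loads(body)
    unhealthy_services = sorted(
        route
        for route, state in data.get("services", {}).items()
        if state != "healthy"
    )
    unhealthy_providers = sorted(
        name
        for name, info in data.get("providers", {}).items()
        if not info.get("healthy", False)  # missing flag counts as unhealthy
    )
    return {
        "global": bool(data.get("status", {}).get("global", False)),
        "unhealthy_services": unhealthy_services,
        "unhealthy_providers": unhealthy_providers,
    }
```

Since the endpoint returns 503 when anything is unhealthy, the body should still be parsed on a non-200 response rather than discarded.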
Health check endpoint for a specific route.
{
  "status": "healthy"
}
- status (string): Either "healthy" or "unhealthy"

- 200 OK: Service is healthy
- 503 Service Unavailable: Service is unhealthy
Health check endpoint for a specific LLM provider.
{
  "healthy": true,
  "last_check": "2024-01-01T12:00:00Z",
  "consecutive_fails": 0,
  "latency_ms": 150,
  "error_count": 0,
  "request_count": 1000
}
- 200 OK: Provider is healthy
- 503 Service Unavailable: Provider is unhealthy
- 404 Not Found: Provider not found
- Check Frequency
  - Health checks run every minute
  - Providers are checked in parallel
  - Results are cached until next check
- Provider Health Checks
  - Simple prompt sent to verify provider responsiveness
  - 5-second timeout for each check
  - Consecutive failures tracked
  - Latency monitored
- Health Status Transitions
  - Provider marked unhealthy after any failure
  - Error count reset when returning to healthy state
  - Metrics updated on status changes
- Monitoring Integration
  - Health status exposed via Prometheus metrics
  - hapax_provider_healthy: Gauge (0/1) for each provider
  - hapax_health_check_duration_seconds: Health check latency
  - hapax_health_check_errors_total: Total health check failures
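The status transitions listed above can be modeled as a small state record. This is an illustrative Python sketch of the described rules only, not the server's implementation; `ProviderHealth` and `record_check` are hypothetical names, and the field names are borrowed from the provider health response.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ProviderHealth:
    """Per-provider health state, mirroring the response fields."""

    healthy: bool = True
    last_check: Optional[datetime] = None
    consecutive_fails: int = 0
    latency_ms: int = 0
    error_count: int = 0

    def record_check(self, ok: bool, latency_ms: int) -> None:
        """Apply the result of one health check."""
        self.last_check = datetime.now(timezone.utc)
        self.latency_ms = latency_ms
        if ok:
            if not self.healthy:
                # Error count resets when returning to the healthy state.
                self.error_count = 0
            self.healthy = True
            self.consecutive_fails = 0
        else:
            # Any single failure marks the provider unhealthy.
            self.healthy = False
            self.consecutive_fails += 1
            self.error_count += 1
```

A check that exceeds the 5-second timeout would be recorded as `record_check(False, 5000)` in this model.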
Hapax exposes Prometheus metrics for monitoring and observability.
Returns Prometheus-formatted metrics about the server's operation.
- HTTP Request Metrics
  - hapax_http_requests_total: Total number of HTTP requests by endpoint and status
  - hapax_http_request_duration_seconds: Duration of HTTP requests in seconds
  - hapax_http_active_requests: Number of currently active HTTP requests
- Error Metrics
  - hapax_errors_total: Total number of errors by type
- Rate Limiting Metrics
  - hapax_rate_limit_hits_total: Total number of rate limit hits by client
- System Metrics
  - Standard Go runtime metrics (memory, goroutines, etc.)
  - Process metrics (CPU, file descriptors, etc.)
Plain text in Prometheus exposition format:
# HELP hapax_http_requests_total Total number of HTTP requests by endpoint and status
# TYPE hapax_http_requests_total counter
hapax_http_requests_total{endpoint="/health",status="200"} 42
...
- 200 OK: Metrics successfully retrieved
curl http://your-server:8080/metrics
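For quick scripts that read the metrics output without a Prometheus server, the sample lines can be parsed directly. This is a deliberately simplified sketch: `parse_metrics` is a hypothetical helper, it skips comment lines, ignores optional timestamps, and does not unescape label values. A production consumer should use a real Prometheus parser instead.

```python
import re

# metric_name{label="value",...} value
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list:
    """Return (name, labels, value) tuples for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or HELP/TYPE comment
        match = _SAMPLE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Each tuple can then be filtered by name, for example to sum hapax_http_requests_total across status labels.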
All API requests require authentication using an API key. The API key must be included in the request headers:
X-API-Key: your_api_key_here
Contact your system administrator or use the Hapax management interface to generate API keys.
- Secure Storage: Store API keys securely and never commit them to version control
- Regular Rotation: Rotate API keys periodically (recommended: every 90 days)
- Least Privilege: Use different API keys for different environments (development, staging, production)
- Monitoring: Monitor API key usage through the metrics endpoint
- Revocation: Have a process ready to quickly revoke compromised API keys
- TLS: Always use HTTPS for API requests
- Request IDs: Include unique request IDs for request tracing
- Timeouts: Implement appropriate timeouts to prevent hanging connections
- Rate Limiting: Respect rate limits and implement exponential backoff
- Error Handling: Never expose internal errors to clients
- Input Validation: Validate all input parameters before processing
- Connection Establishment
  - TLS handshake (HTTP/3 with 0-RTT support)
  - Connection multiplexing optimization
- Request Processing
  - Authentication verification
  - Rate limit checking
  - Request validation
  - Request queuing (if enabled)
  - LLM provider selection
  - Response generation
  - Response formatting
- Response Delivery
  - Response compression (if enabled)
  - Response streaming (for supported endpoints)
Hapax employs several middleware components to ensure reliable and secure API operations:
- Authentication Middleware
  - Validates API keys
  - Enforces authentication requirements
  - Returns 401 Unauthorized for invalid keys
- Request ID Middleware
  - Generates unique request IDs
  - Accepts client-provided IDs via X-Request-ID
  - Ensures request traceability
- Rate Limiting Middleware
  - Enforces per-key rate limits
  - Returns rate limit headers:
    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 999
    X-RateLimit-Reset: 1640995200
- Timeout Middleware
  - Enforces request timeouts
  - Default timeout: 30 seconds
  - Configurable per route
- Panic Recovery Middleware
  - Catches unhandled panics
  - Returns 500 Internal Server Error
  - Logs error details for debugging
- Metrics Middleware
  - Records request metrics
  - Tracks response times
  - Monitors error rates
- Queue Middleware
  - Manages request queuing
  - Implements fair scheduling
  - Prevents system overload
- CORS Middleware
  - Handles cross-origin requests
  - Configurable CORS policies
  - Pre-flight request handling
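The way such middleware components wrap one another can be illustrated with a small handler chain. This is a conceptual Python sketch of three of the components listed above, not the server's code: requests and responses are plain dictionaries, all function names are hypothetical, and the ordering used in the usage note is an assumption.

```python
import uuid
from typing import Callable

Handler = Callable[[dict], dict]


def recovery_middleware(next_handler: Handler) -> Handler:
    """Convert any unhandled exception into a 500 response."""
    def handler(request: dict) -> dict:
        try:
            return next_handler(request)
        except Exception:
            return {"status": 500, "error": "InternalError"}
    return handler


def request_id_middleware(next_handler: Handler) -> Handler:
    """Reuse a client-provided X-Request-ID or generate a new one."""
    def handler(request: dict) -> dict:
        headers = request["headers"]
        request_id = headers.get("X-Request-ID") or str(uuid.uuid4())
        headers["X-Request-ID"] = request_id
        response = next_handler(request)
        response["request_id"] = request_id
        return response
    return handler


def auth_middleware(valid_keys: set) -> Callable[[Handler], Handler]:
    """Reject requests whose X-API-Key is not in valid_keys."""
    def wrap(next_handler: Handler) -> Handler:
        def handler(request: dict) -> dict:
            if request["headers"].get("X-API-Key") not in valid_keys:
                return {"status": 401, "error": "AuthenticationError"}
            return next_handler(request)
        return handler
    return wrap


def chain(handler: Handler, *middlewares: Callable[[Handler], Handler]) -> Handler:
    """Wrap handler so that the first middleware listed runs outermost."""
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler
```

With `chain(endpoint, recovery_middleware, request_id_middleware, auth_middleware({"key"}))`, a rejected request still carries a request ID because the request ID layer sits outside the authentication layer.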
All requests are validated for:
- Content Type
  - Must be application/json for POST requests
  - Charset must be UTF-8
- Request Size
  - Maximum body size: 1MB
  - Configurable per route
- Required Fields
  - All required fields must be present
  - Field types must match specifications
- Input Format
  - JSON must be well-formed
  - Strings must be valid UTF-8
  - Numbers must be within allowed ranges
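Several of these checks can be mirrored on the client to catch malformed requests early. The sketch below covers content type, charset, body size, UTF-8, and JSON well-formedness; `validate_request` is a hypothetical helper, 1MB is assumed to mean 1,048,576 bytes, and requiring a top-level JSON object is an assumption based on the request examples.

```python
import json

# Assumes 1MB means 1024 * 1024 bytes; the documentation does not specify.
MAX_BODY_BYTES = 1024 * 1024


def validate_request(content_type: str, body: bytes) -> dict:
    """Validate a POST body and return the decoded JSON object."""
    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "application/json":
        raise ValueError("Content-Type must be application/json")

    charset = "utf-8"  # default when no charset parameter is given
    for param in params.split(";"):
        key, _, value = param.partition("=")
        if key.strip().lower() == "charset":
            charset = value.strip().strip('"').lower()
    if charset not in ("utf-8", "utf8"):
        raise ValueError("charset must be UTF-8")

    if len(body) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds 1MB")

    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("body is not valid UTF-8") from exc
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError("body is not well-formed JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("body must be a JSON object")
    return data
```

Required-field and numeric range checks depend on the specific route and are left out of this sketch.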
Official client libraries are available for:
- Python: hapax-python
- JavaScript/TypeScript: @hapax/node
- Go: github.com/teilomillet/hapax/client
Example using the Python client:
from hapax import Client

client = Client(api_key="your_api_key_here")
response = client.completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.completion)
Current API version: v1
- v1 (Current): Initial stable release
  - Basic completion API
  - Health checks
  - Metrics endpoint
- Major versions are supported for at least 12 months
- Deprecation notices are announced 6 months in advance
- Security patches are provided for all supported versions