API Service Template
A FastAPI + ML model template for production-ready AI services
Quick Start
# Clone template
git clone https://github.com/wearehybrid/api-template.git
cd api-template

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
cp .env.example .env

# Start development server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
Features
🚀 Production Ready
- Async request handling
- Automatic API documentation
- Request validation with Pydantic
- Error handling & logging
- Health checks & monitoring
🤖 AI Integration
- Model loading & caching
- Batch inference support
- Multiple model backends
- GPU acceleration
- Model versioning
API Architecture
📡 Client (HTTP requests) → ⚡ FastAPI (request processing) → 🧠 Models (AI inference) → 📊 Response (JSON results)
Code Structure
api-service/
├── app/
│   ├── api/
│   │   ├── v1/
│   │   │   ├── endpoints/
│   │   │   │   ├── predict.py
│   │   │   │   ├── health.py
│   │   │   │   └── models.py
│   │   │   └── api.py
│   │   └── deps.py
│   ├── core/
│   │   ├── config.py
│   │   ├── logging.py
│   │   └── security.py
│   ├── models/
│   │   ├── ml_models.py
│   │   └── schemas.py
│   └── services/
│       ├── prediction.py
│       └── model_manager.py
├── tests/
├── docker/
├── requirements.txt
└── main.py
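main.py (below) imports api_router from app/api/v1/api.py, which aggregates the endpoint modules in the tree above. A plausible sketch of that wiring, assuming each endpoint module exposes its own APIRouter; the tags are illustrative:

# app/api/v1/api.py (hypothetical sketch)
from fastapi import APIRouter
from app.api.v1.endpoints import predict, health, models

api_router = APIRouter()

# Each endpoint module contributes its own APIRouter instance
api_router.include_router(predict.router, tags=["predict"])
api_router.include_router(health.router, tags=["health"])
api_router.include_router(models.router, tags=["models"])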
Key Components
Main Application (main.py)
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.api.v1.api import api_router
from app.core.config import settings
app = FastAPI(
    title="AI API Service",
    description="Production-ready AI inference API",
    version="1.0.0",
    openapi_url="/api/v1/openapi.json"
)

# CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.ALLOWED_HOSTS,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
# Include API routes
app.include_router(api_router, prefix="/api/v1")
@app.on_event("startup")
async def startup_event():
    # Load ML models on startup (newer FastAPI releases prefer a
    # lifespan handler over the deprecated on_event hook)
    from app.services.model_manager import ModelManager
    await ModelManager.load_models()
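main.py imports `settings` from app/core/config.py, which is not shown on this page. A minimal sketch, assuming Pydantic v1-style BaseSettings (matching the `schema_extra` usage below); the ALLOWED_HOSTS default is illustrative:

# app/core/config.py (hypothetical sketch)
from typing import List
from pydantic import BaseSettings

class Settings(BaseSettings):
    # Origins passed to the CORS middleware in main.py
    ALLOWED_HOSTS: List[str] = ["http://localhost:3000"]

    class Config:
        env_file = ".env"  # populated from .env.example in the Quick Start

settings = Settings()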
Prediction Endpoint
from fastapi import APIRouter, Depends, HTTPException
from app.models.schemas import PredictionRequest, PredictionResponse
from app.services.prediction import PredictionService

router = APIRouter()

@router.post("/predict", response_model=PredictionResponse)
async def predict(
    request: PredictionRequest,
    service: PredictionService = Depends()
):
    try:
        result = await service.predict(
            data=request.data,
            model_name=request.model_name
        )
        return PredictionResponse(
            prediction=result.prediction,
            confidence=result.confidence,
            processing_time=result.processing_time,
            model_version=result.model_version  # required by the response schema
        )
    except ValueError as e:
        # Unknown model names surface as 404s rather than generic 500s
        raise HTTPException(status_code=404, detail=str(e))
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
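PredictionService itself is not shown on this page. A minimal sketch of app/services/prediction.py, assuming a classifier-style PyTorch model and a result object carrying the fields the endpoint reads; names and defaults here are illustrative:

# app/services/prediction.py (hypothetical sketch)
import time
from dataclasses import dataclass
from typing import List

import torch

from app.services.model_manager import ModelManager

@dataclass
class PredictionResult:
    prediction: List[float]
    confidence: float
    processing_time: float
    model_version: str

class PredictionService:
    async def predict(self, data: List[float], model_name: str) -> PredictionResult:
        model = ModelManager.get_model(model_name)
        start = time.perf_counter()
        with torch.no_grad():
            # Assumes the model maps a single feature vector to class logits
            probs = model(torch.tensor([data])).softmax(dim=-1).squeeze(0)
        return PredictionResult(
            prediction=probs.tolist(),
            confidence=float(probs.max()),
            processing_time=time.perf_counter() - start,
            model_version="v1.0.0",  # placeholder; real code would track this per model
        )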
Request/Response Schemas
Request Schema
from pydantic import BaseModel
from typing import List, Optional

class PredictionRequest(BaseModel):
    data: List[float]
    model_name: str = "default"
    batch_size: Optional[int] = 1

    class Config:
        schema_extra = {
            "example": {
                "data": [1.0, 2.0, 3.0],
                "model_name": "classifier",
                "batch_size": 1
            }
        }
Response Schema
class PredictionResponse(BaseModel):
    prediction: List[float]
    confidence: float
    processing_time: float
    model_version: str

    class Config:
        schema_extra = {
            "example": {
                "prediction": [0.8, 0.2],
                "confidence": 0.85,
                "processing_time": 0.023,
                "model_version": "v1.2.0"
            }
        }
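With the server from the Quick Start running, a request matching these schemas can be sent from any HTTP client. A short Python example using the `requests` library (an assumed extra dependency, not part of the template):

import requests

resp = requests.post(
    "http://localhost:8000/api/v1/predict",
    json={"data": [1.0, 2.0, 3.0], "model_name": "classifier"},
)
resp.raise_for_status()
body = resp.json()
# e.g. {"prediction": [0.8, 0.2], "confidence": 0.85, ...}
print(body["prediction"], body["confidence"])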
Model Management
Model Manager Service
import torch
from typing import Dict, Any
from pathlib import Path

class ModelManager:
    _models: Dict[str, Any] = {}

    @classmethod
    async def load_models(cls):
        """Load all models on startup"""
        model_configs = {
            "classifier": "models/classifier.pt",
            "regressor": "models/regressor.pt"
        }
        for name, path in model_configs.items():
            if Path(path).exists():
                model = torch.load(path, map_location='cpu')
                model.eval()
                cls._models[name] = model
                print(f"Loaded model: {name}")

    @classmethod
    def get_model(cls, name: str):
        if name not in cls._models:
            raise ValueError(f"Model {name} not found")
        return cls._models[name]

    @classmethod
    def list_models(cls):
        return list(cls._models.keys())
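Because the manager owns all model state, the endpoint files under app/api/v1/endpoints/ stay thin. A plausible sketch of models.py (illustrative, not the template's exact code):

# app/api/v1/endpoints/models.py (hypothetical sketch)
from fastapi import APIRouter
from app.services.model_manager import ModelManager

router = APIRouter()

@router.get("/models")
async def list_models():
    # Names of every model currently held in memory
    return {"models": ModelManager.list_models()}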
API Endpoints
Core Endpoints
- POST /api/v1/predict
- GET /api/v1/models
- GET /api/v1/health
- GET /api/v1/metrics
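A matching sketch for health.py, again illustrative; a readiness-style check can report how many models have loaded:

# app/api/v1/endpoints/health.py (hypothetical sketch)
from fastapi import APIRouter
from app.services.model_manager import ModelManager

router = APIRouter()

@router.get("/health")
async def health():
    # Liveness plus a readiness hint: how many models are in memory
    return {"status": "ok", "models_loaded": len(ModelManager.list_models())}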
Documentation
- GET /docs - Swagger UI
- GET /redoc - ReDoc
- GET /openapi.json - OpenAPI spec
Deployment Options
Docker
Containerized deployment
docker build -t ai-api .
- Multi-stage builds
- GPU support
- Health checks
Kubernetes
Scalable orchestration
kubectl apply -f k8s/
- Auto-scaling
- Load balancing
- Rolling updates
Cloud Run
Serverless deployment
gcloud run deploy
- Pay per request
- Auto-scaling
- HTTPS included
Performance & Monitoring
- 📊 Metrics Collection: Prometheus metrics for request latency, throughput, and error rates
- 🚀 Caching Layer: Redis caching for frequently requested predictions (see the sketch below)
- ⚡ Async Processing: Background task queues for batch processing
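As an illustration of the caching layer, a minimal sketch using redis-py's asyncio client, assuming the JSON-serializable result shape from the schemas above; the key derivation and TTL are illustrative choices, not the template's exact code:

# Hypothetical caching helper, not part of the template as shown
import hashlib
import json

import redis.asyncio as redis

cache = redis.Redis(host="localhost", port=6379)

async def cached_predict(service, data, model_name, ttl_seconds=300):
    # Key the cache on the model name plus the exact input payload
    key = "pred:" + hashlib.sha256(
        json.dumps({"model": model_name, "data": data}, sort_keys=True).encode()
    ).hexdigest()
    hit = await cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = await service.predict(data=data, model_name=model_name)
    payload = {
        "prediction": result.prediction,
        "confidence": result.confidence,
        "processing_time": result.processing_time,
        "model_version": result.model_version,
    }
    await cache.set(key, json.dumps(payload), ex=ttl_seconds)
    return payload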
Build Your AI API
Get started with our production-ready FastAPI template