Image Analysis Template

Python + OpenCV template for real-time image processing and analysis

Quick Start

# Clone template
git clone https://github.com/wearehybrid/vision-template.git
cd vision-template

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run example
python examples/object_detection.py

Features

🎯 Detection & Recognition

• Object detection (YOLO)
• Face recognition
• Text extraction (OCR)
• Barcode/QR code reading
• Custom model integration

🔧 Processing Pipeline

• Real-time video processing
• Batch image processing
• Image enhancement filters
• Multi-format support
• GPU acceleration

Code Structure

vision-template/
├── src/
│   ├── detectors/
│   │   ├── yolo_detector.py
│   │   ├── face_detector.py
│   │   └── ocr_detector.py
│   ├── processors/
│   │   ├── image_processor.py
│   │   └── video_processor.py
│   ├── utils/
│   │   ├── image_utils.py
│   │   └── model_utils.py
│   └── api/
│       └── vision_api.py
├── models/
│   ├── yolo/
│   └── custom/
├── examples/
│   ├── object_detection.py
│   ├── face_recognition.py
│   └── batch_processing.py
└── requirements.txt

Key Components

Object Detection (src/detectors/yolo_detector.py)

from ultralytics import YOLO


class YOLODetector:
    """Thin wrapper around an Ultralytics YOLO model."""

    def __init__(self, model_path='yolov8n.pt'):
        self.model = YOLO(model_path)

    def detect(self, image):
        """Return a list of dicts with 'bbox', 'confidence' and 'class' keys."""
        results = self.model(image)
        detections = []

        for result in results:
            for box in result.boxes:
                # Convert tensors to plain Python numbers so the results can be
                # formatted and JSON-serialized without extra handling
                x1, y1, x2, y2 = box.xyxy[0].tolist()
                confidence = float(box.conf[0])
                class_id = int(box.cls[0])

                detections.append({
                    'bbox': [x1, y1, x2, y2],
                    'confidence': confidence,
                    'class': self.model.names[class_id]
                })

        return detections
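
Downstream code usually wants only confident detections, often for a few classes. The following is a minimal sketch of such a filter, assuming the dictionary layout returned by `detect` above. The name `filter_detections` and its parameters are hypothetical and not part of the template.

```python
def filter_detections(detections, min_confidence=0.5, classes=None):
    """Keep detections at or above min_confidence, optionally limited to given class names."""
    allowed = set(classes) if classes is not None else None
    return [
        d for d in detections
        if d['confidence'] >= min_confidence
        and (allowed is None or d['class'] in allowed)
    ]


# Hand-written sample data in the same shape as YOLODetector.detect() output
sample = [
    {'bbox': [10, 20, 110, 220], 'confidence': 0.91, 'class': 'person'},
    {'bbox': [30, 40, 90, 100], 'confidence': 0.35, 'class': 'dog'},
    {'bbox': [5, 5, 50, 50], 'confidence': 0.77, 'class': 'car'},
]
people = filter_detections(sample, min_confidence=0.5, classes=['person'])
```

With the sample data, `people` contains only the first entry, and calling the filter with its defaults drops the low-confidence `dog` detection.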

Image Processor (src/processors/image_processor.py)

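import cv2

from src.detectors.yolo_detector import YOLODetector
from src.detectors.face_detector import FaceDetector
from src.detectors.ocr_detector import OCRDetector


class ImageProcessor:
    def __init__(self):
        # Map task names to detector instances; each exposes detect(image)
        self.detectors = {
            'yolo': YOLODetector(),
            'face': FaceDetector(),
            'ocr': OCRDetector()
        }

    def process_image(self, image_path, tasks=('yolo',)):
        image = cv2.imread(image_path)
        if image is None:
            # cv2.imread returns None instead of raising on a missing or unreadable file
            raise FileNotFoundError(f"Could not read image: {image_path}")
        results = {}

        # Unknown task names are skipped silently
        for task in tasks:
            if task in self.detectors:
                results[task] = self.detectors[task].detect(image)

        return results

    def annotate_image(self, image, detections):
        """Draw boxes and labels onto the image in place and return it."""
        for detection in detections:
            x1, y1, x2, y2 = (int(v) for v in detection['bbox'])
            label = f"{detection['class']}: {detection['confidence']:.2f}"

            cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
            # Keep the label inside the frame when the box touches the top edge
            cv2.putText(image, label, (x1, max(y1 - 10, 10)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

        return image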
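
The `detectors` dictionary above only relies on each entry having a `detect(image)` method, which suggests one way a custom model could be plugged in. The sketch below illustrates that idea under stated assumptions: the `Detector` protocol, `FullFrameDetector`, and `run_tasks` are hypothetical names, and the toy detector stands in for a real model.

```python
from types import SimpleNamespace
from typing import Any, Protocol


class Detector(Protocol):
    """Anything with a detect(image) method can be registered (hypothetical interface)."""

    def detect(self, image: Any) -> list: ...


class FullFrameDetector:
    """Toy detector that reports one box covering the whole image.

    It only reads image.shape, which OpenCV images (NumPy arrays)
    provide as (height, width, channels).
    """

    def detect(self, image):
        height, width = image.shape[:2]
        return [{'bbox': [0, 0, width, height], 'confidence': 1.0, 'class': 'frame'}]


def run_tasks(detectors, image, tasks):
    """Same dispatch logic as ImageProcessor.process_image, minus the file loading."""
    return {task: detectors[task].detect(image) for task in tasks if task in detectors}


# A stand-in object with a shape attribute, so the sketch runs without OpenCV
fake_image = SimpleNamespace(shape=(480, 640, 3))
results = run_tasks({'frame': FullFrameDetector()}, fake_image, ['frame', 'missing'])
```

In the template itself, the equivalent step would presumably be adding an entry to `self.detectors` in `ImageProcessor.__init__`.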

Usage Examples

Single Image Analysis

from src.processors.image_processor import ImageProcessor

processor = ImageProcessor()
results = processor.process_image(
    'input.jpg', 
    tasks=['yolo', 'face', 'ocr']
)

print(f"Found {len(results['yolo'])} objects")
print(f"Found {len(results['face'])} faces")
print(f"Extracted text: {results['ocr']}")
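
For logging or an API response, the per-task results can be condensed into class counts. This is a rough sketch that assumes every detector returns a list of dictionaries with a `class` key, which the source only shows for the YOLO detector. The `summarize` helper is a hypothetical name.

```python
import json
from collections import Counter


def summarize(results):
    """Count detections per class for each task in a process_image()-style result dict."""
    summary = {}
    for task, detections in results.items():
        counts = Counter(d.get('class', 'unknown') for d in detections)
        summary[task] = dict(counts)
    return summary


# Hand-written sample in the assumed result shape
results = {
    'yolo': [
        {'bbox': [0, 0, 10, 10], 'confidence': 0.9, 'class': 'person'},
        {'bbox': [5, 5, 20, 20], 'confidence': 0.8, 'class': 'person'},
        {'bbox': [1, 1, 4, 4], 'confidence': 0.6, 'class': 'dog'},
    ],
    'face': [],
}
report = json.dumps(summarize(results), sort_keys=True)
```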

Real-time Video

from src.processors.video_processor import VideoProcessor

processor = VideoProcessor()
processor.process_stream(
    source=0,  # Webcam
    tasks=['yolo'],
    display=True,
    save_output='output.mp4'
)
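
The template does not show how `process_stream` works internally. One plausible core for it is a read loop that hands every n-th frame to a callback, which keeps up with a live source when detection is slower than the frame rate. The sketch below assumes only that the capture object has a `read()` method returning `(ok, frame)`, as `cv2.VideoCapture` does. `process_frames` and `FakeCapture` are hypothetical names.

```python
def process_frames(capture, handle_frame, every_n=1, max_frames=None):
    """Read frames from a capture object and pass every n-th one to handle_frame.

    Returns the number of frames handled.
    """
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    handled = 0
    index = 0
    while max_frames is None or index < max_frames:
        ok, frame = capture.read()
        if not ok:  # stream ended or camera unavailable
            break
        if index % every_n == 0:
            handle_frame(frame)
            handled += 1
        index += 1
    return handled


class FakeCapture:
    """Test double that yields a fixed list of frames, then reports end of stream."""

    def __init__(self, frames):
        self._frames = iter(frames)

    def read(self):
        frame = next(self._frames, None)
        return (frame is not None), frame


seen = []
count = process_frames(FakeCapture(['f0', 'f1', 'f2', 'f3', 'f4']), seen.append, every_n=2)
```

With OpenCV installed, the same function could be given `cv2.VideoCapture(0)` and a callback that runs a detector on each frame. The capture should be released with `capture.release()` afterwards.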

Model Integration

Pre-trained Models

• YOLOv8 (Ultralytics)
• OpenCV DNN
• MediaPipe
• Tesseract OCR

Custom Models

• PyTorch integration
• TensorFlow support
• ONNX runtime
• Custom training

Cloud APIs

• Google Vision API
• AWS Rekognition
• Azure Computer Vision
• OpenAI Vision

Performance Optimization

⚡ GPU Acceleration

Leverage CUDA for faster inference on NVIDIA GPUs.
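
A common pattern is to pick the device at startup and fall back to the CPU when no GPU is present. The sketch below assumes PyTorch is the backend, as it is for Ultralytics YOLO. `pick_device` is a hypothetical helper.

```python
def pick_device():
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate; stay on the CPU
        return 'cpu'
    return 'cuda' if torch.cuda.is_available() else 'cpu'


device = pick_device()
# Assumed usage with the detector shown earlier (Ultralytics accepts a
# device argument at prediction time):
#   results = detector.model(image, device=device)
```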

📦 Batch Processing

Process multiple images simultaneously for better throughput.
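
Batching mostly comes down to grouping inputs into fixed-size chunks before each model call. A minimal sketch of that grouping step follows. `batched` is a hypothetical helper and the file names are made up for illustration.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of up to batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


paths = [f"img_{i}.jpg" for i in range(5)]
batches = list(batched(paths, 2))
```

Each batch could then be passed to the model in one call, since Ultralytics models accept a list of images as input. The best batch size depends on available memory and is left as a tuning choice.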

🔄 Model Optimization

Quantization and pruning for faster inference.
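
To make the quantization idea concrete, the sketch below maps floating-point weights to 8-bit integers with a single scale factor and then restores them. It illustrates the arithmetic only, on a made-up weight list. Real deployments would rely on the tooling of the chosen framework rather than hand-written code like this.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the integer range [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most half the scale factor, which is the precision given up in exchange for smaller weights and faster integer arithmetic.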

Build Your Vision App

Get started with our production-ready computer vision template
