ML/DL · 2026-02

Smart Subtitle Placement Engine

Deep Learning · Computer Vision

Overview

Developed a dynamic caption placement engine for videos. The engine combines YOLOv8 object detection with a saliency map to locate objects and visually important regions in each frame, then places captions in empty areas of the frame so they do not block the action.
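The placement step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the 6-zone grid is laid out as 3 columns by 2 rows, that detections arrive as pixel boxes `(x1, y1, x2, y2)` (for example, from a YOLOv8 result), and that the 15% IoU veto is applied between each zone and each detected box. The names `iou`, `make_zones`, and `pick_zone` are hypothetical, and the saliency map is left out for brevity.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def make_zones(width: float, height: float, cols: int = 3, rows: int = 2) -> List[Box]:
    """Split the frame into a rows x cols grid, row-major from the top-left.

    The 3 x 2 layout is an assumption; the source only states "6 zones".
    """
    zw, zh = width / cols, height / rows
    return [(c * zw, r * zh, (c + 1) * zw, (r + 1) * zh)
            for r in range(rows) for c in range(cols)]


def pick_zone(boxes: List[Box], width: int = 1920, height: int = 1080,
              veto: float = 0.15) -> Optional[int]:
    """Return the index of the cheapest non-vetoed zone, or None if all are vetoed."""
    zones = make_zones(width, height)
    # Visit lower, more central zones first so that ties go to the
    # conventional bottom-center subtitle position.
    order = sorted(
        range(len(zones)),
        key=lambda i: (-zones[i][1], abs((zones[i][0] + zones[i][2]) / 2 - width / 2)),
    )
    best, best_cost = None, float("inf")
    for idx in order:
        overlaps = [iou(zones[idx], b) for b in boxes]
        if max(overlaps, default=0.0) >= veto:
            continue  # veto: this zone overlaps some object too much
        cost = sum(overlaps)  # simple stand-in for a heatmap-based cost
        if cost < best_cost:
            best, best_cost = idx, cost
    return best
```

With no detections, `pick_zone([])` falls back to the bottom-center zone. Note that a small object inside a large zone yields a low IoU, so in this sketch it is the cost term, not the veto, that steers captions away from small objects.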

Key Results

Dynamically calculates optimal subtitle placement across a 6-zone spatial grid to prevent the occlusion of faces and critical action.

Enforces a strict <15% Intersection over Union (IoU) veto threshold and builds dynamic "cost heatmaps" across 1080p video frames.

Temporal aggregation algorithm achieved an 80% reduction in processing time.

Conversion engine parses standard SRT files and converts them to ASS format for seamless integration into video players.
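The SRT-to-ASS step might look roughly like the sketch below. It is an illustration under assumptions, not the project's converter: it emits only `Dialogue:` event lines (a full ASS file also needs `[Script Info]` and `[V4+ Styles]` sections), assumes a style named `Default` exists, and positions each cue with the ASS `\an` alignment override, where values 1 to 3 are the bottom row, 4 to 6 the middle, and 7 to 9 the top. The function names and the `alignment` parameter are hypothetical.

```python
import re
from typing import List

# SRT timestamps look like 00:01:02,250 (some files use '.' instead of ',').
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def srt_time_to_ass(ts: str) -> str:
    """Convert HH:MM:SS,mmm (SRT) to H:MM:SS.cc (ASS, centiseconds)."""
    m = TIME_RE.fullmatch(ts.strip())
    if m is None:
        raise ValueError(f"bad SRT timestamp: {ts!r}")
    h, mi, s, ms = (int(g) for g in m.groups())
    return f"{h}:{mi:02d}:{s:02d}.{ms // 10:02d}"


def srt_to_ass_events(srt_text: str, alignment: int = 2) -> List[str]:
    """Turn SRT cues into ASS Dialogue lines with an \\an alignment override."""
    tag = "{\\an%d}" % alignment
    events = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue  # skip malformed blocks without a timing line
        start, end = (part.strip() for part in lines[timing].split("-->"))
        end = end.split()[0]  # drop any trailing cue settings after the end time
        text = r"\N".join(lines[timing + 1:])  # \N is the ASS hard line break
        events.append(
            f"Dialogue: 0,{srt_time_to_ass(start)},{srt_time_to_ass(end)},"
            f"Default,,0,0,0,,{tag}{text}"
        )
    return events
```

In a pipeline like the one described here, the `alignment` value would presumably come from the zone chosen for each cue, for example `8` (top-center) when the bottom of the frame is occupied.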

Tech Stack

Python · YOLOv8