AI Models | 2023-12

Product Feature Extraction from E-Commerce Images

LLM, Hugging Face, AI model fine-tuning

Overview

Extracts product features such as weight, height, and other details from product page images (e.g., Amazon listings) using three pipelines: PaddleOCR with regex, MiniCPM VQA with regex, and Donut VQA with regex.
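All three pipelines end in a regex stage that turns raw text (OCR output or a VQA answer) into structured fields. A minimal sketch of that stage follows. The patterns, unit lists, and the function name extract_features are assumptions made for illustration and are not taken from the project.

```python
import re

# Hypothetical patterns: the field names and accepted units are assumptions,
# not the project's actual rules.
FEATURE_PATTERNS = {
    "weight": re.compile(
        r"(?:item\s+)?weight\s*[:\-]?\s*(\d+(?:\.\d+)?)\s*(kg|g|lbs?|oz)\b",
        re.IGNORECASE,
    ),
    "height": re.compile(
        r"height\s*[:\-]?\s*(\d+(?:\.\d+)?)\s*(cm|mm|m|in(?:ch(?:es)?)?)\b",
        re.IGNORECASE,
    ),
}


def extract_features(text):
    """Return {feature: (value, unit)} for each pattern that matches the text."""
    features = {}
    for name, pattern in FEATURE_PATTERNS.items():
        match = pattern.search(text)
        if match:
            features[name] = (float(match.group(1)), match.group(2).lower())
    return features


if __name__ == "__main__":
    # Stand-in for text produced by an OCR or VQA model.
    sample = "Item Weight: 1.5 kg\nProduct Height - 30 cm\nColour: Black"
    print(extract_features(sample))
```

Keeping the regex stage separate from the model makes it possible to swap PaddleOCR, MiniCPM, or Donut in front of it without changing the extraction rules.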

Key Results

Evaluated how effectively a multimodal LLM (MLLM) pipeline extracts entity information from product page images.
Trained the model on combined text and image input, improving performance by 12% over image-only input.
Compared the pipeline against OCR-, Transformer-, and regex-based implementations and observed a 10% improvement in generation time per image.
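The page does not describe how generation time per image was measured. One way to compare pipelines on that metric is sketched below, assuming each pipeline is a callable that takes a single image. The function names mean_seconds_per_image and relative_improvement are hypothetical.

```python
import time
from statistics import mean


def mean_seconds_per_image(pipeline, images):
    """Average wall-clock time of pipeline(image) over a list of images."""
    durations = []
    for image in images:
        start = time.perf_counter()
        pipeline(image)
        durations.append(time.perf_counter() - start)
    return mean(durations)


def relative_improvement(baseline_seconds, candidate_seconds):
    """Fraction by which the candidate is faster than the baseline."""
    return (baseline_seconds - candidate_seconds) / baseline_seconds


if __name__ == "__main__":
    # Placeholder pipelines; real ones would wrap the OCR or VQA models.
    def baseline(image):
        return sum(range(20000))

    def candidate(image):
        return sum(range(10000))

    images = list(range(50))
    base = mean_seconds_per_image(baseline, images)
    cand = mean_seconds_per_image(candidate, images)
    print(f"relative improvement: {relative_improvement(base, cand):.1%}")
```

Under this definition, a 10% improvement means the candidate's mean time per image is 0.9 times the baseline's.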

Tech Stack

Python, PaddleOCR, MiniCPM, Donut, Hugging Face, PyTorch