AI Models
2023-12
Product Feature Extraction from E-Commerce Images
LLM, Hugging Face, AI model fine-tuning

Overview
Extracts product features such as weight, height, and other details from product page images (e.g., Amazon listings) using three different pipelines, each paired with regex: PaddleOCR, MiniCPM VQA, and Donut VQA.
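The regex step that all three pipelines share could look roughly like the sketch below. It assumes the OCR or VQA stage has already produced plain text. The patterns, the unit lists, and the names `FEATURE_PATTERNS` and `extract_features` are hypothetical illustrations, not the project's actual code.

```python
import re

# Hypothetical patterns: attribute name, optional separator, number, unit.
# The unit lists are illustrative and would need extending for real listings.
FEATURE_PATTERNS = {
    "weight": re.compile(
        r"weight\s*[:\-]?\s*(\d+(?:\.\d+)?)\s*(kg|g|lbs?|oz)\b",
        re.IGNORECASE,
    ),
    "height": re.compile(
        r"height\s*[:\-]?\s*(\d+(?:\.\d+)?)\s*(cm|mm|m|in(?:ches)?)\b",
        re.IGNORECASE,
    ),
}


def extract_features(text):
    """Return {feature: (value, unit)} for each pattern found in the text."""
    features = {}
    for name, pattern in FEATURE_PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Keep the first occurrence; normalise the unit to lowercase.
            features[name] = (float(match.group(1)), match.group(2).lower())
    return features


if __name__ == "__main__":
    # Made-up text standing in for OCR or VQA output.
    sample = "Item Weight: 1.5 kg Product Height - 30 cm"
    print(extract_features(sample))
```

Keeping the patterns in a dictionary means the same post-processing can be applied to the text from any of the three pipelines, so differences in results come from the text-producing stage rather than from the regex.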
Key Results
Evaluated how effectively an MLLM pipeline extracts entity information from product page images.
Trained the model on combined text and image input, improving performance by 12% over image-only input.
Compared the pipeline against OCR-, Transformer-, and regex-based implementations and observed a 10% improvement in generation time per image.
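A per-image timing comparison like the one reported above could be measured along these lines. This is a minimal sketch: `mean_seconds_per_image` and `relative_improvement` are hypothetical helpers, the pipeline is any callable that takes one image, and the numbers in the usage example are made up and are not the project's measurements.

```python
import time


def mean_seconds_per_image(pipeline, images):
    """Average wall-clock time the pipeline takes per image."""
    if not images:
        raise ValueError("images must not be empty")
    start = time.perf_counter()
    for image in images:
        pipeline(image)
    return (time.perf_counter() - start) / len(images)


def relative_improvement(baseline_seconds, new_seconds):
    """Percentage reduction in time relative to the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (baseline_seconds - new_seconds) / baseline_seconds * 100.0


if __name__ == "__main__":
    # Illustrative values only: 2.0 s per image baseline vs 1.8 s per image.
    print(f"{relative_improvement(2.0, 1.8):.1f}% faster per image")
```

Timing every pipeline on the same set of images keeps the comparison fair, since the time per image can vary with image size and text density.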
Tech Stack
Python, PaddleOCR, MiniCPM, Donut, Hugging Face, PyTorch