Marlin-2B: An ultra-compact Vision-Language Model for video information extraction
Marlin-2B is an open-source VLM with only 2 billion parameters that offers powerful video analysis capabilities, competing directly with larger models like Gemini-2.5-flash.
The AI community has just welcomed Marlin-2B, an ultra-compact, open-source Vision-Language Model (VLM) specially optimized for extracting structured information from video data.
Key Developments
Marlin-2B has been fine-tuned to answer two core questions that developers typically need answered when processing video: