Tag

#Multimodal

9 English Kalera News articles tagged Multimodal — source-backed.

AI Jun 8, 2026

Google Launches Gemini Omni: A Major Step Toward AI 'Creating Anything from Anything'

Gemini Omni is Google's latest multimodal AI model, boasting superior capabilities in understanding and generating video, image, and audio content.

Sources x.com

AI · tools-ai Jun 6, 2026

Kimi Moonshot launches Kimi K2.6 — a multimodal model supporting 300 sub-agents

Kimi Moonshot introduces Kimi K2.6, a multimodal AI agentic model capable of scaling up to 300 sub-agents via Agent Swarm, now available on Together AI.

Sources x.com

AI Jun 6, 2026

Google leaks Gemini Omni — a super-powered video model coming soon at I/O

Gemini Omni is expected to be Google's most advanced video model, capable of professional video editing and a deeper understanding of the visual world.

Sources x.com

AI · tools-ai Jun 5, 2026

Experience Gemini Omni Flash — the next-generation multimodal model on YouTube and Gemini

Users can now try Gemini Omni Flash, the first model in the multimodal Omni family, across Google's platforms.

Sources x.com

AI May 29, 2026

NVIDIA Nemotron-3 Nano Omni Is Now Available on Microsoft Azure Foundry

NVIDIA Nemotron-3 Nano Omni, an open-source multimodal AI model (unifying video, audio, image, and text), is now available for direct deployment on Microsoft Azure Foundry via Hugging Face.

Sources x.com

AI May 27, 2026

NVIDIA and DeepSeek Race in Large-Context Model Performance for AI Agents

The releases of NVIDIA Nemotron 3 Nano Omni and DeepSeek-V4 mark a significant milestone in ultra-long context processing for multimodal AI agent tasks.

Sources huggingface.co huggingface.co

AI May 27, 2026

Apple Proposes TC-JEPA: Using Text to Help AI Understand Images More Accurately

Apple introduces TC-JEPA, a new self-supervised method that uses text captions to guide training and reduce noise when AI models learn to recognize images.

Sources machinelearning.apple.com

Tech May 23, 2026

ChatGPT now automatically fills out forms using photos and voice 📝

OpenAI has updated ChatGPT with a new feature that automatically fills out various forms using uploaded images combined with text or voice instructions, streamlining paperwork processing.

Sources x.com x.com

AI May 22, 2026

Hugging Face Hub v1.16.0 Released: Powerful Support for Multimodal AI

The latest version of huggingface_hub officially integrates Together Compute as a new inference provider, supporting five multimodal task types ranging from text-to-speech (TTS) to text-to-video.

Sources x.com