Tag

#MOE

7 English Kalera News articles tagged MOE — source-backed.

tools-ai · Tech Jun 6, 2026

NVIDIA launches Vera Rubin platform — processing trillion-parameter models at 400 tokens per second

NVIDIA's new Vera Rubin platform, combining NVL72 and Groq 3 LPX, enables running agentic workloads on massive MoE models without sacrificing latency.

Sources x.com

AI · tools-ai Jun 3, 2026

hf-mem tool updates memory estimation feature for MoE models

The hf-mem tool has added a detailed breakdown of memory consumption for Mixture-of-Experts (MoE) models, helping developers optimize their infrastructure strategies.

Sources x.com

AI Jun 2, 2026

JetBrains Launches Mellum2: A Powerful 12B Mixture-of-Experts Model for Coding

JetBrains has introduced Mellum2, a new-generation AI model with 12 billion parameters that uses the Mixture-of-Experts (MoE) architecture, is specifically optimized for software development tasks, and is deeply integrated into IDEs.

Sources huggingface.co

AI May 29, 2026

Liquid AI Launches LFM2.5-8B-A1B: A Highly Optimized MoE Model for Personal Devices 🚀

Liquid AI introduces LFM2.5-8B-A1B, an 8-billion-parameter language model with a hybrid MoE architecture, designed specifically for smartphones, laptops, and robots. With its 128K context window, the model is a major step forward in bringing high-performance AI directly to edge devices.

Sources x.com

AI May 28, 2026

poolside Launches Laguna: Specialized MoE Models for Coding

poolside has announced the Laguna M.1 and XS.2 model duo based on the Mixture-of-Experts architecture, optimized for coding tasks and long-horizon agents.

Sources arxiv.org

AI May 27, 2026

MiniMax-M2: A 230-Billion-Parameter AI that Only Activates 4% of Its Parameters

MiniMax has launched the M2 MoE model series with 229.9 billion parameters, optimized for agents and capable of debugging its own source code.

Sources arxiv.org

AI · tools-ai May 18, 2026

Optimizing CUDA Graph for Grouped GEMM with CLC Work Stealing

A new technique leveraging the CLC work-stealing mechanism enables CUDA Graph compatibility for grouped_gemm implementations, optimizing computational performance for complex AI models.

Sources x.com