"Parameter Golf" competition attracts over 2,000 submissions on AI optimization
The Parameter Golf event has concluded, drawing more than 2,000 submissions with creative approaches to AI model optimization, including quantization, TTT LoRA, and SSMs.
The new AutoTTS framework enables large language models to automatically search for optimal inference strategies, cutting token consumption by up to 69.5% while enhancing problem-solving performance.
The llama.cpp b9235 release introduces Speculative N-gram Tuning, which significantly speeds up decoding when running large models such as Qwen3.6 27B.
The Permutation-Invariant Bayesian Optimization (PIBO) algorithm improves wind turbine placement and cuts computation time in half using Optimal Transport theory.
UniScale is an online framework that unifies model routing and test-time scaling into a single optimization space, achieving a better balance between quality and cost.
ECC provides a collection of skills, commands, and hooks that help optimize token usage, enhance security, and boost productivity when using Claude Code.
RAG (Retrieval-Augmented Generation) improves the accuracy of large language models by enabling direct retrieval from trusted external data sources.
Transformer Reparameterizations Lab has released new reparameterization techniques to optimize training and inference performance for the Transformer architecture.
A new technique uses the CLC work-stealing mechanism to make grouped_gemm implementations compatible with CUDA Graphs, improving computational performance for complex AI models.