Business DeepMind’s new inference-time scaling technique improves planning accuracy in LLMs 2 weeks ago
Business Google DeepMind researchers introduce new benchmark to improve LLM factuality, reduce hallucinations 4 weeks ago
Business Self-invoking code benchmarks help you decide which LLMs to use for your programming tasks 4 weeks ago
Business A new benchmark for AI investment: Swift Ventures unveils system to separate talk from action 2 months ago
Business Sakana AI’s CycleQD outperforms traditional fine-tuning methods for multi-skill language models 2 months ago
Business Alibaba releases Qwen with Questions, an open reasoning model that beats o1-preview 2 months ago
Business Alibaba researchers unveil Marco-o1, an LLM with advanced reasoning capabilities 2 months ago
Business OpenScholar: The open-source A.I. that’s outperforming GPT-4o in scientific research 3 months ago