Go 2025-01-22 13:33:33
Haha, DeepSeek R1 is using a modified BoN-RL that replaces BoN with a group-mean advantage. And Kimi is taking the formulation of BoN itself. Amazing to see those models come to life.
#article information extraction #artificial intelligence #deep learning #machine learning #BoN-RL #Group mean #model #technical discussion
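
For anyone wondering what the difference looks like in practice, here is a rough sketch, not either lab's actual code. It assumes you sample a group of responses for one prompt and score each with a scalar reward. The function names `group_mean_advantages` and `best_of_n_index` and the example rewards are made up for illustration.

```python
from statistics import mean


def group_mean_advantages(rewards):
    """Score each sampled response relative to the mean reward of its group.

    Assumption: `rewards` holds one scalar reward per response,
    all sampled for the same prompt.
    """
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


def best_of_n_index(rewards):
    """Plain best-of-N: return the index of the highest-reward response."""
    return max(range(len(rewards)), key=lambda i: rewards[i])


# Hypothetical rewards for four sampled responses to one prompt.
rewards = [1.0, 0.0, 0.0, 1.0]

advantages = group_mean_advantages(rewards)
best = best_of_n_index(rewards)
```

With the group-mean baseline every sample in the group gets a signal (positive if above average, negative if below), whereas plain best-of-N only singles out one winner.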