Haha, DeepSeek R1 is using a modified BoN-RL, replacing BoN with a group-mean advantage. And Kimi is taking the formulation of BoN itself. Amazing to see those models come to life.
Olivert
9 hours ago
Learn more about AI technology. Personal notes on Andrew Ng's machine learning course.
howie.serious
1 day ago
Marveling once again: GPT-4.5 is so good. But sooner or later it will be taken offline. Seize the chance and use it every day.