Silent Bird
Go
1 year ago
Haha, DeepSeek R1 is using a modified BoN-RL, replacing BoN with a group-mean advantage. And Kimi is taking the formulation of BoN itself. Amazing to see those models come to life.
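To make the contrast concrete, here is a rough sketch of the two ways of turning the rewards of N sampled answers into a training signal. It assumes plain scalar rewards for one prompt; the function names are hypothetical, and neither function is taken from the DeepSeek or Kimi code (real implementations may also normalize by the group's standard deviation).

```python
from statistics import mean


def best_of_n_weights(rewards):
    """Best-of-N: only the highest-reward sample receives a training signal."""
    best = max(range(len(rewards)), key=lambda i: rewards[i])
    return [1.0 if i == best else 0.0 for i in range(len(rewards))]


def group_mean_advantages(rewards):
    """Group-mean baseline: each sample is scored relative to the group average."""
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


# Hypothetical rewards for four sampled answers to the same prompt.
rewards = [0.0, 1.0, 0.0, 3.0]
print(best_of_n_weights(rewards))      # [0.0, 0.0, 0.0, 1.0]
print(group_mean_advantages(rewards))  # [-1.0, 0.0, -1.0, 2.0]
```

With the group mean, below-average samples get a negative signal instead of being ignored, which is the main difference from picking only the best one.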