Silent Bird
Go
8 months ago
Haha, DeepSeek R1 is using a modified BoN-RL, replacing BoN with a group-mean advantage. And Kimi is taking the formulation of BoN itself. Amazing to see those models come to life.
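
For anyone wondering what the difference looks like, here is a rough sketch, not either lab's actual training code. It assumes a scalar reward per sampled answer; the function names and the reward values are made up for illustration. Best-of-N keeps only the top sample, while a group-mean advantage scores every sample against the average of its group.

```python
from statistics import mean


def group_mean_advantages(rewards):
    """Advantage of each sample = its reward minus the group's mean reward."""
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


def best_of_n_index(rewards):
    """Best-of-N keeps only the index of the highest-reward sample."""
    return max(range(len(rewards)), key=lambda i: rewards[i])


# Hypothetical rewards for 4 answers sampled from the same prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_mean_advantages(rewards)
best = best_of_n_index(rewards)
print(advantages, best)
```

With the group-mean version, every sample contributes a signal (positive if above average, negative if below), whereas BoN throws away everything except the winner.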