#RuleBasedAI
Herrington Darkholme
5 months ago
A rule-based reward model also means the training target is limited to domains with ground truth. It will be interesting to see how they extend it to questions with ambiguous, but comparable, answers.
#RuleBasedAI
#RewardModel
#MachineLearning
#ambiguity
#GroundTruth