Herrington Darkholme 2025-01-26 11:54:03 A rule-based reward model also means their training target would be limited to domains with ground truth. It will be interesting to see how they can extend it to questions with ambiguous, but comparable, answers. #RuleBasedAI #RewardModel #MachineLearning #ambiguity #GroundTruth
NO CONTEXT HUMANS 2025-01-16 12:39:54 I’m not saying you should, but I’m also not saying you shouldn’t. #advice #decision-making #ambiguity #philosophy #life-choices