#MLP
向阳乔木
1 week ago
MLP (Minimum Lovable Product) is a better bar than MVP. The first version you ship can have few features, but the features it does include must delight people. Spotify's first version did very little, but it nailed one thing: click a song, and it plays almost instantly. For its time, that experience was stunning. MVP asks, "Does this product work?" MLP asks, "Will this product make people want to tell their friends?"
#MLP
#MVP
#product design
#user experience
𝙩𝙮≃𝙛{𝕩}^A𝕀²·ℙarad𝕚g𝕞
5 months ago
## MLPs can learn in-context

(Just saw a post about this from a Stanford PhD; my finger slipped and it's gone.)

One of the most under-rated empirical results of this year was the fact that MLPs can learn in-context [14]. This is surprising because the attention mechanism is usually thought to be the key to this (induction heads in MHSA, etc.).

I replicated these findings (the in-context regression task in particular) in small MLPs that had just one hidden layer and as few as 32 hidden units, and found that the weight matrices learn a fascinating, structured pattern that matches the nature of the task the authors outline in the paper. It revealed an interesting mechanism for how MLPs learn the in-context classification and regression tasks outlined in the paper, which amounted roughly to a very clever memorization pattern over the training data. I think the mech interp community would have a blast figuring this out, and I want to flag this empirical phenomenon for them.

On a purely architectural level, MLP-only architectures have the benefit of using only compute-intensive matmuls, which keep GPUs fed. But in practice, work like gMLP [15] shows that adding attention really is necessary to reach maximal performance in the end. How does one square those findings with the fact that MLPs can do simple in-context classification and regression tasks? What exactly fails in realistic settings that makes attention necessary? Or do the representations learned on these synthetic tasks simply not generalize to natural language (the way induction heads do)?
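Below is a minimal sketch of the kind of in-context regression setup described above, assuming the task format of [14] loosely: each training example flattens N (x, y) context pairs plus a query x into one input vector, and a one-hidden-layer MLP must predict the query's y. The constants `N_CTX`, `DIM`, `HIDDEN` and the helper `sample_batch` are my own illustrative choices, not the paper's code. Because the regression weights are resampled per sequence, memorizing a single weight vector cannot solve the task; the network has to use the context.

```python
import torch
import torch.nn as nn

N_CTX, DIM, HIDDEN = 8, 1, 32  # one hidden layer, 32 units, as in the text

mlp = nn.Sequential(
    nn.Linear(N_CTX * (DIM + 1) + DIM, HIDDEN),  # flattened context + query
    nn.ReLU(),
    nn.Linear(HIDDEN, 1),
)

def sample_batch(batch_size: int):
    """Draw fresh regression weights per sequence, so the mapping
    must be inferred in-context from the (x, y) pairs."""
    w = torch.randn(batch_size, DIM, 1)
    xs = torch.randn(batch_size, N_CTX, DIM)
    ys = xs @ w                                # context targets, (B, N_CTX, 1)
    x_q = torch.randn(batch_size, DIM)
    y_q = (x_q.unsqueeze(1) @ w).squeeze(-1)   # query target, (B, 1)
    inp = torch.cat([xs.flatten(1), ys.flatten(1), x_q], dim=1)
    return inp, y_q.squeeze(-1)

opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
for step in range(10_000):
    inp, target = sample_batch(64)
    loss = nn.functional.mse_loss(mlp(inp).squeeze(-1), target)
    opt.zero_grad(); loss.backward(); opt.step()
```

Inspecting the first-layer weight matrix of a net trained this way is the kind of exercise the mech interp suggestion above points at.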
#MLP
#in-context learning
#memorization
#attention mechanism
#generalization
𝙩𝙮≃𝙛{𝕩}^A𝕀²·ℙarad𝕚g𝕞
8 months ago
StepFun (阶跃星辰) has got it right: the Step 3 deployment architecture separates attention from the MLP, which is more efficient! What's the next breakthrough for LLMs? attention + MLP + ?
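To make the idea concrete, here is a hedged sketch (my own illustration, not StepFun's actual Step 3 serving code) of what keeping attention and the MLP as separate stages can look like: the two sub-blocks of a transformer layer become independent modules, so a deployment can batch, scale, or place them on different hardware pools separately. The intuition is that attention during decoding is KV-cache/bandwidth-bound while the MLP is matmul-bound.

```python
import torch
import torch.nn as nn

class AttnStage(nn.Module):
    """Attention half of a pre-norm transformer block."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h)
        return x + out

class MLPStage(nn.Module):
    """Feed-forward half; pure matmuls, no KV cache."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                nn.Linear(d_ff, d_model))

    def forward(self, x):
        return x + self.ff(self.norm(x))

# Because the stages are independent modules, a serving system could in
# principle put them on different GPU pools and batch them separately.
attn_stage = AttnStage(512, 8)
mlp_stage = MLPStage(512, 2048)

x = torch.randn(2, 16, 512)
y = mlp_stage(attn_stage(x))
```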
#阶跃星辰
#step3 deployment architecture
#attention
#MLP
#LLM