Qwen1.5-MoE-A2.7B: A Small MoE Model with only 2.7B Activated Parameters yet Matching the Performance of State-of-the-Art 7B models
www.marktechpost.com
Akisamb@programming.dev to Machine Learning@programming.dev · 7 months ago