☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 days ago · The Attention Mechanism Born for Cost Optimization · oilbeater.com · 0 comments · 5 upvotes, 0 downvotes
thickertoofan@lemm.ee · English · 4 days ago · dcdaML - devanagari character detection dataset training framework · github.com · 5 comments · 5 upvotes, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 10 days ago · Neural Graffiti is an experiment in adding a "Spray Layer" to a transformer model, which injects a memory trace into the final stages of inference without finetuning or retraining · github.com · 0 comments · 5 upvotes, 0 downvotes
fubarx@lemmy.world · 17 days ago · Breaking GPT-5 News! · 0 comments · 2 upvotes, 2 downvotes
4Robato@lemmy.world · English · 17 days ago (edited) · I want to open source a dataset but I'm not sure what license to use · 4 comments · 4 upvotes, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 20 days ago · Why do LLMs make stuff up? New research peers under the hood. · arstechnica.com · 0 comments · 1 upvote, 0 downvotes
oba@lemmy.world · English · 29 days ago · MLOps tips I gathered recently · www.readyforagents.com · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 months ago · DeepSeek open source DeepEP – library for MoE training and Inference · github.com · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 months ago · Towards Monosemanticity: Decomposing Language Models With Dictionary Learning · transformer-circuits.pub · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 months ago · Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet · transformer-circuits.pub · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning · arxiv.org · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 4 months ago · Neurosymbolic AI -- Why, What, and How · arxiv.org · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 4 months ago · Classical Sorting Algorithms as a Model of Morphogenesis: self-sorting arrays reveal unexpected competencies in a minimal model of basal intelligence · arxiv.org · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 4 months ago · Genie 2: A large-scale foundation world model · deepmind.google · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 5 months ago · A good primer on what to expect running local LLMs · nullprogram.com · 0 comments · 1 upvote, 0 downvotes
Shamar@feddit.it · English · 6 months ago (edited) · A community statement supporting the Open Source Definition (OSD) · osd.fyi · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 7 months ago · How ‘Embeddings’ Encode What Words Mean · www.quantamagazine.org · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 7 months ago · New AI model “learns” how to simulate Super Mario Bros. from video footage · arstechnica.com · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 7 months ago · Reflection 70B holds its own against even the top closed-source models (Claude 3.5 Sonnet, GPT-4o) · huggingface.co · 0 comments · 1 upvote, 0 downvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 8 months ago · It’s Not Intelligent If It Always Halts: A Critical Perspective on Current Approaches to AGI · www.lifeiscomputation.com · 0 comments · 1 upvote, 0 downvotes