Learning to Reason in 13 Parameters (arxiv.org) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 5 days ago · 5 upvotes, 0 downvotes · 0 comments
Andrej Karpathy — “We’re summoning ghosts, not building animals” (www.youtube.com) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 5 months ago · 2 upvotes, 0 downvotes · 0 comments
A dialogue on Machine Learning 🤓🤓🤓 (codeberg.org) · TheracAriane@thebrainbin.org · 5 months ago · 3 upvotes, 3 downvotes · 0 comments
Breathing Life Into Sketches Using Text-to-Video Priors (livesketch.github.io) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 7 months ago · 2 upvotes, 0 downvotes · 0 comments
Jan v1: 4B open model for web search with 91% SimpleQA, slightly outperforms Perplexity Pro (arxiv.org) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 8 months ago · 6 upvotes, 0 downvotes · 0 comments
OpenCV 4.12.0 Is Now Available (opencv.org) · Phil Nelson@lemmy.ml · English · 9 months ago · 5 upvotes, 0 downvotes · 0 comments
Anthem Demo - Napster plus Distributed Machine Learning (makertube.net) · blue_berry@lemmy.world · English · 9 months ago · 1 upvote, 2 downvotes · 0 comments
Affiliations of the ICML 2025 papers (image, lemmy.ml) · A🔻atar of 🔻engeance@lemmy.ml · English · 9 months ago · 2 upvotes, 0 downvotes · 2 comments
The Bitter Lesson is coming for Tokenization (lucalp.dev) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 10 months ago · 2 upvotes, 0 downvotes · 0 comments
The Attention Mechanism Born for Cost Optimization (oilbeater.com) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago · 4 upvotes, 0 downvotes · 0 comments
dcdaML - devanagari character detection dataset training framework (github.com) · thickertoofan@lemm.ee · English · 1 year ago · 5 upvotes, 0 downvotes · 5 comments
Neural Graffiti is an experiment in adding a "Spray Layer" to a transformer model, which injects a memory trace into the final stages of inference without finetuning or retraining (github.com) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago · 5 upvotes, 0 downvotes · 0 comments
Breaking GPT-5 News! (text post) · fubarx@lemmy.world · 1 year ago · 6 upvotes, 10 downvotes · 2 comments
I want to open source a dataset but I'm not sure what license to use (text post) · 4Robato@lemmy.world · English · 1 year ago (edited) · 3 upvotes, 0 downvotes · 6 comments
Why do LLMs make stuff up? New research peers under the hood. (arstechnica.com) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago · 6 upvotes, 1 downvote · 0 comments
MLOps tips I gathered recently (www.readyforagents.com) · oba@lemmy.world · English · 1 year ago · 8 upvotes, 0 downvotes · 0 comments
DeepSeek open source DeepEP – library for MoE training and Inference (github.com) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago · 2 upvotes, 0 downvotes · 0 comments
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning (transformer-circuits.pub) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago · 2 upvotes, 0 downvotes · 0 comments
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (transformer-circuits.pub) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago · 1 upvote, 0 downvotes · 0 comments
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arxiv.org) · ☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 1 year ago · 1 upvote, 0 downvotes · 0 comments