niemandhier's favorites
31. Ask HN: Any insider takes on Yann LeCun's push against current architectures?
385 points by vessenes 10 months ago | 325 comments
32. Writing an LLM from scratch, part 8 – trainable self-attention (gilesthomas.com)
380 points by gpjt 10 months ago | 31 comments
33. DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling (github.com/deepseek-ai)
391 points by mfiguiere 11 months ago | 67 comments
34. Part two of Grant Sanderson's video with Terry Tao on the cosmic distance ladder (mathstodon.xyz)
385 points by ColinWright 11 months ago | 94 comments
35. A step-by-step guide to the “World Models” AI paper (applied-data.science)
261 points by datashrimp on April 17, 2018 | 37 comments
36. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL (arxiv.org)
1351 points by gradus_ad 12 months ago | 1056 comments
