Hacker News
Sharding Large Models with Tensor Parallelism (mishalaskin.com)
2 points by tim_sw on April 20, 2023
Training Deep Networks with Data Parallelism in Jax (mishalaskin.com)
122 points by sebg on Feb 24, 2023 | 37 comments
