
The expectation in the HPC community is that an interested vendor will provide their own BLAS/LAPACK implementation, well tuned for their hardware (MKL is one such implementation, along with a bunch of other stuff). These sorts of libraries aren't just tuned for an architecture; they might be tuned for a given generation or even particular SKUs.


I learned about this recently when trying to optimize an ML test architecture running on Azure. It turns out that having access to Ice Lake chips would allow optimizations that should decrease compute time, and therefore cost, by 20-30%.


Some AVX-512 stuff I guess?
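
If anyone wants to check what a given VM actually exposes, here is a rough sketch, assuming Linux and the usual /proc/cpuinfo "flags" line. The helper names (avx512_flags, read_cpuinfo) are made up for illustration, and it only tells you what the kernel reports, not whether your BLAS build actually uses those instructions.

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # Lines look like "flags\t\t: fpu vme ... avx512f avx512bw ..."
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(flags)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file; returns an empty string where it does not exist."""
    try:
        with open(path) as fh:
            return fh.read()
    except OSError:
        return ""


if __name__ == "__main__":
    found = avx512_flags(read_cpuinfo())
    print(found if found else "no AVX-512 flags reported")
```

An empty result can also just mean the hypervisor is masking the feature, so it is worth running on the actual instance type rather than trusting the spec sheet.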

AVX-512 had a rough rollout, but it seems like it is finally turning into something nice.



