I really wish AMD would focus on its tooling. Intel VTune Amplifier is a fantastic multi-platform profiler that helps you understand how effectively your software is using the hardware (pipelining, micro-op efficiency, cache usage, etc.).
If AMD could come up with something similar, it'd make their offering a no-brainer.
I doubt tooling is what's slowing down AMD adoption. Compiler writers and authors of specialized applications care about that, but they are not the ones who pay AMD for its platform. The big businesses are, and most of them care about stability, predictability, and reliability, hence they stay with Intel. Those mostly interested in bang per buck, like Cloudflare, go with AMD, tooling or no tooling. You don't need tooling to check that AMD is far more efficient now; you just run some basic load tests and the result is easy to understand.
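To make "basic load tests" concrete, here's a rough sketch of what I mean. Everything in it is made up for illustration: `sample_workload` is a stand-in for whatever your service actually does, and the prices you feed to `throughput_per_dollar` are whatever you paid for each box. You'd run the same script on both machines and compare the numbers.

```python
import time


def measure_throughput(workload, min_seconds=0.2):
    """Call workload() repeatedly for at least min_seconds; return calls per second."""
    calls = 0
    start = time.perf_counter()
    while True:
        workload()
        calls += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return calls / elapsed


def throughput_per_dollar(throughput, price):
    """Bang per buck: measured throughput divided by what the machine cost."""
    if price <= 0:
        raise ValueError("price must be positive")
    return throughput / price


def sample_workload():
    # Hypothetical stand-in for a real request handler or batch job.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    ops_per_second = measure_throughput(sample_workload)
    print(f"{ops_per_second:.1f} calls/s")
```

It's crude (single-threaded, no warm-up, no variance estimate), but that's kind of the point: a buyer comparing platforms doesn't need a profiler for this.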
Unless you're doing low-level stuff, no developer is going to be aware that this even exists, much less be capable of using it. There are more 'programmers' than 'engineers'...
An extreme example of the tooling gap: MKL. Intel made MKL and had it pick its optimized instruction-set code paths based on CPUID in a non-conformant way that penalizes AMD; it's a fair bit faster than OpenBLAS on the (to date) dominant Intel hardware.
If you make CPUID lie to it and claim to be an Intel part, it'll use AVX2, and Threadripper parts will beat Xeon by a 20% margin. Without that trick it's 200%+ slower.
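A toy sketch of the difference, in case it isn't obvious why spoofing the vendor changes anything. This is not MKL's actual code; `CpuInfo` and both `pick_kernel_*` functions are hypothetical names I made up, and the only real-world facts in it are the CPUID vendor strings "GenuineIntel" and "AuthenticAMD".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuInfo:
    vendor: str          # CPUID vendor string
    features: frozenset  # instruction-set extensions the CPU reports


def pick_kernel_by_vendor(cpu):
    """Vendor-gated dispatch: the fast path is only taken on one vendor's parts."""
    if cpu.vendor == "GenuineIntel" and "avx2" in cpu.features:
        return "avx2"
    return "sse2"  # everything else falls back to the slow baseline path


def pick_kernel_by_feature(cpu):
    """Feature-based dispatch: any CPU that reports AVX2 gets the AVX2 path."""
    if "avx2" in cpu.features:
        return "avx2"
    return "sse2"


# An AMD part that supports AVX2, and the same part with a spoofed vendor string.
amd = CpuInfo("AuthenticAMD", frozenset({"sse2", "avx2"}))
spoofed = CpuInfo("GenuineIntel", amd.features)

if __name__ == "__main__":
    print(pick_kernel_by_vendor(amd))      # baseline path despite AVX2 support
    print(pick_kernel_by_vendor(spoofed))  # fast path once the vendor string lies
    print(pick_kernel_by_feature(amd))     # fast path with no spoofing needed
```

Same silicon in all three cases; the only thing that changes is which question the library asks.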
Meanwhile, MKL has been a fair bit faster than OpenBLAS, etc. on Xeon ("pretend to be Intel" MKL and OpenBLAS are about the same speed on AMD), so all kinds of software are linked against Intel MKL.
When you have core libraries tuned for Intel, either because Intel makes them themselves or because the tooling for performance tuning on AMD hasn't been present, Intel has a software tuning advantage that helps hide AMD's current silicon performance advantage.
This just reminded me of an older article on NumPy performance on Intel vs AMD [1].
I’m not posting this to bash AMD. I'm just trying to make the point that the software side can matter a lot, and from the end customer's point of view it can make a huge difference.
Sure, if AMD hardware is much better than Intel's, an unoptimized libc might run better there. That’s not the point.
The point is that a libc optimized for AMD could run much better than what it does today.
Improving the performance of low level libraries for Intel hardware is trivial because there are a lot of great tools that help you with that.
If I had to improve the performance of some library on AMD hardware, I would get some Intel hardware and tools, improve the performance for Intel, and then “hope” that this improves the performance for AMD as well.
I think (without too much knowledge in this field) that AMD has had to play catch-up in silicon for so long that, now that the catch-up is complete, it will take some time to recognise organisationally that more focus on non-silicon is overdue. Silicon lives in an ecosystem, but without competitive products there's little use in investing in that ecosystem. If you suddenly get there, you need to change your game quite drastically.
But wasn’t that also the case the last time they “disrupted” Intel, with amd64 and dual-core CPUs? They caught up, but never really invested in the tooling. And then they lost market share again.
Last time there was also the whole thing where the FTC sued Intel for anticompetitive behavior, with Intel locking OEMs out of buying from both Intel and AMD. Hopefully without that the market will be more diverse going forward.
I remember back in the day (circa 2000) I was all about AMD, and all my custom-built machines used AMD chips because their floating-point calculations were that much faster than Intel's at the time.
Intel really wiped the floor with them over the next bunch of years... It will be interesting should AMD have their comeback now :)
Oh how the tables turn!
I personally have not used it, but AMD does have a cross-platform profiler available: AMD uProf. I thought I would post in case this was a matter of not knowing it exists rather than having used it and found it severely lacking vs VTune.
> https://software.intel.com/en-us/vtune