Each SM can typically schedule 4 warps at a time, so it’s more like 400 “cores”, each with 1024-bit SIMD instructions. If you look at it this way, they clearly outclass CPU architectures.
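As a back-of-the-envelope check of those figures, here is a minimal sketch. It assumes a hypothetical GPU with 100 SMs (the SM count is an illustrative assumption, not a number given in the comment), 4 warp schedulers per SM, and warps of 32 lanes at 32 bits each:

```python
# Rough arithmetic only. NUM_SMS is an assumed, illustrative value;
# real parts differ by model.
NUM_SMS = 100
WARP_SCHEDULERS_PER_SM = 4   # "each SM can typically schedule 4 warps"
LANES_PER_WARP = 32          # threads per warp
BITS_PER_LANE = 32           # one 32-bit value per lane

# One warp instruction acts on 32 lanes x 32 bits, like a wide SIMD op.
simd_width_bits = LANES_PER_WARP * BITS_PER_LANE

# Counting one "core" per warp scheduler instead of one per lane.
warp_level_cores = NUM_SMS * WARP_SCHEDULERS_PER_SM

# Counting every lane as a "core" gives the much larger headline number.
lane_level_cores = warp_level_cores * LANES_PER_WARP

print(simd_width_bits)    # 1024
print(warp_level_cores)   # 400
print(lane_level_cores)   # 12800
```

Under these assumptions, the "thousands of cores" number is just the lane count, while the warp-level count is the one comparable to CPU cores with wide SIMD units.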
This level corresponds to SMT in CPUs, I gather. So you can argue your 192-core EPYC server CPU has 384 "vCPUs", since execution resources per core are overprovisioned and, when execution blocks waiting for e.g. memory, another thread can run in its place. As Intel and AMD only do 2-way SMT, this doesn't make the numbers go up as much.
The single GPU warp is both beefier and wimpier than the SMT thread: it's in-order and barely superscalar, whereas on the CPU side it's a wide superscalar, big-window OoO brainiac. But on the other hand the SM has wider SIMD execution resources, and there's enough throughput for several warps without blocking.
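The same counting on the CPU side can be sketched as follows. The 192 cores and 2-way SMT come from the comment above; the 16 lanes per vector is simply 512 bits divided by 32-bit elements, and the variable names are made up for illustration:

```python
# Rough arithmetic only, mirroring the GPU "core" counting on a CPU.
PHYSICAL_CORES = 192      # the EPYC example from the comment
SMT_WAYS = 2              # Intel and AMD do 2-way SMT
AVX512_BITS = 512
BITS_PER_ELEMENT = 32     # 32-bit elements, to match a GPU lane

# Hardware threads ("vCPUs") exposed to the OS.
hardware_threads = PHYSICAL_CORES * SMT_WAYS

# 32-bit lanes in one AVX-512 vector register.
lanes_per_vector = AVX512_BITS // BITS_PER_ELEMENT

print(hardware_threads)   # 384
print(lanes_per_vector)   # 16
```

This only counts threads and lanes. It says nothing about how many vector instructions each core can actually issue per cycle.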
A major difference is how the execution resources are tuned to the expected workloads. CPUs run application code that likes big low-latency caches and high single-thread performance on branchy integer code, but it doesn't pay to put in execution resources for maximizing AVX-512 FP math instructions per cycle or to keep increasing memory bandwidth indefinitely.
Yep. But from the point of view of running CPU-style code on GPUs (e.g. the Rust std lib) and how the "thousands of cores" fiction relates, those are less relevant.
And for GenAI matrix math there are of course all the non-GPU acceleration features in various shapes and forms, like the on-chip Edge TPU on Google phones, or the Intel and Apple features that are both called AMX.
You think the appropriate punishment for interfering with a simple administrative act is gunshots to the back of the head? Are you even reading what you're saying???
Police have the right to defend themselves if they fear for their lives. It was a terrible accident indeed, one that could have been avoided if he had not physically interfered or had a gun on him.
Which, again, is obviously the wrong answer, as that same argument could be applied to Windows and would fail immediately: Windows 95 knows nothing of my new hardware, and yet, by and large, works fine. There is something unique about macOS and Apple that causes their hardware to actively not bother to maintain any form of backwards compatibility with the software that runs on it (which is not unexpected from Apple, but still), and that must be present in the answer to this question (which is done really well by mschuster91).