This isn’t the full story though. I professionally analyzed GOPS/$ and GOPS/Watt for big multi-chip GPU and FPGA systems as a consultant from 2006 to 2011.
Xilinx routinely had more I/O (SerDes, 100/200/400G MACs on-die) and, more recently, at times more HBM bandwidth than contemporary GPUs. Also deterministic latency and perfectly acceptable DSP primitives.
The gap has always been the software.
Of course NVidia wasn’t such an obvious hit either: they flubbed the tablet market due to yield issues, and it really only went exponential in 2014. I invested heavily in NVidia from 2007 to 2014 because of their CUDA edge, but sold my $40K of stock at my cost basis.
I currently do DSP for radar, and from 2020 to 2023 I implemented the same system both on FPGA and in CUDA. I know for a fact that in 2022 the FFT performance of a $9,000 FPGA was equal to a $16,000 A100 that also needed a $10,000 host computer (the FPGA used fixed point instead of float, so not quite apples-to-apples, but definitely application-equivalent).
I think you are making the mistake of thinking that Xilinx software can fix the programmability of their hardware. It cannot. If you have to solve a place-and-route problem or do timing closure in your software, you have made a design mistake in your hardware. You cannot design hardware such that a single FFT kernel takes 2 hours to compile and then fails, when nvcc takes 30 seconds and always succeeds. You have taken your software into the domain of RTL design, and that is a consequence of the hardware design.

Xilinx could have made their Versal hardware a cache-coherent programmable parallel processor array where every core has access to global memory. It fundamentally isn't that. It's a bizarre dataflow-graph system that requires DMA engines, a place-and-route step, and a static fabric configuration. That's a fault of your hardware design!
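To make that contrast concrete, here is roughly what the whole GPU-side FFT path can look like; a minimal cuFFT sketch (the FFT length, batch count, and in-place layout are just illustrative, not the actual radar pipeline), and nvcc compiles it in seconds:

    // Minimal cuFFT sketch: batched 1D complex-to-complex FFT.
    // Sizes are illustrative; error handling trimmed for brevity.
    #include <cufft.h>
    #include <cuda_runtime.h>

    int main() {
        const int N = 4096;      // FFT length (e.g. samples per pulse)
        const int BATCH = 1024;  // number of pulses transformed at once

        cufftComplex* d_data;
        cudaMalloc(&d_data, sizeof(cufftComplex) * N * BATCH);
        // ... copy samples to the device with cudaMemcpy ...

        cufftHandle plan;
        cufftPlan1d(&plan, N, CUFFT_C2C, BATCH);           // plan a batched C2C FFT
        cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);  // in-place forward transform
        cudaDeviceSynchronize();

        cufftDestroy(plan);
        cudaFree(d_data);
        return 0;
    }

The equivalent FPGA flow has to push an FFT core through synthesis, place-and-route, and timing closure before you ever see first data.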