I find myself deeply skeptical of much of the open source FPGA movement.
Most of those efforts stem from the underlying notion that “…this is all a problem with the tooling!”
This approaches the problem space from a very software-centric lens. Fundamentally, gateware design isn’t software. It’s wiring together logic gates if you really boil it down to fundamentals. Treating it as a tooling problem is to misconstrue how much you know. Plainly: no open source toolchain is going to have insight into Xilinx’s internal fanout or propagation delay specs. You’re reliant on Xilinx to encode these into their tools for you.
As a result: “Vendor tools are God in FPGA land. You don’t go against God.” (Quoted from the staff FPGA engineer on my team.)
I've found there's a fundamentally different attitude among FPGA engineers compared to software engineers for better or worse.
I think the "vendor tools are god" attitude is overall negative. The vendor tools ARE leagues better than the open source alternatives, but it doesn't mean the open source stuff is just for toy projects.
For example, Vivado is a monolithic pain in the ass. If I want to use an FPGA as electrical super glue for a project, I don't want to be downloading 150GB onto my machine. I think the open source tooling is particularly useful for smaller projects, and the general attitude towards the tooling is really frustrating.
The research that has gone into reverse engineering the bitstream structure and the overall structure of FPGA fabric is extremely impressive. Vendor IPs are often bloated to get you to buy the bigger chip, whereas open source IPs take up much less of the fabric.
There's give and take but from my perspective a big problem is the tooling.
> The vendor tools ARE leagues better than the open source alternatives, but it doesn't mean the open source stuff is just for toy projects
It’s not a question of developer experience. We heartily agree that using Vivado sucks. My point is that there’s no way around using Vivado (or the Altera equivalent) if you want to use the most useful parts of modern FPGAs.
It’s simply not possible to do things like access the GTH transceivers or custom MAC blocks using open source FPGA tools. These are table stakes capabilities to make these chips useful. You can only use vendor tools to access them.
I suspect - but cannot prove - that Xilinx has agreements with IP vendors about how much they are allowed to reveal about the guts of the different devices they integrate into Zynq family die.
I also suspect that Xilinx has considerable intellectual property investment in the underlying architecture of their PL. Making 600MHz programmable fabric is no mean technical feat. Open sourcing their tools is probably something they judge to risk revealing those technical advantages.
100% yes. The actual format of the binary that the device interprets is treated as a trade secret, and the vendor tooling is the only documented way to target code at those parts.
This has various implications, such as smaller FPGAs being literally tooling-locked versions of larger ones, important features like partial reconfiguration being supported by the hardware but a huge pain to use from the logic, or vendor tools not supporting some language construct and you're stuck with that tool. (Admittedly they are far and away better than FOSS tooling for language support.)
You can tell the veteran status of FPGA devs by the quality of their rants about the tools. The big FPGA companies have no quality metrics for developer experience. You should be able to make an LED blink within a minute of powering up a board and not after a day of downloading and installing stuff. It used to be possible to quickly start with Vivado on AWS cloud, and I was using that workflow for years, although recent licensing changes presented a speed-bump there, and I ended up going with a local install for my recent project.
Even once you get that LED blinking, changing a clock speed for that blinking LED should be near instantaneous but more likely requires rebuilding the whole project. Fundamentally the vendors don’t view their chips as something designed to run programs, and this legacy hardware design mentality plagues their whole business.
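To make the rebuild point concrete, here is a minimal sketch of the arithmetic behind a counter-based blinker. It assumes the common pattern where the LED toggles when a free-running counter reaches a terminal value; `blink_divider` and its parameters are hypothetical names for illustration, not part of any vendor tool.

```python
def blink_divider(clk_hz: int, blink_hz: float) -> int:
    """Terminal count for a counter that toggles an LED.

    The LED completes `blink_hz` full on/off cycles per second, so it
    must toggle every clk_hz / (2 * blink_hz) clock cycles.
    """
    if clk_hz <= 0 or blink_hz <= 0:
        raise ValueError("frequencies must be positive")
    half_period_cycles = round(clk_hz / (2 * blink_hz))
    if half_period_cycles < 1:
        raise ValueError("blink rate too high for this clock")
    # A counter running 0..N toggles every N + 1 cycles.
    return half_period_cycles - 1


if __name__ == "__main__":
    # Assumed 100 MHz board clock, 1 Hz blink.
    print(blink_divider(100_000_000, 1))
```

If that terminal count is a constant in the RTL, it ends up baked into the synthesized design, which is one reason a seemingly trivial rate change can mean a full rebuild unless the value is exposed as a runtime register.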
Something important here: Xilinx could and should have been where NVidia is today. They were certainly aware of the competitive accelerated computing market as early as 2005, and fundamentally failed to make a software architecture competitive with CUDA.
Before CUDA even existed I interned at Xilinx working on the beginnings of their HLS C compiler. My (decade older) fraternity brother led the C compiler team at Altera. We almost went into making a spreadsheet compiler for FPGA (my masters thesis) together but 2007 ended up being a terrible year to sell accelerated computing to Wall Street.
Xilinx never had hardware that was even remotely capable of competing with Nvidia. So I don't think it's solely a software problem - they literally have never developed hardware that is programmable or general purpose enough. Even their Versal hardware today is hideously difficult to program and has a very FPGA-centric workflow.
This isn’t the full story though. I (professionally, as a consultant) analyzed GOPs/$ and GOPs/Watt for big multi-chip GPU or FPGA systems from 2006-2011.
Xilinx routinely had more I/O (SerDes, 100/200/400G MACs on-die) and at times now more HBM bandwidth than contemporary GPUs. Also deterministic latency and perfectly acceptable DSP primitives.
The gap has always been the software.
Of course NVidia wasn’t such an obvious hit either. They flubbed the tablet market due to yield issues, and ultimately it really only went exponential in 2014. I invested heavily in NVidia 2007-2014 because of the CUDA edge they had, but sold my $40K of stock at my cost-basis.
I currently do DSP for radar, and implemented the same system on FPGA and in CUDA 2020-2023. I know as a fact that in 2022 the FFT performance of a $9000 FPGA was equal to that of a $16000 A100 that also needed a $10000 computer (the types on the FPGA were fixed point instead of float, so not apples-to-apples, but definitely application equivalent).
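For a rough sense of scale, the arithmetic behind those figures can be sketched as below. The function name is hypothetical, and it assumes the $9000 covers everything the FPGA side needed, which the numbers above don't state.

```python
def system_cost_ratio(fpga_cost: float, gpu_cost: float, host_cost: float) -> float:
    """Cost of the GPU system (card plus host) relative to the FPGA,
    for systems treated as delivering equal FFT throughput."""
    if fpga_cost <= 0:
        raise ValueError("fpga_cost must be positive")
    return (gpu_cost + host_cost) / fpga_cost


if __name__ == "__main__":
    # Figures quoted above: $9000 FPGA vs. $16000 A100 + $10000 host.
    ratio = system_cost_ratio(9_000, 16_000, 10_000)
    print(f"GPU system costs {ratio:.2f}x the FPGA at equal FFT throughput")
```

Under those assumptions the GPU system comes out at a bit under three times the hardware cost for the same application-level FFT performance.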
I think you are making the mistake of thinking that Xilinx software can fix the programmability of their hardware. It cannot. If you have to solve a place and route problem or do timing closure in your software, you have made a design mistake in your hardware. You cannot design hardware such that a single FFT kernel takes 2 hours to compile and then fails, when nvcc takes 30 seconds and will always succeed. You have taken your software into the domain of RTL design. This is a result of the hardware design. Xilinx could have made their Versal hardware a programmable parallel processor array that is cache coherent, where everyone has access to global memory. It fundamentally isn't that though. It's a bizarre data flow graph system that requires DMA engines, place and route, and a static fabric configuration. That's a fault of your hardware design!
It depends on what you want to do. FPGAs excel in periodic "always on" workloads that need deterministic timing and low latency. If you don't have that and just care about total throughput and don't care about energy efficiency, then Nvidia will sell you more tflops per chip.
The energy efficiency of FPGAs can't be overstated. Reducing the clock and voltage to levels comparable to an FPGA will kill your GPU's tflops, and the control overhead and the energy spent on data movement are unavoidable in a GPU.
I didn’t say they were benevolent or forgiving deities. XD
The statement “vendor tools are god” is a statement that they are all powerful, and not something you can work against. It’s not pleasant, but it’s a necessary evil.
You’re not going to be able to access any of the most cutting edge features of Xilinx or Intel chips without the vendor tools. Simple as that. They have no interest in open sourcing the tools. Fighting the vendors to change this is trying to fight a force you can’t fight against and win.