Design Issues in LLVM IR

rwallace · on June 9, 2021

Excellent article! One thing I'm curious about: LLVM uses phi, as was considered best practice at the time. Of late, some people have started using block parameters instead; they are fundamentally equivalent, but make some operations more convenient at the cost of making other operations less convenient. Do the LLVM developers still consider phi the best choice, or would block parameters look better if you had it to do over?
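For readers who have not seen both forms side by side, here is a minimal sketch of the equivalence mentioned above. It assumes a toy CFG stored in Python dicts. The encoding and the name `phis_to_block_params` are made up for illustration and are not LLVM's or MLIR's API.

```python
# Toy CFG in phi form. Hypothetical encoding, not LLVM's data structures:
#   label -> {"phis": [(dest, {pred_label: incoming_value})], "succs": [labels]}
phi_form = {
    "entry": {"phis": [], "succs": ["loop"]},
    "loop": {
        "phis": [("i", {"entry": "0", "loop": "i_next"})],
        "succs": ["loop", "exit"],
    },
    "exit": {"phis": [], "succs": []},
}


def phis_to_block_params(blocks):
    """Rewrite phi form into block-parameter form.

    Each phi destination becomes a parameter of its block, and each
    predecessor passes its incoming value as an argument on the jump.
    """
    out = {
        label: {"params": [dest for dest, _ in blk["phis"]], "jumps": {}}
        for label, blk in blocks.items()
    }
    for label, blk in blocks.items():
        for succ in blk["succs"]:
            # Arguments are listed in the same order as the successor's params.
            out[label]["jumps"][succ] = [
                incoming[label] for _, incoming in blocks[succ]["phis"]
            ]
    return out


param_form = phis_to_block_params(phi_form)
print(param_form["loop"])
```

In this encoding the phi keeps the incoming values at the join point, while block parameters move them onto the jumps. Which form is more convenient plausibly depends on which end of the edge a given pass needs to edit.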

moonchild · on June 10, 2021

Not an LLVM developer, but I've given my thoughts at [1]. The QBE developer agrees fwiw[2], and there is some other interesting discussion in that thread, in particular comparing block parameters to CPS. (Which I think is appropriate for a functional-language compiler, but makes less sense for an imperative-language compiler which must manage mutability and which tends to be less focused on interprocedural optimizations.)

1. https://groups.google.com/g/comp.compilers/c/73kOI_S7r5A. (Sorry for g groups link; comp.compilers website mangles unicode—my fault—and narkive doesn't seem to have the post.)

2. https://www.reddit.com/r/Compilers/comments/5tq6mo/a_common_...

Mathnerd314 · on June 9, 2021

Well, MLIR uses block parameters: https://mlir.llvm.org/docs/Rationale/Rationale/#block-argume...

That could just be a Chris Lattner thing, but he's one of the main LLVM developers, so...

muth02446 · on June 9, 2021

shameless plug: I have been playing with a new IR here: https://github.com/robertmuth/Cwerg

It does simplify the pointer-related operations but reintroduces the signed/unsigned notion for integers. I felt this was important because at some point you may want additional notions, e.g. saturating int arithmetic, and then you have a choice: either add more operations (add_sat, sub_sat, etc.) or reuse the existing add, sub, etc. and interpret them differently based on the operand type.
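The two options can be sketched roughly as follows, using a toy 8-bit evaluator. All names here (`eval_by_opcode`, `eval_by_type`, the `sat_u8` type tag) are hypothetical and are not taken from Cwerg or LLVM.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0xFF for 8-bit values


def add_wrap(a, b):
    """Plain two's-complement add: the result wraps modulo 2**BITS."""
    return (a + b) & MASK


def add_sat_unsigned(a, b):
    """Unsigned saturating add: the result clamps at the maximum value."""
    return min(a + b, MASK)


# Option 1: more operations, untyped operands.
OPCODES = {"add": add_wrap, "add_sat": add_sat_unsigned}


def eval_by_opcode(op, a, b):
    return OPCODES[op](a, b)


# Option 2: a single "add" whose meaning depends on the operand type.
ADD_BY_TYPE = {"u8": add_wrap, "sat_u8": add_sat_unsigned}


def eval_by_type(ty, a, b):
    return ADD_BY_TYPE[ty](a, b)


print(eval_by_opcode("add_sat", 200, 100), eval_by_type("sat_u8", 200, 100))
```

Both tables compute the same results. The difference is where the distinction is recorded: on the instruction in the first case, on every value in the second.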

Mathnerd314 · on June 9, 2021

The point of that section of the article IMO was to say that you always want to err on the side of more operations rather than more types. When canonicalizing, it is easy to reduce operations to their simplest form, but changing the type of an operand is a global operation and hence is a lot more complex. And x86 has mul and imul instructions operating on general-purpose registers, rather than typed registers, so using more operations is closer to the assembly.
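A small sketch of what "signedness lives in the operation" looks like, assuming 8-bit values held as raw bit patterns. The helper names echo LLVM's udiv and sdiv instructions, but this is an illustrative toy, not LLVM's implementation, and it does not handle division by zero.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(x):
    """Reinterpret an 8-bit pattern as a two's-complement signed value."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def add(a, b):
    """One add serves both interpretations in two's complement."""
    return (a + b) & MASK


def udiv(a, b):
    """Unsigned division on raw bit patterns."""
    return (a // b) & MASK


def sdiv(a, b):
    """Signed division, truncating toward zero, result stored as bits."""
    sa, sb = to_signed(a), to_signed(b)
    q = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        q = -q
    return q & MASK


x = 0xF6  # 246 if read as unsigned, -10 if read as signed
print(hex(add(x, 2)), hex(udiv(x, 2)), hex(sdiv(x, 2)))
```

The value `x` carries no sign of its own. Rewriting a `udiv` into an `sdiv` touches one instruction, whereas changing the type of `x` would touch every use of it.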

ekzhang · on June 9, 2021

I like the typed vs opaque GEP examples! Really illuminating about compiler design.