To me it's not just exhaustiveness but that sum types (enums) are just like product types (structs): they can have member methods, implement traits, etc. Coming from C++, when I realized Rust let me do that, it blew me away.
Actually all three of Rust's user-defined types (the sum type enum, the product type struct, and union†) are fully fledged types which can implement traits and have functions of their own (including functions taking a self parameter, thus methods).
C++ unions can actually have methods, although this isn't used very much. However C++ enums can't have methods, even C++ 11 scoped enums ("enum classes") can't have methods, I have no idea why that restriction seemed like a good idea.
† Unions are special because they're crazy dangerous, which is why they're not usually covered in material for learning Rust - you can't fetch from them safely. You can store things in unions safely because the process of storing a value in a union tells the compiler which is the valid representation - the one you're storing to, but fetching is unsafe because you might fetch an inactive representation and that's UB. However Rust does have a particularly obvious union right in the standard library - MaybeUninit - and sure enough MaybeUninit implements Copy and has a bunch of methods.
> C++ enums can't have methods, even C++ 11 scoped enums ("enum classes") can't have methods, I have no idea why that restriction seemed like a good idea
It is possible that Oracle holding the patent[1] to methods on enums is the blocker, rather than any technical restriction.
That sounds bizarre. So this means that patents are potentially holding back programming language innovation? I hope I don’t have to consult with a lawyer every time I invent a new variation on the for loop. (I anticipate that someone will then tell me that there already is a patent for that…)
Why would it be bizarre? It's not exactly a fringe belief that patents in all software are holding back innovation.
I don't think programmers often consult 20 year old "inventions", so it seems pretty obvious on its face that the supposed benefit of patents, that something is _only_ locked up for 20 years, is quite pointless in software.
Problem is, unless it has been tried in court, you can't be certain. And if you're building something, you might not want to spend time in court having to fight it in the first place. So even if it's 50/50 enforceable/not enforceable, do you really want to spend the time testing if it is?
Patents really have a chilling effect, even if a particular one might not be enforceable.
The ISO is also very averse to patents. They will probably not standardize anything patent-encumbered. They would probably require invalidating the patent first before accepting it in the standard.
Having said that, this is the first time I've ever heard of methods on enums being patented; what a ridiculous patent. It's a good thing then that C++ doesn't have methods, it has "member functions" :).
Also C++ allows user defined operators on enums, which feels somewhat adjacent.
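To make the "operators on enums" point concrete, here is a minimal sketch. The `Permission` enum and the `can_write` helper are made up for illustration, and it assumes C++17 for the message-less `static_assert`. Operators on a scoped enum have to be free functions, and a free function is also the usual stand-in for what would be a method in Rust.

```cpp
#include <cstdint>

// Hypothetical flag set, for illustration only.
enum class Permission : std::uint8_t {
    None  = 0,
    Read  = 1 << 0,
    Write = 1 << 1,
};

// Operator overloads are allowed for enumeration types, but only as free functions.
constexpr Permission operator|(Permission a, Permission b) {
    return static_cast<Permission>(static_cast<std::uint8_t>(a) |
                                   static_cast<std::uint8_t>(b));
}

constexpr Permission operator&(Permission a, Permission b) {
    return static_cast<Permission>(static_cast<std::uint8_t>(a) &
                                   static_cast<std::uint8_t>(b));
}

// A free function plays the role that a method would play on a Rust enum.
constexpr bool can_write(Permission p) {
    return (p & Permission::Write) != Permission::None;
}

// Checked at compile time.
static_assert(can_write(Permission::Read | Permission::Write));
static_assert(!can_write(Permission::Read));
```

So you get `a | b` on the enum, but you still write `can_write(p)` rather than `p.can_write()`.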
WG21 (the "C++ Committee") sits under JTC1, the Joint Technical Committee ("joint" between ISO and IEC). You might know of a few other famous products of the Joint Technical Committee's sub and sub-sub committees, including JPEG (that's the Joint Photographic Experts Group getting a shout out in the name of the standard) and MPEG. Those standards both required patented "inventions" to implement in full. The patents were held by contributors...
In the case of MPEG the result is MPEG LA, a US company which you need to pay to implement certain important standards. In the case of JPEG the result was a little different, since only the improved Arithmetic Coding of JPEG was patented, people just don't implement the actual standard, they cut out the patented part, so all the world's JPEGs (well, mostly JFIF files, which are slightly different but we call them "JPEGs" anyway) are a little bigger than they need to be for no reason except patents.
So no, I don't buy that "ISO is also very averse to patents" in a sense that would restrict this unless you can show that's a new stance.
Since I've heard of this idea, I keep finding places where I'd use it, whether that be for variant-specific behavior or to remove the redundant type definitions in my API.
A union in C++ is the same thing as a struct, except all of its fields live at the same offset. So you can define any method you want on it, including special stuff like constructors and destructors. No base classes are allowed, though.
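A small sketch of what that looks like, with a made-up `Value` union (the name and members are assumptions for illustration). The constructors pick the active member, the member functions are ordinary methods, and the `static_assert`s check the "same offset" claim for this particular union.

```cpp
#include <cstddef>

// Hypothetical union, for illustration only.
union Value {
    int i;
    double d;

    // Constructors decide which member becomes active.
    Value(int v) : i(v) {}
    Value(double v) : d(v) {}

    // Ordinary member functions are allowed. Nothing here records which
    // member is active, so the caller has to know which accessor is valid.
    int as_int() const { return i; }
    double as_double() const { return d; }
};

// Every member of the union starts at offset 0.
static_assert(offsetof(Value, i) == 0 && offsetof(Value, d) == 0,
              "all union members share the same offset");
// The union is at least as large as its largest member.
static_assert(sizeof(Value) >= sizeof(double),
              "the union must be able to hold its largest member");
```

Calling `as_double()` on a `Value` built from an `int` would read an inactive member, which is exactly the hazard the rest of this thread is about.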
A method on a union is much less useful when you can't match on the tag though. I suppose you could store the tag inside the union. Is that common? I've always imagined C/C++ tagged unions would store the tag outside the union.
You would probably use std::variant in C++ if you want tagged unions.
So you could have a struct or class with one std::variant field and some methods which can match on the type of the variant. But it would be kind of clunky.
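Roughly like this, as a sketch rather than a recommendation. The `Circle`, `Rectangle`, `AreaVisitor` and `Shape` names are invented for the example, and it assumes C++17 for `std::variant` and `std::visit`.

```cpp
#include <variant>

// Hypothetical shapes, for illustration only.
struct Circle    { double radius; };
struct Rectangle { double width, height; };

// One overload per alternative. If an alternative has no matching overload,
// the std::visit call fails to compile, which is a rough form of
// exhaustiveness checking.
struct AreaVisitor {
    double operator()(const Circle& c) const {
        return 3.14159265358979 * c.radius * c.radius;
    }
    double operator()(const Rectangle& r) const {
        return r.width * r.height;
    }
};

// A class with a single std::variant field and methods that dispatch on it.
class Shape {
public:
    Shape(Circle c) : data_(c) {}
    Shape(Rectangle r) : data_(r) {}

    double area() const { return std::visit(AreaVisitor{}, data_); }
    bool is_circle() const { return std::holds_alternative<Circle>(data_); }

private:
    std::variant<Circle, Rectangle> data_;
};
```

The clunky part is visible: the "match" lives in a separate visitor type, and every variant needs its own wrapper constructor.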
C++ allows accessing inactive members in the special case when the active member shares a "common initial sequence" with the particular inactive member.
So yeah, it's possible to store the tag in the common initial sequence of all the union members.
This is quite niche and rarely used. Most of the time it makes more sense to have the tag outside. Sometimes the tag in the common initial sequence allows the whole data structure to pack better than with a tag outside of the union.
At least in C, you cannot store a tag inside the union: since all fields of the union live at the same offset (0), the tag will be smashed by the actual value it's trying to talk about.
In C, you have to wrap the union in a struct in order to add the tag, and that pattern is fantastically common. Let's have a little geometry-inspired example:
typedef enum { SHAPETYPE_RECTANGLE, ... } ShapeType;
typedef struct { ... } Rectangle;
typedef struct { ... } Circle;
typedef struct { ... } Triangle;
typedef struct { ... } Polygon;
typedef struct { // Outer struct, not a union at this level.
    ShapeType type;
    union {
        Rectangle rectangle;
        Circle circle;
        Triangle triangle;
        Polygon polygon;
    }; // This can be nameless in new(ish) C, which is nice.
} Shape;
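For completeness, a cut-down sketch of how that pattern tends to be used, written so that it compiles as C++ rather than C. The concrete fields (`width`, `height`, `radius`) and the `make_rectangle`, `make_circle` and `shape_area` helpers are assumptions for illustration, since the struct bodies above are elided.

```cpp
// Hypothetical, reduced version of the shape example with only two variants.
enum ShapeType { SHAPETYPE_RECTANGLE, SHAPETYPE_CIRCLE };

struct Rectangle { double width, height; };
struct Circle    { double radius; };

struct Shape {
    ShapeType type;   // the tag lives outside the union
    union {           // anonymous union: members are reached as shape.circle etc.
        Rectangle rectangle;
        Circle circle;
    };
};

inline Shape make_rectangle(double width, double height) {
    Shape s;
    s.type = SHAPETYPE_RECTANGLE;
    s.rectangle = Rectangle{width, height};  // assignment makes rectangle active
    return s;
}

inline Shape make_circle(double radius) {
    Shape s;
    s.type = SHAPETYPE_CIRCLE;
    s.circle = Circle{radius};               // assignment makes circle active
    return s;
}

// Every reader has to check the tag by hand before touching the union.
inline double shape_area(const Shape& s) {
    switch (s.type) {
        case SHAPETYPE_RECTANGLE:
            return s.rectangle.width * s.rectangle.height;
        case SHAPETYPE_CIRCLE:
            return 3.14159265358979 * s.circle.radius * s.circle.radius;
    }
    return 0.0;  // unreachable for valid tags
}
```

Nothing stops the tag and the union contents from getting out of sync, which is the part a Rust enum takes care of for you.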
I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK. There may be some restrictions about exactly what is in that prefix, but at least obvious things like an enum or an integral type will work.
So for your example you put ShapeType type as the first member of each of Rectangle, Circle, Triangle etc. and then you can union all of them, and the language promises that shape.circle.type == SHAPETYPE_RECTANGLE is a reasonable thing to ask, so you can use that to make a discriminated union.
> I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK
That doesn't sound right to me. Do you have a source? Is that in the standard?
I of course do not own a copy of the expensive ISO document, however, in the draft:
11.5.1 [class.union.general]
[Note 1: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence ([class.mem]), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem].
— end note]
I don't know about the standard, but if cppreference.com is good enough: At https://en.cppreference.com/w/cpp/language/union it says "If two union members are standard-layout types, it's well-defined to examine their common subsequence on any compiler."
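Going by the note quoted above, the tag-inside-the-union layout can be sketched like this. The type and function names are made up for the example, and it assumes C++17 for `std::is_standard_layout_v`.

```cpp
#include <type_traits>

// Hypothetical shapes, for illustration only.
enum class ShapeType { Rectangle, Circle };

// Both structs begin with the same member, so they share a common initial sequence.
struct Rectangle { ShapeType type; double width, height; };
struct Circle    { ShapeType type; double radius; };

union Shape {
    Rectangle rectangle;
    Circle circle;
};

// The guarantee only applies to standard-layout types.
static_assert(std::is_standard_layout_v<Rectangle> &&
              std::is_standard_layout_v<Circle> &&
              std::is_standard_layout_v<Shape>);

inline Shape make_rectangle(double width, double height) {
    Shape s;
    s.rectangle = Rectangle{ShapeType::Rectangle, width, height};
    return s;
}

inline Shape make_circle(double radius) {
    Shape s;
    s.circle = Circle{ShapeType::Circle, radius};
    return s;
}

inline double area(const Shape& s) {
    // Reading `type` through `rectangle` is permitted even when `circle` is
    // the active member, because `type` is in the common initial sequence.
    switch (s.rectangle.type) {
        case ShapeType::Rectangle:
            return s.rectangle.width * s.rectangle.height;
        case ShapeType::Circle:
            return 3.14159265358979 * s.circle.radius * s.circle.radius;
    }
    return 0.0;  // unreachable for valid tags
}
```

Only the shared prefix may be inspected this way; reading `s.rectangle.width` while `circle` is active would still be undefined behaviour.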
i haven't done c++ in a million years, but huh, you can! can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.
> can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.
That was removed in C++11. The rule now is:
> Absent default member initializers ([class.mem]), if any non-static data member of a union has a non-trivial default constructor ([class.default.ctor]), copy constructor, move constructor ([class.copy.ctor]), copy assignment operator, move assignment operator ([class.copy.assign]), or destructor ([class.dtor]), the corresponding member function of the union must be user-provided or it will be implicitly deleted ([dcl.fct.def.delete]) for the union.
i.e., you need to provide an explicit version of the special function for the union if any member has a nontrivial implementation.
You can have non-trivial types in a union, but you need to explicitly define the constructor in the union or the containing type (i.e. the default union copy constructor and assignment operator are disabled).
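A minimal sketch of what that means in practice, assuming C++17 (for `std::destroy_at`). `IntOrString` and its members are invented for the example. Copying is simply deleted here to keep it short; a real type would have to write copy and move by hand as well.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical wrapper around a union with a non-trivial member.
class IntOrString {
public:
    explicit IntOrString(int n) : is_string_(false) { storage_.number = n; }
    explicit IntOrString(std::string s) : is_string_(true) {
        // Placement new starts the lifetime of the std::string member.
        new (&storage_.text) std::string(std::move(s));
    }

    // Kept simple for the sketch: no copying.
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    ~IntOrString() {
        // The union cannot know which member is active, so the owner checks.
        if (is_string_) std::destroy_at(&storage_.text);
    }

    bool is_string() const { return is_string_; }
    int number() const { assert(!is_string_); return storage_.number; }
    const std::string& text() const { assert(is_string_); return storage_.text; }

private:
    union Storage {
        int number;
        std::string text;
        // std::string has a non-trivial constructor and destructor, so these
        // would be implicitly deleted for the union unless user-provided.
        Storage() {}   // starts with no active member
        ~Storage() {}  // does nothing; IntOrString destroys the active member
    };

    Storage storage_;
    bool is_string_;
};
```

Removing `Storage() {}` or `~Storage() {}` makes the class fail to compile, which is the "implicitly deleted" rule from the quote showing up.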
I may be wrong, but I thought it’s that you can’t have any fields with non trivial destructors, because the union wouldn’t know which destructor to call. So POD types / raw pointers / arrays of the above / structs of the above are allowed, and that’s pretty much it.
In fairness to C++, you could derive a class from std::variant and add methods to it that way.
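Something along these lines, as a rough sketch assuming C++17. `Token`, `TokenBase` and `TokenVisitor` are names made up for the example. The visit goes through an explicit reference to the base, since whether `std::visit` accepts the derived type directly depends on how recent the standard library is.

```cpp
#include <string>
#include <variant>

// Hypothetical token type, for illustration only.
using TokenBase = std::variant<int, std::string>;

// One overload per alternative of TokenBase.
struct TokenVisitor {
    std::string operator()(int n) const { return "int:" + std::to_string(n); }
    std::string operator()(const std::string& s) const { return "str:" + s; }
};

struct Token : TokenBase {
    using TokenBase::TokenBase;  // inherit std::variant's constructors

    // A method added on top of the variant.
    std::string describe() const {
        const TokenBase& base = *this;  // visit the base subobject explicitly
        return std::visit(TokenVisitor{}, base);
    }
};
```

With that, `Token(42).describe()` gives `"int:42"`, so callers get method syntax even though the dispatch is still a visitor underneath.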
It's less awkward with Rust's enums for sure though. And pattern matching as in Rust is far more expressive (and legible) than what std::variant gives you.