
To me it's not just exhaustiveness, but that sum types (enums) are just like product types (structs): they can have member methods, implement traits, etc. Coming from C++, when I realized Rust let me do that, it blew me away.
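
For anyone who hasn't seen it, a minimal sketch of what that looks like. The `Light` enum and its `next` method are made up for illustration; only `std::fmt::Display` is a real standard-library trait.

```rust
use std::fmt;

// Hypothetical enum, invented for this example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Light {
    Red,
    Green,
    Yellow,
}

impl Light {
    // An ordinary method on the enum, taking `self` by value.
    fn next(self) -> Light {
        match self {
            Light::Red => Light::Green,
            Light::Green => Light::Yellow,
            Light::Yellow => Light::Red,
        }
    }
}

// Implementing a trait for an enum works the same way as for a struct.
impl fmt::Display for Light {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Light::Red => "red",
            Light::Green => "green",
            Light::Yellow => "yellow",
        };
        write!(f, "{}", name)
    }
}
```

With that in place, `Light::Red.next()` is a plain method call and `Light::Yellow.to_string()` comes for free from the `Display` impl.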


Actually, all three of Rust's user-defined types (the sum type enum, the product type struct, and union†) are fully fledged types which can implement traits and have functions of their own (including functions taking a self parameter, thus methods).

C++ unions can actually have methods, although this isn't used very much. However, C++ enums can't have methods; even C++11 scoped enums ("enum classes") can't have methods. I have no idea why that restriction seemed like a good idea.

† Unions are special because they're crazy dangerous, which is why they're not usually covered in material for learning Rust: you can't fetch from them safely. You can store things in unions safely because storing a value tells the compiler which representation is valid (the one you're storing to), but fetching is unsafe because you might fetch an inactive representation, and that's UB. However, Rust does have a particularly obvious union right in the standard library, MaybeUninit, and sure enough MaybeUninit implements Copy (whenever its contents are Copy) and has a bunch of methods.
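
A rough sketch of the store-is-safe, fetch-is-unsafe split. The `IntOrFloat` union and the helper functions are hypothetical names for this example; `MaybeUninit::uninit`, `write` and `assume_init` are the real standard-library calls.

```rust
use std::mem::MaybeUninit;

// Hypothetical union for illustration; both fields are 4 bytes wide.
// Like any other type, a union can derive traits and have methods.
#[derive(Clone, Copy)]
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

impl IntOrFloat {
    fn from_float(f: f32) -> Self {
        IntOrFloat { f }
    }

    fn bits(&self) -> u32 {
        // Reading a union field always needs `unsafe`.
        // SAFETY: both fields are 4 bytes and every bit pattern is a valid u32.
        unsafe { self.i }
    }
}

fn write_then_read() -> u32 {
    let mut u = IntOrFloat { i: 0 };
    u.f = 1.0; // storing into a Copy field compiles without `unsafe`
    u.bits()
}

fn init_later() -> u64 {
    // MaybeUninit is itself declared as a union in the standard library.
    let mut slot = MaybeUninit::<u64>::uninit();
    slot.write(42);
    // SAFETY: the slot was initialized by the `write` call just above.
    unsafe { slot.assume_init() }
}
```

The only `unsafe` blocks are at the two reads; the stores need none.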


> C++ enums can't have methods; even C++11 scoped enums ("enum classes") can't have methods. I have no idea why that restriction seemed like a good idea

It is possible that Oracle holding the patent[1] to methods on enums is the blocker, rather than any technical restriction.

[1]: https://patents.google.com/patent/US7263687


That sounds bizarre. So this means that patents are potentially holding back programming language innovation? I hope I don’t have to consult with a lawyer every time I invent a new variation on the for loop. (I anticipate that someone will then tell me that there already is a patent for that…)


Why would it be bizarre? It's not exactly a fringe belief that patents in all software are holding back innovation.

I don't think programmers often consult 20 year old "inventions", so it seems pretty obvious on its face that the supposed benefit of patents, that something is _only_ locked up for 20 years, is quite pointless in software.

Anyway, for loops are safe, unless the for loop is over the elements of a linked list. Then you need to wait until next year: https://patents.google.com/patent/US7028023B2


Wouldn't that patent's claims cover doubly-linked lists? The "auxiliary order" could very well be reverse order. Seems like obvious prior art.


I am absolutely not a lawyer, but wouldn’t it fall into the “trivial” category, so even if patented, it couldn’t be enforced?


Problem is, unless it has been tried in court, you can't be certain. And if you're building something, you might not want to spend time in court having to fight it in the first place. So even if it's 50/50 enforceable/not enforceable, do you really want to spend the time testing if it is?

Patents really have a chilling effect, even if a particular one might not be enforceable.


ISO is also very averse to patents. They will probably not standardize anything patent-encumbered, and would probably require the patent to be invalidated before accepting the feature into the standard.

Having said that, this is the first time I've ever heard of methods on enums being patented. What a ridiculous patent. It's a good thing then that C++ doesn't have methods, it has "member functions" :).

Also C++ allows user defined operators on enums, which feels somewhat adjacent.


WG21 (the "C++ Committee") is under JTC1, the Joint Technical Committee ("joint" between ISO and IEC). You might know of a few other famous products of JTC1's sub- and sub-sub-committees, including JPEG (that's the Joint Photographic Experts Group getting a shout-out in the name of the standard) and MPEG. Both of those standards required patented "inventions" to implement in full. The patents were held by contributors...

In the case of MPEG the result is MPEG LA, a US company which you need to pay to implement certain important standards. In the case of JPEG the result was a little different: since only the improved arithmetic coding in JPEG was patented, people just don't implement the full standard. They cut out the patented part, so all the world's JPEGs (well, mostly JFIF files, which are slightly different but we call them "JPEGs" anyway) are a little bigger than they need to be for no reason except patents.

So no, I don't buy that "ISO is also very averse to patents" in a sense that would restrict this, unless you can show that's a new stance.


Yeah, good point. But I think this still applies to WG21.


That expires in 2024, so perhaps the next version of C++ will have it.


I just wish that Rust’s enum Variants were types… maybe someday! :)


It's not exactly what you want but you get most of the benefit by just wrapping types in enum variants.


It's a request for the following to be valid:

    enum X {
        A,
        B,
    }
    fn foo() -> X::A {
        X::A
    }

which helps a lot when composing state machines. In the meantime, you can indeed do what you propose:

    enum X {
        A(Foo),
        B(Bar),
    }
    struct Foo;
    struct Bar;
    fn foo() -> Foo {
        Foo
    }
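
To sketch how the wrapper workaround plays out for a small state machine (every name here is hypothetical, picked just for the example): each state is its own struct, so a transition can name exactly which state it returns, and the enum is only needed when you want to store "some state".

```rust
// Hypothetical states of a connection; each one is a real type.
struct Idle;
struct Connected {
    peer: String,
}

// The enum just wraps the per-state types.
enum State {
    Idle(Idle),
    Connected(Connected),
}

impl Idle {
    // The signature promises a `Connected`, not merely "some State".
    fn connect(self, peer: &str) -> Connected {
        Connected { peer: peer.to_string() }
    }
}

impl Connected {
    fn disconnect(self) -> Idle {
        Idle
    }
}

// Conversions so a concrete state can be stored as a `State`.
impl From<Idle> for State {
    fn from(s: Idle) -> Self {
        State::Idle(s)
    }
}

impl From<Connected> for State {
    fn from(s: Connected) -> Self {
        State::Connected(s)
    }
}

impl State {
    fn name(&self) -> &'static str {
        match self {
            State::Idle(_) => "idle",
            State::Connected(_) => "connected",
        }
    }
}
```

The cost is the redundancy: every state is declared twice (struct plus variant) and needs its own `From` impl, which is what variants-as-types would remove.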


Thanks yeah, I was just meaning for the ergonomics aspect :) Cheers


Ever since I heard of this idea, I keep finding places where I'd use it, whether that be for variant-specific behavior or to remove the redundant type definitions in my API.


A union in C++ is the same thing as a struct, except all of its fields live at the same offset. So you can define any method you want on it, including special stuff like constructors and destructors. No base classes are allowed, though.


A method on a union is much less useful when you can't match on the tag, though. I suppose you could store the tag inside the union. Is that common? I've always imagined C/C++ tagged unions would store the tag outside the union.


You would probably use std::variant in C++ if you want tagged unions.

So you could have a struct or class with one std::variant field and some methods which can match on the type of the variant. But it would be kind of clunky.


C++ allows accessing inactive members in the special case when the active member shares a "common initial sequence" with the particular inactive member.

So yeah, it's possible to store the tag in the common initial sequence of all the union members.

This is quite niche and rarely used. Most of the time it makes more sense to have the tag outside. Sometimes the tag in the common initial sequence allows the whole data structure to pack better than with a tag outside of the union.


At least in C, you cannot store a tag directly inside the union: since all fields of the union live at the same offset (0), the tag would be smashed by the actual value it's trying to describe.

In C, you have to wrap the union in a struct in order to add the tag, and that pattern is fantastically common. Let's have a little geometry-inspired example:

    typedef enum { SHAPETYPE_RECTANGLE, ... } ShapeType;
    typedef struct { ... } Rectangle;
    typedef struct { ... } Circle;
    typedef struct { ... } Triangle;
    typedef struct { ... } Polygon;

    typedef struct { // Outer struct, not a union at this level.
      ShapeType type;
      union {
        Rectangle rectangle;
        Circle    circle;
        Triangle  triangle;
        Polygon   polygon;
      };  // This can be nameless in new(ish) C, which is nice.
    } Shape;

Then you'd create a value like this, maybe:

    Shape rect = { .type = SHAPETYPE_RECTANGLE, .rectangle = { 0, 0, 20, 10 } };

Of course, wrapping the initialization in a function would make it nicer.

The above union-in-a-struct wrapping is my mental model for how enums work in Rust, but I still find it jarring. :)
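
For comparison, roughly the same thing as a Rust enum. This is only a sketch: the C structs above elide their fields, so the `width`, `height` and `radius` fields here are assumptions, and `area` is a made-up helper.

```rust
// Field layouts are assumptions; the C version leaves them out.
struct Rectangle {
    width: f64,
    height: f64,
}

struct Circle {
    radius: f64,
}

// The enum carries the tag and the payload together, so there is no
// separate `type` field that could get out of sync with the union.
enum Shape {
    Rectangle(Rectangle),
    Circle(Circle),
}

fn area(shape: &Shape) -> f64 {
    // The payload is only reachable by matching on the variant, and
    // leaving a variant out of this match is a compile error.
    match shape {
        Shape::Rectangle(r) => r.width * r.height,
        Shape::Circle(c) => std::f64::consts::PI * c.radius * c.radius,
    }
}
```

Creating a value is then `Shape::Rectangle(Rectangle { width: 20.0, height: 10.0 })`, with the tag implied by the variant name.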


I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK. There may be some restrictions about exactly what is in that prefix, but at least obvious things like an enum or an integral type will work.

So for your example, you put ShapeType type as the first member of each of Rectangle, Circle, Triangle etc. and then you can union all of them, and the language promises that shape.circle.type == SHAPETYPE_RECTANGLE is a reasonable thing to ask, so you can use that to make a discriminated union.


> I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK

That doesn't sound right to me. Do you have a source? Is that in the standard?


I of course do not own a copy of the expensive ISO document; however, in the draft:

11.5.1 [class.union.general]

[Note 1: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence ([class.mem]), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem]. — end note]


I don't know about the standard, but if cppreference.com is good enough: At https://en.cppreference.com/w/cpp/language/union it says "If two union members are standard-layout types, it's well-defined to examine their common subsequence on any compiler."


Yes, totally. Good point, I forgot about that technique. I prefer this, which is less invasive.


i haven't done c++ in a million years, but huh, you can! can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.

https://gist.github.com/erinok/c823af95db408653c7e42ab189307...


> can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.

That was removed in C++11. The rule now is:

> Absent default member initializers ([class.mem]), if any non-static data member of a union has a non-trivial default constructor ([class.default.ctor]), copy constructor, move constructor ([class.copy.ctor]), copy assignment operator, move assignment operator ([class.copy.assign]), or destructor ([class.dtor]), the corresponding member function of the union must be user-provided or it will be implicitly deleted ([dcl.fct.def.delete]) for the union.

i.e., you need to provide an explicit version of the special function for the union if any member has a nontrivial implementation.


ah, good to know, thanks! unfortunately too late to edit my comment


You can have non-trivial types in a union, but you need to explicitly define the constructors in the union or the containing type (i.e. the default union copy constructor and assignment operator are disabled).


I may be wrong, but I thought it’s that you can’t have any fields with non trivial destructors, because the union wouldn’t know which destructor to call. So POD types / raw pointers / arrays of the above / structs of the above are allowed, and that’s pretty much it.


Yes, it’s very common. The difference is that Rust enforces via the type system that you match on it.


In fairness to C++, you could derive a class from std::variant and add methods to it that way.

It's less awkward with Rust's enums for sure though. And pattern matching as in Rust is far more expressive (and legible) than what std::variant gives you.
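
As a small illustration of the expressiveness point (the `Event` enum and `describe` function are hypothetical), a single match can mix literal patterns, field bindings, `..` wildcards and guards:

```rust
// Hypothetical event type, invented for this example.
enum Event {
    Key { code: u32, shift: bool },
    Click { x: i32, y: i32 },
    Resize(u32, u32),
    Quit,
}

fn describe(e: &Event) -> String {
    match e {
        // Literal pattern on one field, ignoring the rest.
        Event::Key { code: 27, .. } => "escape".to_string(),
        // Bind one field while requiring a specific value in another.
        Event::Key { code, shift: true } => format!("shift+{code}"),
        Event::Key { code, .. } => format!("key {code}"),
        // A guard adds an arbitrary condition to the arm.
        Event::Click { x, y } if *x < 0 || *y < 0 => "click outside".to_string(),
        Event::Click { x, y } => format!("click at ({x}, {y})"),
        Event::Resize(w, h) => format!("resize to {w}x{h}"),
        Event::Quit => "quit".to_string(),
    }
}
```

Arms are tried top to bottom, so the more specific `Key` patterns have to come before the catch-all one.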



