A union in C++ is the same thing as a struct, except all of its fields live at t...

nicoburns · on March 14, 2023

A method on a union is much less useful when you can't match on the tag though. I suppose you could store the tag inside the union. Is that common? I've always imagined C/C++ tagged unions would store the tag outside the union.

ivmaykov · on March 14, 2023

You would probably use std::variant in C++ if you want tagged unions.

So you could have a struct or class with one std::variant field and some methods which can match on the type of the variant. But it would be kind of clunky.
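A minimal sketch of that wrapper, assuming C++17. The names `Circle`, `Rectangle`, `Shape` and `area` are made up for illustration and are not from the thread:

```cpp
#include <type_traits>
#include <variant>

struct Circle    { double radius; };
struct Rectangle { double width, height; };

// A struct with a single std::variant field plus a method that
// dispatches on whichever alternative is currently held.
struct Shape {
    std::variant<Circle, Rectangle> data;

    double area() const {
        return std::visit([](const auto& s) -> double {
            using T = std::decay_t<decltype(s)>;
            if constexpr (std::is_same_v<T, Circle>) {
                return 3.14159265358979 * s.radius * s.radius;
            } else {
                return s.width * s.height;  // Rectangle
            }
        }, data);
    }
};
```

The `if constexpr` chain inside `std::visit` is roughly the "clunky" part: it does the job of a `match`, but with noticeably more ceremony.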

planede · on March 14, 2023

C++ allows accessing inactive members in the special case when the active member shares a "common initial sequence" with the particular inactive member.

So yeah, it's possible to store the tag in the common initial sequence of all the union members.

This is quite niche and rarely used. Most of the time it makes more sense to have the tag outside. Sometimes the tag in the common initial sequence allows the whole data structure to pack better than with a tag outside of the union.
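A rough illustration of the packing point. The payload fields here are hypothetical and chosen only so that padding shows up; exact sizes depend on the ABI and are not guaranteed by the standard:

```cpp
#include <cstdint>

// Tag stored as the first member of every alternative
// (the common initial sequence).
struct SmallTagged { std::uint8_t tag; std::uint8_t a; std::uint16_t b; std::uint32_t c; };
struct OtherTagged { std::uint8_t tag; std::uint8_t x[7]; };

union TagInside {
    SmallTagged small;
    OtherTagged other;
};

// The same payloads with the tag hoisted into an enclosing struct.
struct SmallPayload { std::uint8_t a; std::uint16_t b; std::uint32_t c; };
struct OtherPayload { std::uint8_t x[7]; };

struct TagOutside {
    std::uint8_t tag;
    union {
        SmallPayload small;
        OtherPayload other;
    };
};
```

On a typical x86-64 ABI I would expect `sizeof(TagInside)` to be 8 and `sizeof(TagOutside)` to be 12: with the tag inside, it fills a byte that would otherwise be padding, while the external tag forces the 4-byte-aligned union to start at offset 4.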

unwind · on March 14, 2023

At least in C, you cannot store a tag inside the union: since all fields of the union live at the same offset (0), the tag will be smashed by the actual value it's trying to talk about.

In C, you have to wrap the union in a struct in order to add the tag, and that pattern is fantastically common. Let's have a little geometry-inspired example:

    typedef enum { SHAPETYPE_RECTANGLE, ... } ShapeType;
    typedef struct { ... } Rectangle;
    typedef struct { ... } Circle;
    typedef struct { ... } Triangle;
    typedef struct { ... } Polygon;

    typedef struct { // Outer struct, not a union at this level.
      ShapeType type;
      union {
        Rectangle rectangle;
        Circle    circle;
        Triangle  triangle;
        Polygon   polygon;
      };  // This can be nameless in new(ish) C, which is nice.
    } Shape;

Then you'd create a value like this, maybe:

    Shape rect = { .type = SHAPETYPE_RECTANGLE, .rectangle = { 0, 0, 20, 10 } };

Of course wrapping the initialization in a function would make it nicer.
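A sketch of such a wrapper, written as C++ and trimmed to two variants. The field names and the helper `make_rectangle` are assumptions, since the struct bodies above are elided:

```cpp
enum ShapeType { SHAPETYPE_RECTANGLE, SHAPETYPE_CIRCLE };

// Field names are assumptions; the original elides the struct bodies.
struct Rectangle { int x, y, width, height; };
struct Circle    { int cx, cy, radius; };

struct Shape {
    ShapeType type;
    union {  // anonymous union: members are accessed directly on Shape
        Rectangle rectangle;
        Circle    circle;
    };
};

// Hypothetical helper: sets the tag and the matching member together,
// so callers cannot get the two out of sync.
inline Shape make_rectangle(int x, int y, int width, int height) {
    Shape s;
    s.type = SHAPETYPE_RECTANGLE;
    s.rectangle = Rectangle{x, y, width, height};
    return s;
}
```

With that, the call site becomes `Shape rect = make_rectangle(0, 0, 20, 10);`.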

The above union-in-a-struct wrapping is my mental model for how enums work in Rust, but I still find it jarring. :)

tialaramex · on March 14, 2023

I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK. There may be some restrictions about exactly what is in that prefix, but at least obvious things like an enum or an integral type will work.

So for your example you put ShapeType type as the first member of each of Rectangle, Circle, Triangle etc. and then you can union all of them, and the language promises that shape.circle.type == SHAPETYPE_RECTANGLE is a reasonable thing to ask, so you can use that to make a discriminated union.
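Roughly like this, as a sketch in C++ with two variants. The payload fields and the helper `is_rectangle` are made up for illustration:

```cpp
#include <type_traits>

enum ShapeType { SHAPETYPE_RECTANGLE, SHAPETYPE_CIRCLE };

// Every alternative starts with the same member, so `type` is the
// common prefix shared by all of them. Payload fields are assumptions.
struct Rectangle { ShapeType type; int x, y, width, height; };
struct Circle    { ShapeType type; int cx, cy, radius; };

union Shape {
    Rectangle rectangle;
    Circle    circle;
};

// The guarantee only covers standard-layout types.
static_assert(std::is_standard_layout<Rectangle>::value, "Rectangle must be standard-layout");
static_assert(std::is_standard_layout<Circle>::value, "Circle must be standard-layout");
static_assert(std::is_standard_layout<Shape>::value, "Shape must be standard-layout");

// Hypothetical helper: reads the tag through `circle` no matter which
// member is currently active.
inline bool is_rectangle(const Shape& shape) {
    return shape.circle.type == SHAPETYPE_RECTANGLE;
}
```

Only the shared prefix may be read this way; touching `shape.circle.radius` while `rectangle` is the active member is still not allowed.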

staunton · on March 14, 2023

> I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK

That doesn't sound right to me. Do you have a source? Is that in the standard?

tialaramex · on March 14, 2023

I of course do not own a copy of the expensive ISO document, however, in the draft:

11.5.1 [class.union.general]

[Note 1: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence ([class.mem]), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem]. — end note]

unnah · on March 14, 2023

I don't know about the standard, but if cppreference.com is good enough: At https://en.cppreference.com/w/cpp/language/union it says "If two union members are standard-layout types, it's well-defined to examine their common subsequence on any compiler."

unwind · on March 14, 2023

Yes, totally. Good point, I forgot about that technique. I prefer this, which is less invasive.

dilap · on March 14, 2023

i haven't done c++ in a million years, but huh, you can! can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.

https://gist.github.com/erinok/c823af95db408653c7e42ab189307...

jcranmer · on March 14, 2023

> can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.

That was removed in C++11. The rule now is:

> Absent default member initializers ([class.mem]), if any non-static data member of a union has a non-trivial default constructor ([class.default.ctor]), copy constructor, move constructor ([class.copy.ctor]), copy assignment operator, move assignment operator ([class.copy.assign]), or destructor ([class.dtor]), the corresponding member function of the union must be user-provided or it will be implicitly deleted ([dcl.fct.def.delete]) for the union.

i.e., you need to provide an explicit version of the special function for the union if any member has a nontrivial implementation.
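A minimal sketch of what that looks like in practice, assuming C++17. The names `Payload` and `IntOrString` are hypothetical, and copying is simply disabled to keep the example short:

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

union Payload {
    int number;
    std::string text;  // non-trivial constructors and destructor

    Payload() : number(0) {}  // user-provided: starts with `number` active
    ~Payload() {}             // user-provided: cannot know which member to destroy
};

// The copy constructor was not user-provided, so it is implicitly deleted.
static_assert(!std::is_copy_constructible<Payload>::value,
              "Payload's copy constructor is implicitly deleted");

// The enclosing class tracks the active member and manages its lifetime.
class IntOrString {
public:
    explicit IntOrString(int n) : is_text_(false) { payload_.number = n; }
    explicit IntOrString(std::string s) : is_text_(true) {
        new (&payload_.text) std::string(std::move(s));  // begin lifetime of `text`
    }
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;
    ~IntOrString() {
        if (is_text_) std::destroy_at(&payload_.text);  // end lifetime of `text`
    }

    bool is_text() const { return is_text_; }
    int number() const { assert(!is_text_); return payload_.number; }
    const std::string& text() const { assert(is_text_); return payload_.text; }

private:
    bool is_text_;
    Payload payload_;
};
```

The union's own destructor is deliberately empty; the wrapper, which knows the tag, is what actually destroys the string.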

dilap · on March 14, 2023

ah, good to know, thanks! unfortunately too late to edit my comment

gpderetta · on March 14, 2023

You can have non-trivial types in a union, but you need to explicitly define the constructors in the union or the containing type (i.e. the default union copy constructor and assignment operator are disabled).

ivmaykov · on March 14, 2023

I may be wrong, but I thought it’s that you can’t have any fields with non trivial destructors, because the union wouldn’t know which destructor to call. So POD types / raw pointers / arrays of the above / structs of the above are allowed, and that’s pretty much it.

umanwizard · on March 14, 2023

Yes, it’s very common. The difference is that Rust enforces via the type system that you match on it.