Using unions to simplify casts

Steven Lu picture Steven Lu · May 20, 2010 · Viewed 7.3k times · Source

I realize that what I am trying to do isn't safe. But I am just doing some testing and image processing so my focus here is on speed.

Right now this code gives me the corresponding bytes for a 32-bit pixel value type.

struct Pixel {
    unsigned char b,g,r,a;
};

I wanted to check if I have a pixel that is under a certain value (e.g. r, g, b <= 0x10). I figured I wanted to just conditional-test the bit-and of the bits of the pixel with 0x00E0E0E0 (I could have wrong endianness here) to get the dark pixels.

Rather than using this ugly mess (*((uint32_t*)&pixel)) to get the 32-bit unsigned int value, i figured there should be a way for me to set it up so I can just use pixel.i, while keeping the ability to reference the green byte using pixel.g.

Can I do this? This won't work:

struct Pixel {
    unsigned char b,g,r,a;
};
union Pixel_u {
    Pixel p;
    uint32_t bits;
};

I would need to edit my existing code to say pixel.p.g to get the green color byte. Same happens if I do this:

union Pixel {
    unsigned char c[4];
    uint32_t bits;
};

This would work too but I still need to change everything to index into c, which is a bit ugly but I can make it work with a macro if i really needed to.

Answer

Joseph Quinsey picture Joseph Quinsey · May 20, 2010

(Edited) Both gcc and MSVC allow 'anonymous' structs/unions, which might solve your problem. For example:

union Pixel {
   struct {unsigned char b,g,r,a;};
   uint32_t bits;  // use 'unsigned' for MSVC
}

foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);

gives (on Intel):

04030201

This requires changing all your declarations of struct Pixel to union Pixel in your original code. But this defect can be fixed via:

struct Pixel {
    union {
        struct {unsigned char b,g,r,a;};
        uint32_t bits; 
    };
} foo;

foo.b = 1;
foo.g = 2;
foo.r = 3;
foo.a = 4;
printf ("%08x\n", foo.bits);

This also works with VC9, with 'warning C4201: nonstandard extension used : nameless struct/union'. Microsoft uses this trick, for example, in:

typedef union {
    struct {
        DWORD LowPart;
        LONG HighPart;
    };  // <-- nameless member!
    struct {
        DWORD LowPart;
        LONG HighPart;
    } u;
    LONGLONG QuadPart;
} LARGE_INTEGER;

but they 'cheat' by suppressing the unwanted warning.

While the above examples are ok, if you use this technique too often, you'll quickly end up with unmaintainable code. Five suggestions to make things clearer:

(1) Change the name bits to something uglier like union_bits, to clearly indicate something out-of-the-ordinary.

(2) Go back to the ugly cast the OP rejected, but hide its ugliness in a macro or in an inline function, as in:

#define BITS(x) (*(uint32_t*)&(x))

But this would break the strict aliasing rules. (See, for example, AndreyT's answer: C99 strict aliasing rules in C++ (GCC).)

(3) Keep the original definiton of Pixel, but do a better cast:

struct Pixel {unsigned char b,g,r,a;} foo;
// ...
printf("%08x\n", ((union {struct Pixel dummy; uint32_t bits;})foo).bits);

(4) But that is even uglier. You can fix this by a typedef:

struct Pixel {unsigned char b,g,r,a;} foo;
typedef union {struct Pixel dummy; uint32_t bits;} CastPixelToBits;
// ...
printf("%08x\n", ((CastPixelToBits)foo).bits);    // not VC9

With VC9, or with gcc using -pedantic, you'll need (don't use this with gcc--see note at end):

printf("%08x\n", ((CastPixelToBits*)&foo)->bits); // VC9 (not gcc)

(5) A macro may perhaps be preferred. In gcc, you can define a union cast to any given type very neatly:

#define CAST(type, x) (((union {typeof(x) src; type dst;})(x)).dst)   // gcc
// ...
printf("%08x\n", CAST(uint32_t, foo));

With VC9 and other compilers, there is no typeof, and pointers may be needed (don't use this with gcc--see note at end):

#define CAST(typeof_x, type, x) (((union {typeof_x src; type dst;}*)&(x))->dst)

Self-documenting, and safer. And not too ugly. All these suggestions are likely to compile to identical code, so efficiency is not an issue. See also my related answer: How to format a function pointer?.

Warning about gcc: The GCC Manual version 4.3.4 (but not version 4.3.0) states that this last example, with &(x), is undefined behaviour. See http://davmac.wordpress.com/2010/01/08/gcc-strict-aliasing-c99/ and http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html.