Unions and type-punning

Matthew Wilkins picture Matthew Wilkins · Sep 4, 2014 · Viewed 20k times · Source

I've been searching for a while, but can't find a clear answer.

Lots of people say that using unions to type-pun is undefined and bad practice. Why is this? I can't see any reason why it would do anything undefined considering the memory you write the original information to isn't going to just change of its own accord (unless it goes out of scope on the stack, but that's not a union issue, that would be bad design).

People quote the strict aliasing rule, but that seems to me to be like saying you can't do it because you can't do it.

Also what is the point of a union if not to type pun? I saw somewhere that they are supposed to be used to use the same memory location for different information at different times, but why not just delete the info before using it again?

To summarise:

  1. Why is it bad to use unions for type punning?
  2. What it the point of them if not this?

Extra information: I'm using mainly C++, but would like to know about that and C. Specifically I'm using unions to convert between floats and the raw hex to send via CAN bus.

Answer

Christoph picture Christoph · Sep 4, 2014

To re-iterate, type-punning through unions is perfectly fine in C (but not in C++). In contrast, using pointer casts to do so violates C99 strict aliasing and is problematic because different types may have different alignment requirements and you could raise a SIGBUS if you do it wrong. With unions, this is never a problem.

The relevant quotes from the C standards are:

C89 section 3.3.2.3 §5:

if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined

C11 section 6.5.2.3 §3:

A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member

with the following footnote 95:

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

This should be perfectly clear.


James is confused because C11 section 6.7.2.1 §16 reads

The value of at most one of the members can be stored in a union object at any time.

This seems contradictory, but it is not: In contrast to C++, in C, there is no concept of active member and it's perfectly fine to access the single stored value through an expression of an incompatible type.

See also C11 annex J.1 §1:

The values of bytes that correspond to union members other than the one last stored into [are unspecified].

In C99, this used to read

The value of a union member other than the last one stored into [is unspecified]

This was incorrect. As the annex isn't normative, it did not rate its own TC and had to wait until the next standard revision to get fixed.


GNU extensions to standard C++ (and to C90) do explicitly allow type-punning with unions. Other compilers that don't support GNU extensions may also support union type-punning, but it's not part of the base language standard.