I have two structs with arrays of bytes and booleans:
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct struct1
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public byte[] values;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct struct2
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public bool[] values;
}
And the following code:
class main
{
    public static void Main()
    {
        Console.WriteLine("sizeof array of bytes: " + Marshal.SizeOf(typeof(struct1)));
        Console.WriteLine("sizeof array of bools: " + Marshal.SizeOf(typeof(struct2)));
        Console.ReadKey();
    }
}
That gives me the following output:
sizeof array of bytes: 3
sizeof array of bools: 12
It seems that a boolean takes 4 bytes of storage. Ideally a boolean would only take one bit (false or true, 0 or 1, etc.).
What is happening here? Is the boolean type really so inefficient?
The bool type has a checkered history, with many incompatible choices between language runtimes. It started with a historical design choice made by Dennis Ritchie, the inventor of the C language. C did not have a bool type; the alternative was int, where a value of 0 represents false and any other value is considered true.
This choice was carried forward into the Winapi, the primary reason to use pinvoke: it has a typedef for BOOL, which is an alias for the C compiler's int keyword. If you don't apply an explicit [MarshalAs] attribute, a C# bool is converted to a BOOL, producing a field that is 4 bytes long.
Whatever you do, your struct declaration needs to match the runtime choice made by the language you interop with. As noted, that is BOOL for the winapi, but most C++ implementations chose a single byte, and most COM Automation interop uses VARIANT_BOOL, which is a short.
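For illustration, here is a minimal sketch (the struct names NativeCppBools, WinapiBools and ComBools are mine) of how ArraySubType selects the native representation for the array from the question; the sizes in the comments are what I would expect from the desktop marshaller, not guaranteed values.

using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct NativeCppBools   // native side is an array of C++ bool, 1 byte each
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3, ArraySubType = UnmanagedType.I1)]
    public bool[] values;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct WinapiBools      // native side is an array of Win32 BOOL, 4 bytes each
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3, ArraySubType = UnmanagedType.Bool)]
    public bool[] values;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct ComBools         // native side is an array of VARIANT_BOOL, 2 bytes each
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3, ArraySubType = UnmanagedType.VariantBool)]
    public bool[] values;
}

class MarshalAsDemo
{
    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf(typeof(NativeCppBools)));  // expect 3
        Console.WriteLine(Marshal.SizeOf(typeof(WinapiBools)));     // expect 12
        Console.WriteLine(Marshal.SizeOf(typeof(ComBools)));        // expect 6
    }
}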
The actual size of a C# bool is one byte. A strong design goal of the CLR is that you cannot find out: layout is an implementation detail that depends too much on the processor. Processors are very picky about variable types and alignment; wrong choices can significantly affect performance and cause runtime errors. By making the layout undiscoverable, .NET can provide a universal type system that does not depend on the actual runtime implementation.
In other words, you always have to marshal a structure at runtime to nail down the layout. That is when the conversion from the internal layout to the interop layout is made. It can be very fast if the layouts are identical, and slow when fields need to be rearranged, since that always requires creating a copy of the struct. The technical term for the fast case is blittable: passing a blittable struct to native code is fast because the pinvoke marshaller can simply pass a pointer.
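One way to see the difference (a sketch, all names mine): pinning an array only succeeds when the element type is blittable, because pinning hands out a raw pointer; for anything else the marshaller must convert a copy, which typically shows up as an ArgumentException from GCHandle.Alloc.

using System;
using System.Runtime.InteropServices;

struct Blittable    { public int A;  public int B; }   // managed and native layout match
struct NonBlittable { public bool A; public int B; }   // the bool forces a conversion

class BlittableDemo
{
    static void TryPin<T>(T[] array)
    {
        try
        {
            // Pinning hands native code a raw pointer, so it is only allowed
            // for blittable element types.
            GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
            Console.WriteLine(typeof(T).Name + ": blittable, can be passed as a pointer");
            handle.Free();
        }
        catch (ArgumentException)
        {
            Console.WriteLine(typeof(T).Name + ": not blittable, the marshaller must copy");
        }
    }

    public static void Main()
    {
        TryPin(new Blittable[4]);       // pins fine
        TryPin(new NonBlittable[4]);    // throws, so the catch branch runs
    }
}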
Performance is also the core reason why a bool is not a single bit. Few processors make a bit directly addressable; the smallest addressable unit is a byte. An extra instruction is required to fish the bit out of the byte, and that doesn't come for free. Nor is it ever atomic.
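To make that cost concrete, here is a small sketch of packing flags into a byte by hand; the BitArray class does the same masking for you, trading those extra instructions for one bit of storage per value.

using System;
using System.Collections;

class BitPackingDemo
{
    public static void Main()
    {
        // Reading one flag back costs a shift and an AND on top of the byte
        // load; a plain bool never needs those extra instructions.
        byte flags = 0;
        flags |= 1 << 3;                         // set flag number 3
        bool flag3 = (flags & (1 << 3)) != 0;    // fish it back out of the byte
        Console.WriteLine(flag3);                // True

        // System.Collections.BitArray stores one bit per value and does the
        // masking internally on every access.
        var bits = new BitArray(8);
        bits[3] = true;
        Console.WriteLine(bits[3]);              // True
    }
}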
The C# compiler isn't otherwise shy about telling you that it takes 1 byte: use sizeof(bool). Even that is not a great predictor of how many bytes a field takes at runtime. The CLR also has to implement the .NET memory model, which promises that simple variable updates are atomic. That requires variables to be properly aligned in memory so the processor can update them with a single memory-bus cycle. Quite often a bool actually occupies 4 or 8 bytes in memory because of this: extra padding added to ensure that the next member is aligned properly.
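A minimal sketch putting the two numbers side by side; the 4 comes from the default conversion to BOOL described above.

using System;
using System.Runtime.InteropServices;

class SizeDemo
{
    public static void Main()
    {
        // Size of the managed type as the compiler sees it.
        Console.WriteLine(sizeof(bool));                  // 1

        // Size after marshaling with the default conversion to Win32 BOOL.
        Console.WriteLine(Marshal.SizeOf(typeof(bool)));  // 4
    }
}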
The CLR actually takes advantage of layout being undiscoverable: it can optimize the layout of a class and rearrange the fields so the padding is minimized. Say you have a class with a bool + int + bool member sequence. Laid out in that order it would take 1 + (3) + 4 + 1 + (3) bytes of memory, where (3) is padding, for a total of 12 bytes and 50% waste. Automatic layout rearranges that to 1 + 1 + (2) + 4 = 8 bytes. Only classes get automatic layout; structs have sequential layout by default.
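A hedged sketch of that bool + int + bool arithmetic, using Unsafe.SizeOf (from System.Runtime.CompilerServices; in-box on modern .NET, a NuGet package on the desktop Framework) to peek at the managed size. It typically prints 12 for the sequential version and 8 for the auto version, but the exact numbers are an implementation detail, which is the whole point.

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct SequentialBools { public bool A; public int B; public bool C; }  // 1 + (3) + 4 + 1 + (3)

[StructLayout(LayoutKind.Auto)]
struct AutoBools       { public bool A; public int B; public bool C; }  // 4 + 1 + 1 + (2)

class LayoutDemo
{
    public static void Main()
    {
        // Managed in-memory sizes, padding included.
        Console.WriteLine(Unsafe.SizeOf<SequentialBools>());  // typically 12
        Console.WriteLine(Unsafe.SizeOf<AutoBools>());        // typically 8
    }
}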
More bleakly, a bool can require as many as 32 bytes in a C++ program compiled with a modern C++ compiler that supports the AVX instruction set. AVX imposes a 32-byte alignment requirement, so the bool variable may end up with 31 bytes of padding. This is also the core reason why the .NET jitter does not emit SIMD instructions unless they are explicitly wrapped: it cannot get the alignment guarantee.