Aligned memory management?

dsimcha · Feb 21, 2011 · Viewed 17.5k times

I have a few related questions about managing aligned memory blocks. Cross-platform answers would be ideal. However, as I'm pretty sure a cross-platform solution does not exist, I'm mainly interested in Windows and Linux and to a (much) lesser extent Mac OS and FreeBSD.

  1. What's the best way of getting a chunk of memory aligned on 16-byte boundaries? (I'm aware of the trivial method of using malloc(), allocating a little extra space and then bumping the pointer up to a properly aligned value. I'm hoping for something a little less kludge-y, though. Also, see below for additional issues.)

  2. If I use plain old malloc(), allocate extra space, and then move the pointer up to where it would be correctly aligned, is it necessary to keep the pointer to the beginning of the block around for freeing? (Calling free() on pointers to the middle of the block seems to work in practice on Windows, but I'm wondering what the standard says and, even if the standard says you can't, whether it works in practice on all major OS's. I don't care about obscure DS9K-like OS's.)

  3. This is the hard/interesting part. What's the best way to reallocate a memory block while preserving alignment? Ideally this would be something more intelligent than calling malloc(), copying, and then calling free() on the old block. I'd like to do it in place where possible.

Answer

paxdiablo · Feb 21, 2011
  1. If your implementation has a standard data type that needs 16-byte alignment (long double on some platforms, for example), malloc already guarantees that your returned blocks will be aligned correctly. Section 7.20.3 of C99 states: "The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object."

  2. You have to pass back the exact same address into free as you were given by malloc. No exceptions. So yes, you need to keep the original copy.

  3. See (1) above if you already have a 16-byte-alignment-required type.

Beyond that, you may well find that your malloc implementation gives you 16-byte-aligned addresses anyway for efficiency although it's not guaranteed by the standard. If you require it, you can always implement your own allocator.
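
If you want to see what your own platform actually does, a quick check along these lines may help. This is only a sketch: is_aligned and malloc_guaranteed_alignment are names made up for illustration, and max_align_t / _Alignof come from C11 rather than the C99 text quoted above, so it assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>   // max_align_t (C11)
#include <stdint.h>   // uintptr_t
#include <stdlib.h>

// Hypothetical helper: nonzero if p is a multiple of align.
// align must be a power of two. The pointer-to-integer conversion is
// implementation-defined, but behaves as expected on flat-address platforms.
int is_aligned(const void *p, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

// Hypothetical helper: the strictest fundamental alignment, which is what
// the standard promises for malloc results (assumes C11).
size_t malloc_guaranteed_alignment(void) {
    return _Alignof(max_align_t);
}
```

If malloc_guaranteed_alignment() reports 16 or more on a given platform, plain malloc is already enough there. If it reports 8, a 16-byte-aligned result from malloc is luck rather than a guarantee.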

Myself, I'd implement a malloc16 layer on top of malloc that would use the following structure:

some padding for alignment (0-15 bytes)
size of padding (1 byte)
16-byte-aligned area

Then have your malloc16() function call malloc to get a block 16 bytes larger than requested, figure out where the aligned area should be, put the padding length just before that and return the address of the aligned area.

For free16, you would simply look at the byte before the address given to get the padding length, work out the actual address of the malloc'ed block from that, and pass that to free.

This is untested but should be a good start:

#include <stdint.h>                             // uintptr_t
#include <stdlib.h>                             // malloc, free

void *malloc16 (size_t s) {
    unsigned char *p;
    unsigned char *porig = malloc (s + 0x10);   // allocate extra
    if (porig == NULL) return NULL;             // catch out of memory
    p = (unsigned char *)(((uintptr_t)porig + 16) & ~(uintptr_t)0xf);  // insert padding
    *(p-1) = (unsigned char)(p - porig);        // store padding size (1 to 16)
    return p;
}

void free16(void *p) {
    unsigned char *porig = p;                   // work out original
    if (porig == NULL) return;                  // like free(NULL), do nothing
    porig = porig - *(porig-1);                 // by subtracting padding
    free (porig);                               // then free that
}

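The magic line in malloc16 is the one that computes p: it adds 16 to the address and then sets the lower 4 bits to 0, in effect rounding it down to the nearest 16-byte boundary. The +16 guarantees the result is past the actual start of the malloc'ed block, so there is always at least one byte in front of it to hold the padding length. Since you can't apply & to a pointer directly, the address has to go through an integer type such as uintptr_t first.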
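
To make the arithmetic concrete, here is the same computation on plain integer addresses. It is a sketch for illustration only: align_past and padding_for are hypothetical names, and the alignment is a parameter (assumed to be a power of two) rather than a hard-coded 16.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: add one full alignment unit, then clear the low bits.
// With align == 16 this is the "add 16, mask with ~0xf" step from malloc16.
uintptr_t align_past(uintptr_t addr, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);  // power of two only
    return (addr + align) & ~(align - 1);
}

// Hypothetical helper: the value malloc16 would store in the byte just
// before the aligned address (always between 1 and align inclusive).
unsigned padding_for(uintptr_t addr, uintptr_t align) {
    return (unsigned)(align_past(addr, align) - addr);
}
```

With an alignment of 16, a block starting at 0x1000 yields 0x1010 with a padding of 16, a block starting at 0x1001 yields 0x1010 with a padding of 15, and a block starting at 0x100f yields 0x1010 with a padding of 1. Note that an already-aligned block still gives up a full 16 bytes, because the length byte has to live somewhere.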

Now, I don't claim that the code above is anything but kludgey. You would have to test it on the platforms of interest to see if it's workable. Its main advantage is that it abstracts away the ugly bit so that you never have to worry about it.
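
For the reallocation question, one possible extension of the same layer is sketched below. realloc16 and place16 are not part of the code above; they are hypothetical names, and the sketch restates malloc16/free16 only so that it compiles on its own. The idea is to let realloc resize the underlying block (in place when the allocator can manage it) and then shift the data only if the block moved to an address with a different offset from a 16-byte boundary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Layout, as above: [0-15 pad bytes][1 length byte][16-byte-aligned data].
static unsigned char *place16(unsigned char *raw) {
    unsigned char *p = (unsigned char *)(((uintptr_t)raw + 16) & ~(uintptr_t)0xf);
    p[-1] = (unsigned char)(p - raw);            // padding length, 1 to 16
    return p;
}

void *malloc16(size_t s) {
    if (s > SIZE_MAX - 16) return NULL;          // s + 16 would overflow
    unsigned char *raw = malloc(s + 16);
    return raw ? place16(raw) : NULL;
}

void free16(void *p) {
    unsigned char *a = p;
    if (a != NULL) free(a - a[-1]);
}

// Hypothetical realloc16: same contract as realloc, but the result is
// 16-byte aligned. On failure the old block is left untouched.
void *realloc16(void *p, size_t new_size) {
    if (p == NULL) return malloc16(new_size);
    if (new_size > SIZE_MAX - 16) return NULL;

    unsigned char *old = p;
    size_t old_pad = old[-1];
    assert(old_pad >= 1 && old_pad <= 16);       // sanity check on the length byte

    unsigned char *raw = realloc(old - old_pad, new_size + 16);
    if (raw == NULL) return NULL;

    unsigned char *moved = raw + old_pad;        // where realloc left the data
    size_t new_pad = 16 - ((uintptr_t)raw & 0xf);
    unsigned char *np = raw + new_pad;           // where the data must end up
    if (np != moved) memmove(np, moved, new_size);
    np[-1] = (unsigned char)new_pad;             // write after the move, not before
    return np;
}
```

Whether the resize really happens in place is up to the underlying realloc; the wrapper only avoids an extra copy of its own when the padding is unchanged. Like the code above, treat it as a starting point to test on the platforms you care about.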