best cross-platform method to get aligned memory

user2088790 · May 4, 2013

Here is the code I normally use to get aligned memory with Visual Studio and GCC

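#include <stdlib.h>   // posix_memalign, free
#ifdef _MSC_VER
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Returns size bytes aligned to align (a power of two), or NULL on failure.
inline void* aligned_malloc(size_t size, size_t align) {
    void *result;
#ifdef _MSC_VER
    result = _aligned_malloc(size, align);
#else
    // posix_memalign returns nonzero on failure
    if (posix_memalign(&result, align, size)) result = NULL;
#endif
    return result;
}

// Memory from aligned_malloc must be released with aligned_free.
inline void aligned_free(void *ptr) {
#ifdef _MSC_VER
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}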

Is this code fine in general? I have also seen people use _mm_malloc and _mm_free. In most cases I want aligned memory for SSE/AVX code. Can I use those functions in general? It would make my code a lot simpler.

Lastly, it's easy to create my own function to align memory (see below). Why then are there so many different common functions to get aligned memory (many of which only work on one platform)?

This code does 16-byte alignment.

// requires <stdint.h> for uintptr_t and <stdlib.h> for malloc/free
float* array = (float*)malloc(SIZE*sizeof(float)+15);

// find the aligned position
// and use this pointer to read or write data into array
// (uintptr_t, not unsigned long, which is only 32 bits on 64-bit Windows)
float* alignedArray = (float*)(((uintptr_t)array + 15) & ~(uintptr_t)0x0F);

// deallocate the original "array", NOT alignedArray
free(array);
array = alignedArray = 0;

See: http://www.songho.ca/misc/alignment/dataalign.html and "How to allocate aligned memory only using the standard library?"
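The manual approach becomes easier to use if the raw malloc pointer is stored just in front of the aligned block, so the caller only has to keep one pointer. The following is a minimal sketch of that idea, not taken from either link; the names portable_aligned_malloc and portable_aligned_free are hypothetical, and align is assumed to be a power of two.

```c
#include <stdint.h>  /* uintptr_t, SIZE_MAX */
#include <stdlib.h>  /* malloc, free */

/* Hypothetical helper: returns size bytes aligned to align
   (a power of two), or NULL on failure. */
void *portable_aligned_malloc(size_t size, size_t align) {
    /* Reject zero and non-power-of-two alignments. */
    if (align == 0 || (align & (align - 1)) != 0) return NULL;

    /* Extra room: worst-case padding plus one slot for the raw pointer. */
    size_t extra = align - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra) return NULL;  /* overflow guard */

    void *raw = malloc(size + extra);
    if (raw == NULL) return NULL;

    /* Skip the bookkeeping slot, then round up to a multiple of align. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);

    /* Remember the raw pointer immediately before the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Hypothetical counterpart: recovers the raw pointer and frees it. */
void portable_aligned_free(void *ptr) {
    if (ptr != NULL) free(((void **)ptr)[-1]);
}
```

A pointer returned by this sketch must go to portable_aligned_free, never directly to free, which is the same restriction that _aligned_malloc and _mm_malloc impose.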

Edit: In case anyone cares, I got the idea for my aligned_malloc() function from Eigen (Eigen/src/Core/util/Memory.h)

Edit: I just discovered that posix_memalign is undefined for MinGW. However, _mm_malloc works for Visual Studio 2012, GCC, MinGW, and the Intel C++ compiler so it seems to be the most convenient solution in general. It also requires using its own _mm_free function, although on some implementations you can pass pointers from _mm_malloc to the standard free / delete.

Answer

As long as you're OK with having to call a special function to do the freeing, your approach is fine. I would order your #ifdefs the other way around, though: start with the standards-specified options and fall back to platform-specific ones. For example:

  1. If __STDC_VERSION__ >= 201112L use aligned_alloc.
  2. If _POSIX_VERSION >= 200112L use posix_memalign.
  3. If _MSC_VER is defined, use the Windows functions (_aligned_malloc / _aligned_free).
  4. ...
  5. If all else fails, just use malloc/free and disable SSE/AVX code.
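The ordering above could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the names xaligned_malloc, xaligned_free, XALIGN_BACKEND and XALIGN_HAVE_ALIGNED are hypothetical, step 4 is left out, and the C11 branch skips MSVC because Microsoft's C runtime does not provide aligned_alloc.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(__unix__) || defined(__APPLE__)
#include <unistd.h>  /* defines _POSIX_VERSION on POSIX systems */
#endif
#if defined(_MSC_VER)
#include <malloc.h>  /* _aligned_malloc, _aligned_free */
#endif

/* Pick a backend, standards first. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && !defined(_MSC_VER)
#define XALIGN_BACKEND 1  /* C11 aligned_alloc */
#elif defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
#define XALIGN_BACKEND 2  /* posix_memalign; strict -std=c99 builds may
                             also need -D_POSIX_C_SOURCE=200112L */
#elif defined(_MSC_VER)
#define XALIGN_BACKEND 3  /* _aligned_malloc / _aligned_free */
#else
#define XALIGN_BACKEND 0  /* plain malloc, no alignment guarantee */
#endif

/* Callers can test this to disable their SSE/AVX code paths. */
#define XALIGN_HAVE_ALIGNED (XALIGN_BACKEND != 0)

/* align is assumed to be a power of two; returns NULL on failure. */
void *xaligned_malloc(size_t size, size_t align) {
    if (align == 0 || (align & (align - 1)) != 0) return NULL;
#if XALIGN_BACKEND == 1
    /* C11 asks for size to be a multiple of align, so round up. */
    size_t rounded = (size + align - 1) & ~(align - 1);
    if (rounded < size) return NULL;  /* overflow */
    return aligned_alloc(align, rounded);
#elif XALIGN_BACKEND == 2
    /* Fails (returns NULL here) if align is not a multiple of sizeof(void *). */
    void *p = NULL;
    return posix_memalign(&p, align, size) == 0 ? p : NULL;
#elif XALIGN_BACKEND == 3
    return _aligned_malloc(size, align);
#else
    return malloc(size);
#endif
}

void xaligned_free(void *ptr) {
#if XALIGN_BACKEND == 3
    _aligned_free(ptr);  /* the only backend that cannot use free */
#else
    free(ptr);
#endif
}
```

Only the Windows branch needs the dedicated free function; keeping xaligned_free as a wrapper everywhere lets calling code stay identical across platforms.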

The problem is harder if you want to be able to pass the allocated pointer to free: that is valid for all the standard interfaces, but not for _aligned_malloc on Windows, and not necessarily for the legacy memalign function that some Unix-like systems have.