What does "Misaligned address error" mean?

Max Yankov · Feb 25, 2015

First of all, sorry for the specifics. I generally try to boil my SO questions down to a generic "class A" example with only the relevant parts, but I'm not sure what the source of the problem is here.

I have a matrix class template that looks like this (only showing what I think are the relevant parts):

template <std::size_t R, std::size_t C>
class Matrix
{
private:
    //const int rows, cols;
    std::array<std::array<float,C>,R> m;
public:
    inline std::array<float,C>& operator[](const int i)
    {
        return m[i];
    }

    const std::array<float,C> operator[](const int i) const
    {
        return m[i];
    }

    template<std::size_t N>
    Matrix<R,N> operator *(const Matrix<C,N> a) const
    {
        Matrix<R,N> result = Matrix<R,N>();
        // irrelevant calculation
        return result;
    }
    // ... other very similar stuff, I'm not sure that it's relevant
};

template <std::size_t S>
Matrix<S,S> identity()
{
    Matrix<S,S> matrix = Matrix<S,S>();

    for(std::size_t x = 0; x < S; x++)
    {
        for(std::size_t y = 0; y < S; y++)
        {
            if (x == y)
            {
                matrix[x][y] = 1.f;
            }
        }
    }

    return matrix;
}

I unit tested the whole class, and both the multiplication and the identity factory seem to work fine. However, I then use it in this method, which gets called many times (if you have ever written a renderer, it's pretty obvious what I'm trying to do here):

Vec3i Renderer::world_to_screen_space(Vec3f v)
{
    Matrix<4,1> vm = v2m(v);
    Matrix<4,4> projection = identity<4>(); // If I change this to Matrix<4,4>(), the error doesn't happen
    projection[3][2] = -1.f;
    vm = projection * vm;
    Vec3f r = m2v(vm);
    return Vec3i(
            (r.x + 1.) * (width / 2.),
            (r.y + 1.) * (height / 2.),
            r.z
        );
}

And after some amount of time and some amount of random calls to this method, I get this:

Job 1, 'and ./bin/main' terminated by signal SIGBUS (Misaligned address error)

However, if I change the line identity<4>() to Matrix<4,4>() the error doesn't happen. I'm new to C++, so it must be something really stupid.

So, (1) what does this error mean and (2) how did I manage to shoot myself in the foot?

Update: and of course, this bug won't reproduce in the LLDB debugger.

Update 2: here's what I got after running the program through Valgrind:

==66525== Invalid read of size 4
==66525==    at 0x1000148D5: Renderer::draw_triangle(Vec3<float>, Vec3<float>, Vec3<float>, Vec2<int>, Vec2<int>, Vec2<int>, Model, float) (in ./bin/main)

And draw_triangle is exactly the method that calls world_to_screen_space and uses its result.

Update 3: I discovered the source of the problem, and it wasn't anything related to this code — and it was something pretty obvious, too. Really not sure what to do about this question now.

Answer

shipr · Feb 25, 2015

Without a processor that checks for misalignment (as @twalberg says), it is impossible for me to run and validate the code. But I can say this: it is a common bug in C++ and other libraries to report one type of exception as another.

My guess -- sorry I can't do more -- is that you are creating allocations that are getting lost, using up your available memory, then overflowing the memory space. The very uncommon exception thrown when you exceed the available memory is probably unexpected and getting returned as a misalignment error. Try checking the memory usage as you run, to determine whether this might be the case.

EDIT:

My guess was wrong, and the valgrind output shows that the Misaligned address error was correct. Running that was a good idea. The clear indication is that there is a bug at a lower level than in your code, so my original idea is almost certainly correct: there is a bug that is not in your code, but is being masked.
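For reference, a misaligned access means reading or writing an object through an address that is not a multiple of that type's alignment. Below is a minimal sketch of how one might check for that and read a value safely. The helper names `is_aligned_for` and `read_float_unaligned` are hypothetical and not from your code, and the pointer-to-integer check assumes a mainstream platform with flat addresses. It cannot show whether this is what actually triggered the SIGBUS in your program.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p is a multiple of alignof(T).
// Assumes the integer value of a pointer reflects its address (true on common platforms).
template <typename T>
bool is_aligned_for(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: reads a float from an arbitrary byte address.
// std::memcpy copies byte by byte, so it does not require the source to be aligned,
// unlike dereferencing a reinterpret_cast<const float*>(bytes).
float read_float_unaligned(const unsigned char* bytes)
{
    float value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

Calling `is_aligned_for<float>(ptr)` just before a suspicious read is one cheap way to confirm or rule out alignment as the cause.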

Note that it seems there is a difference between the identity() constructor and the Matrix<,> constructor in that the former is initialized along the diagonal (slowly: better would be to eliminate the inner loop) and the latter is not. That might affect the behavior of v2m and m2v.
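To illustrate the remark about the inner loop, here is a sketch of identity() with a single loop. The `Matrix` below is a cut-down stand-in for the class in the question (storage and operator[] only), and the `{}` initializer on `m` is my addition, so that the zero-initialization does not depend on how the object is constructed.

```cpp
#include <array>
#include <cstddef>

// Stand-in for the question's Matrix: only storage and element access are reproduced.
template <std::size_t R, std::size_t C>
class Matrix
{
private:
    std::array<std::array<float, C>, R> m{}; // every element starts at 0
public:
    std::array<float, C>& operator[](std::size_t i) { return m[i]; }
    const std::array<float, C>& operator[](std::size_t i) const { return m[i]; }
};

// Only the diagonal differs from zero, so one loop over S is enough.
template <std::size_t S>
Matrix<S, S> identity()
{
    Matrix<S, S> matrix; // all zeros thanks to the initializer on m
    for (std::size_t i = 0; i < S; i++)
    {
        matrix[i][i] = 1.f;
    }
    return matrix;
}
```

This does S assignments instead of S*S comparisons, and gives the same result as the two-loop version.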