Implementing a complex rotation-based camera

David · Jul 13, 2012 · Viewed 7.3k times

I am implementing a 3D engine for spatial visualisation, and am writing a camera with the following navigation features:

  • Rotate the camera (ie, analogous to rotating your head)
  • Rotate around an arbitrary 3D point (a point in space, which is probably not in the center of the screen; the camera needs to rotate around this keeping the same relative look direction, ie the look direction changes too. This does not look directly at the chosen rotation point)
  • Pan in the camera's plane (so move up/down or left/right in the plane orthogonal to the camera's look vector)

The camera is not supposed to roll - that is, 'up' remains up. Because of this I represent the camera with a location and two angles, rotations around the X and Y axes (Z would be roll.) The view matrix is then recalculated using the camera location and these two angles. This works great for pan and rotating the eye, but not for rotating around an arbitrary point. Instead I get the following behaviour:

  • The eye itself apparently moving further up or down than it should
  • The eye not moving up or down at all when m_dRotationX is 0 or pi. (Gimbal lock? How can I avoid this?)
  • The eye's rotation being inverted (changing the rotation makes it look further up when it should look further down, down when it should look further up) when m_dRotationX is between pi and 2pi.

(a) What is causing this 'drift' in rotation?

This may be gimbal lock. If so, the standard answer to this is 'use quaternions to represent rotation', said many times here on SO (1, 2, 3 for example), but unfortunately without concrete details (example; this is the best answer I've found so far, and such answers are rare). I've struggled to implement a camera using quaternions that combines the above two types of rotation. I am, in fact, building a quaternion using the two rotations, but a commenter below said there was no reason to do so - it's fine to build the matrix immediately.

This occurs when changing the X and Y rotations (which represent the camera look direction) when rotating around a point, but does not occur simply when directly changing the rotations, i.e. rotating the camera around itself. To me, this doesn't make sense. It's the same values.

(b) Would a different approach (quaternions, for example) be better for this camera? If so, how do I implement all three camera navigation features above?

If a different approach would be better, then please consider providing a concrete implemented example of that approach. (I am using DirectX9 and C++, and the D3DX* library the SDK provides.) In this second case, I will add and award a bounty in a couple of days when I can add one to the question. This might sound like I'm jumping the gun, but I'm low on time and need to implement or solve this quickly (this is a commercial project with a tight deadline.) A detailed answer will also improve the SO archives, because most camera answers I've read so far are light on code.

Thanks for your help :)


Some clarifications

Thanks for the comments and answer so far! I'll try to clarify a few things about the problem:

  • The view matrix is recalculated from the camera position and the two angles whenever one of those things changes. The matrix itself is never accumulated (i.e. updated) - it is recalculated afresh. However, the camera position and the two angle variables are accumulated (whenever the mouse moves, for example, one or both of the angles will have a small amount added or subtracted, based on the number of pixels the mouse moved up-down and/or left-right onscreen.)

  • Commenter JCooper states I'm suffering from gimbal lock, and I need to:

add another rotation onto your transform that rotates the eyePos to be completely in the y-z plane before you apply the transformation, and then another rotation that moves it back afterward. Rotate around the y axis by the following angle immediately before and after applying the yaw-pitch-roll matrix (one of the angles will need to be negated; trying it out is the fastest way to decide which). double fixAngle = atan2(oEyeTranslated.z,oEyeTranslated.x);

Unfortunately, when implementing this as described, my eye shoots off above the scene at a very fast rate due to one of the rotations. I'm sure my code is simply a bad implementation of this description, but I still need something more concrete. In general, I find unspecific text descriptions of algorithms are less useful than commented, explained implementations. I am adding a bounty for a concrete, working example that integrates with the code below (i.e. with the other navigation methods, too.) This is because I would like to understand the solution, as well as have something that works, and because I need to implement something that works quickly since I am on a tight deadline.

Please, if you answer with a text description of the algorithm, make sure it is detailed enough to implement ('Rotate around Y, then transform, then rotate back' may make sense to you but lacks the details to know what you mean). Good answers are clear, signposted, will allow others to understand even with a different basis, and are 'solid weatherproof information boards.'

In turn, I have tried to be clear describing the problem, and if I can make it clearer please let me know.


My current code

To implement the above three navigation features, the mouse move event handler moves the camera based on the number of pixels the cursor has moved:

// Adjust this to change rotation speed when dragging (units are radians per pixel mouse moves)
// This is both rotating the eye, and rotating around a point
static const double dRotatePixelScale = 0.001;
// Adjust this to change pan speed (units are meters per pixel mouse moves)
static const double dPanPixelScale = 0.15;

switch (m_eCurrentNavigation) {
    case ENavigation::eRotatePoint: {
        // Rotating around m_oRotateAroundPos
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;

        // To rotate around the point, translate so the point is at (0,0,0) (this makes the point
        // the origin so the eye rotates around the origin), rotate, translate back
        // However, the camera is represented as an eye plus two (X and Y) rotation angles
        // This needs to keep the same relative rotation.

        // Rotate the eye around the point
        const D3DXVECTOR3 oEyeTranslated = m_oEyePos - m_oRotateAroundPos;
        D3DXMATRIX oRotationMatrix;
        D3DXMatrixRotationYawPitchRoll(&oRotationMatrix, dX, dY, 0.0);
        D3DXVECTOR4 oEyeRotated;
        D3DXVec3Transform(&oEyeRotated, &oEyeTranslated, &oRotationMatrix);
        m_oEyePos = D3DXVECTOR3(oEyeRotated.x, oEyeRotated.y, oEyeRotated.z) + m_oRotateAroundPos;

        // Increment rotation to keep the same relative look angles
        RotateXAxis(dX);
        RotateYAxis(dY);
        break;
    }
    case ENavigation::ePanPlane: {
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dPanPixelScale;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dPanPixelScale;
        m_oEyePos += GetXAxis() * dX; // GetX/YAxis reads from the view matrix, so increments correctly
        m_oEyePos += GetYAxis() * -dY; // Inverted compared to screen coords
        break;
    }
    case ENavigation::eRotateEye: {
        // Rotate in radians around local (camera not scene space) X and Y axes
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;
        RotateXAxis(dX);
        RotateYAxis(dY);
        break;
    }
}

The RotateXAxis and RotateYAxis methods are very simple:

void Camera::RotateXAxis(const double dRadians) {
    m_dRotationX += dRadians;
    m_dRotationX = fmod(m_dRotationX, 2 * D3DX_PI); // Keep in valid circular range
}

void Camera::RotateYAxis(const double dRadians) {
    m_dRotationY += dRadians;

    // Limit it so you don't rotate around when looking up and down
    m_dRotationY = std::min(m_dRotationY, D3DX_PI * 0.49); // Almost fully up
    m_dRotationY = std::max(m_dRotationY, D3DX_PI * -0.49); // Almost fully down
}

And to generate the view matrix from this:

void Camera::UpdateView() const {
    const D3DXVECTOR3 oEyePos(GetEyePos());
    const D3DXVECTOR3 oUpVector(0.0f, 1.0f, 0.0f); // Keep up "up", always.

    // Generate a rotation matrix via a quaternion
    D3DXQUATERNION oRotationQuat;
    D3DXQuaternionRotationYawPitchRoll(&oRotationQuat, m_dRotationX, m_dRotationY, 0.0);
    D3DXMATRIX oRotationMatrix;
    D3DXMatrixRotationQuaternion(&oRotationMatrix, &oRotationQuat);

    // Generate view matrix by looking at a point 1 unit ahead of the eye (transformed by the above
    // rotation)
    D3DXVECTOR3 oForward(0.0, 0.0, 1.0);
    D3DXVECTOR4 oForward4;
    D3DXVec3Transform(&oForward4, &oForward, &oRotationMatrix);
    D3DXVECTOR3 oTarget = oEyePos + D3DXVECTOR3(oForward4.x, oForward4.y, oForward4.z); // eye pos + look vector = look target position
    D3DXMatrixLookAtLH(&m_oViewMatrix, &oEyePos, &oTarget, &oUpVector);
}

Answer

JCooper · Jul 16, 2012

It seems to me that "Roll" shouldn't be possible given the way you form your view matrix. Regardless of all the other code (some of which does look a little funny), the call D3DXMatrixLookAtLH(&m_oViewMatrix, &oEyePos, &oTarget, &oUpVector); should create a matrix without roll when given [0,1,0] as an 'Up' vector unless oTarget-oEyePos happens to be parallel to the up vector. This doesn't seem to be the case since you're restricting m_dRotationY to be within (-.49pi,+.49pi).

Perhaps you can clarify how you know that 'roll' is happening. Do you have a ground plane and the horizon line of that ground plane is departing from horizontal?

As an aside, in UpdateView, the D3DXQuaternionRotationYawPitchRoll seems completely unnecessary since you immediately turn around and change it into a matrix. Just use D3DXMatrixRotationYawPitchRoll as you did in the mouse event. Quaternions are used in cameras because they're a convenient way to accumulate rotations happening in eye coordinates. Since you're only using two axes of rotation in a strict order, your way of accumulating angles should be fine. The vector transformation of (0,0,1) isn't really necessary either. The oRotationMatrix should already have those values in the (_31,_32,_33) entries.
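To illustrate that last point, here is a minimal sketch that does not depend on D3DX. It assumes the row-vector layout that D3DXMatrixRotationYawPitchRoll produces when roll is 0; the Vec3 and Mat3 types and the helper names are hypothetical stand-ins, not part of the SDK. Transforming (0,0,1) by the matrix simply reads out its third row:

```cpp
#include <cmath>

// Hypothetical stand-ins for D3DXVECTOR3 and the 3x3 rotation part of D3DXMATRIX.
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

// Assumed layout: RotationX(pitch) * RotationY(yaw) in the row-vector
// convention (v' = v * M), i.e. yaw-pitch-roll with roll fixed at 0.
Mat3 YawPitchMatrix(double yaw, double pitch) {
    const double cy = std::cos(yaw),   sy = std::sin(yaw);
    const double cp = std::cos(pitch), sp = std::sin(pitch);
    return Mat3{{{ cy,      0.0, -sy      },
                 { sp * sy, cp,   sp * cy },
                 { cp * sy, -sp,  cp * cy }}};
}

// Row vector times matrix, the same order D3DXVec3Transform uses.
Vec3 Transform(const Vec3& v, const Mat3& M) {
    return Vec3{ v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0],
                 v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1],
                 v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] };
}

// The look direction is the third row (the _31, _32, _33 entries in D3DX naming).
Vec3 ForwardFromMatrix(const Mat3& M) {
    return Vec3{ M.m[2][0], M.m[2][1], M.m[2][2] };
}
```

For any yaw and pitch, Transform(Vec3{0, 0, 1}, R) and ForwardFromMatrix(R) give the same vector, so the extra transform in UpdateView can be dropped.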


Update

Given that it's not roll, here's the problem: you create a rotation matrix to move the eye in world coordinates, but you want the pitch to happen in camera coordinates. Since roll isn't allowed and yaw is performed last, yaw is always the same in both the world and camera frames of reference. Consider the images below:

[Image: Local rotation]

Your code works fine for local pitch and yaw because those are accomplished in camera coordinates.

[Image: Normal pitch around a point]

But when you rotate around a reference point, you are creating a rotation matrix that is in world coordinates and using that to rotate the camera center. This works okay if the camera's coordinate system happens to line up with the world's. However, if you don't check to see if you're up against the pitch limit before you rotate the camera position, you will get crazy behavior when you hit that limit. The camera will suddenly start to skate around the world--still 'rotating' around the reference point, but no longer changing orientation.

[Image: Locked pitch around a point]

If the camera's axes don't line up with the world's, strange things will happen. In the extreme case, the camera won't move at all because you're trying to make it roll.

[Image: Off axis pitch would cause roll]

The above is what would normally happen, but since you handle the camera orientation separately, the camera doesn't actually roll.

[Image: Camera orientation is handled separate from translation]

Instead, it stays upright, but you get strange translation going on.
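The extreme case mentioned above, where the eye does not move at all, can be reproduced numerically. The following is only a sketch under stated assumptions: Vec3, RotateAboutAxis and CameraRight are hypothetical helpers, and the camera's right axis is taken to be (cos yaw, 0, -sin yaw), which is the first row of the yaw-pitch matrix when roll is 0.

```cpp
#include <cmath>

// Hypothetical stand-in for D3DXVECTOR3.
struct Vec3 { double x, y, z; };

// Rodrigues' formula: rotate v about a unit-length axis by angle (radians).
Vec3 RotateAboutAxis(const Vec3& v, const Vec3& axis, double angle) {
    const double c = std::cos(angle), s = std::sin(angle);
    const double d = axis.x * v.x + axis.y * v.y + axis.z * v.z;   // axis . v
    const Vec3 cr{ axis.y * v.z - axis.z * v.y,                    // axis x v
                   axis.z * v.x - axis.x * v.z,
                   axis.x * v.y - axis.y * v.x };
    return Vec3{ v.x * c + cr.x * s + axis.x * d * (1.0 - c),
                 v.y * c + cr.y * s + axis.y * d * (1.0 - c),
                 v.z * c + cr.z * s + axis.z * d * (1.0 - c) };
}

// Assumed camera right axis for a given yaw (no roll).
Vec3 CameraRight(double yaw) {
    return Vec3{ std::cos(yaw), 0.0, -std::sin(yaw) };
}
```

Take a camera yawed by 90 degrees that looks along +X at the reference point from 5 units away, so the eye offset from the point is (-5, 0, 0). Pitching that offset about the world X axis with RotateAboutAxis(offset, Vec3{1, 0, 0}, 0.3) leaves it unchanged, because the offset lies on the axis. Pitching about CameraRight(pi / 2) instead moves the eye vertically while keeping its distance, which is the motion the mouse drag is meant to produce.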

One way to handle this would be to (1) always put the camera into a canonical position and orientation relative to the reference point, (2) make your rotation, and then (3) put it back when you're done (e.g., similar to the way that you translate the reference point to the origin, apply the Yaw-Pitch rotation, and then translate back). Thinking more about it, however, this probably isn't the best way to go.


Update 2

I think that Generic Human's answer is probably the best. The question remains as to how much pitch should be applied if the rotation is off-axis, but for now, we'll ignore that. Maybe it'll give you acceptable results.

The essence of the answer is this: Before mouse movement, your camera is at c1 = m_oEyePos and being oriented by M1 = D3DXMatrixRotationYawPitchRoll(&M_1,m_dRotationX,m_dRotationY,0). Consider the reference point a = m_oRotateAroundPos. From the point of view of the camera, this point is a'=M1(a-c1).

You want to change the orientation of the camera to M2 = D3DXMatrixRotationYawPitchRoll(&M_2,m_dRotationX+dX,m_dRotationY+dY,0). [Important: Since you won't allow m_dRotationY to fall outside of a specific range, you should make sure that dY doesn't violate that constraint.] As the camera changes orientation, you also want its position to rotate around a to a new point c2. This means that a won't change from the perspective of the camera. I.e., M1(a-c1)==M2(a-c2).

So we solve for c2 (remember that the transpose of a rotation matrix is the same as the inverse):

M2^T M1 (a - c1) == (a - c2) =>

-M2^T M1 (a - c1) + a == c2

Now if we look at this as a transformation being applied to c1, then we can see that it is first negated, then translated by a, then rotated by M1, then rotated by M2^T, negated again, and then translated by a again. These are transformations that graphics libraries are good at and they can all be squished into a single transformation matrix.
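As a sanity check on the derivation, here is a small D3DX-free sketch. It makes two assumptions: vectors are rows (v' = v * M, as in D3DX), and the yaw-pitch matrix maps camera axes to world axes, so world-to-camera is its transpose. Under those assumptions the column-vector product M2^T M1 above is applied as v * R1^T * R2. The Vec3 and Mat3 types and all function names are hypothetical.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

Vec3 Sub(const Vec3& a, const Vec3& b) { return Vec3{ a.x - b.x, a.y - b.y, a.z - b.z }; }

// RotationX(pitch) * RotationY(yaw), row-vector convention, roll = 0.
Mat3 YawPitchMatrix(double yaw, double pitch) {
    const double cy = std::cos(yaw),   sy = std::sin(yaw);
    const double cp = std::cos(pitch), sp = std::sin(pitch);
    return Mat3{{{ cy,      0.0, -sy      },
                 { sp * sy, cp,   sp * cy },
                 { cp * sy, -sp,  cp * cy }}};
}

Mat3 Transpose(const Mat3& M) {
    Mat3 t{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            t.m[r][c] = M.m[c][r];
    return t;
}

// Row vector times matrix.
Vec3 Transform(const Vec3& v, const Mat3& M) {
    return Vec3{ v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0],
                 v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1],
                 v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] };
}

// Position of world point p as seen from a camera at eye with orientation R.
Vec3 ToCameraSpace(const Vec3& p, const Vec3& eye, const Mat3& R) {
    return Transform(Sub(p, eye), Transpose(R));
}

// c2 = a - (a - c1) * R1^T * R2: keeps a at the same camera-space position.
Vec3 OrbitEye(const Vec3& c1, const Vec3& a, const Mat3& R1, const Mat3& R2) {
    const Vec3 inCamera = Transform(Sub(a, c1), Transpose(R1)); // a in the old camera frame
    const Vec3 offset   = Transform(inCamera, R2);              // same offset, new orientation
    return Sub(a, offset);
}
```

With c2 = OrbitEye(c1, a, R1, R2), ToCameraSpace(a, c1, R1) and ToCameraSpace(a, c2, R2) agree, which is exactly the condition M1(a-c1) == M2(a-c2).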

@Generic Human deserves credit for the answer, but here's code for it. Of course, you need to implement the function to validate a change in pitch before it's applied, but that's simple. This code probably has a couple typos since I haven't tried to compile:

case ENavigation::eRotatePoint: {
    const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
    double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;
    dY = validatePitch(dY); // dY needs to be kept within bounds so that m_dRotationY is within bounds

    D3DXMATRIX oRotationMatrix1; // The camera orientation before mouse-change
    D3DXMatrixRotationYawPitchRoll(&oRotationMatrix1, m_dRotationX, m_dRotationY, 0.0);

    D3DXMATRIX oRotationMatrix2; // The camera orientation after mouse-change
    D3DXMatrixRotationYawPitchRoll(&oRotationMatrix2, m_dRotationX + dX, m_dRotationY + dY, 0.0);

    // D3DX transforms row vectors (v * M), and these matrices map camera axes to world axes.
    // The column-vector product M2^T * M1 from the derivation is therefore applied here as
    // v * M1^T * M2: into the old camera frame, then back out through the new orientation.
    D3DXMATRIX oRotationMatrix1Inv; // The inverse of the old orientation
    D3DXMatrixTranspose(&oRotationMatrix1Inv, &oRotationMatrix1); // Transpose is the inverse for a pure rotation

    D3DXMATRIX oScaleMatrix; // Negative scaling matrix for negating the translation
    D3DXMatrixScaling(&oScaleMatrix, -1, -1, -1);

    D3DXMATRIX oTranslationMatrix; // Translation by the reference point
    D3DXMatrixTranslation(&oTranslationMatrix,
         m_oRotateAroundPos.x, m_oRotateAroundPos.y, m_oRotateAroundPos.z);

    D3DXMATRIX oTransformMatrix; // The full transform for the eyePos.
    // We assume the matrix multiply protects against variable aliasing
    D3DXMatrixMultiply(&oTransformMatrix, &oScaleMatrix, &oTranslationMatrix);        // a - c1
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oRotationMatrix1Inv);   // into old camera frame
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oRotationMatrix2);      // out via new orientation
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oScaleMatrix);          // negate
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oTranslationMatrix);    // + a gives c2

    D3DXVECTOR4 oEyeFinal;
    D3DXVec3Transform(&oEyeFinal, &m_oEyePos, &oTransformMatrix);

    m_oEyePos = D3DXVECTOR3(oEyeFinal.x, oEyeFinal.y, oEyeFinal.z);

    // Increment rotation to keep the same relative look angles
    RotateXAxis(dX);
    RotateYAxis(dY);
    break;
}