Worse performance using Eigen than using my own class

george · May 31, 2011 · Viewed 16.3k times

A couple of weeks ago I asked a question about the performance of matrix multiplication.

I was told that in order to enhance the performance of my program I should use some specialised matrix classes rather than my own class.

StackOverflow users recommended:

  • uBLAS
  • EIGEN
  • BLAS

At first I wanted to use uBLAS; however, from reading the documentation it seemed that this library doesn't support matrix-matrix multiplication.

In the end I decided to use the Eigen library. I replaced my matrix class with Eigen::MatrixXd; however, it turned out that my application now runs even slower than before. The run time was 68 seconds with my own matrix class and 87 seconds after switching to Eigen matrices.

The parts of the program that take the most time look like this:

TemplateClusterBase* TemplateClusterBase::TransformTemplateOne( vector<Eigen::MatrixXd*>& pointVector, Eigen::MatrixXd& rotation ,Eigen::MatrixXd& scale,Eigen::MatrixXd& translation )
{   
    for (int i=0;i<pointVector.size();i++ )
    {
        //Eigen::MatrixXd outcome =
        Eigen::MatrixXd outcome = (rotation*scale)* (*pointVector[i])  + translation;
        //delete  prototypePointVector[i];      // ((rotation*scale)* (*prototypePointVector[i])  + translation).ConvertToPoint();
        MatrixHelper::SetX(*prototypePointVector[i],MatrixHelper::GetX(outcome));
        MatrixHelper::SetY(*prototypePointVector[i],MatrixHelper::GetY(outcome));
        //assosiatedPointIndexVector[i]    = prototypePointVector[i]->associatedTemplateIndex = i;
    }

    return this;
}

and

Eigen::MatrixXd AlgorithmPointBased::UpdateTranslationMatrix( int clusterIndex )
{
    double membershipSum = 0,outcome = 0;
    double currentPower = 0;
    Eigen::MatrixXd outcomePoint = Eigen::MatrixXd(2,1);
    outcomePoint << 0,0;
    Eigen::MatrixXd templatePoint;
    for (int i=0;i< imageDataVector.size();i++)
    {
        currentPower =0; 
        membershipSum += currentPower = pow(membershipMatrix[clusterIndex][i],m);
        outcomePoint.noalias() +=  (*imageDataVector[i] - (prototypeVector[clusterIndex]->rotationMatrix*prototypeVector[clusterIndex]->scalingMatrix* ( *templateCluster->templatePointVector[prototypeVector[clusterIndex]->assosiatedPointIndexVector[i]]) ))*currentPower ;
    }

    outcomePoint /= membershipSum;
    return outcomePoint; //.ConvertToMatrix();
}

As you can see, these functions perform a lot of matrix operations. That is why I thought using Eigen would speed up my application. Unfortunately, as I mentioned above, the program runs slower.

Is there any way to speed up these functions?

Maybe I would get better performance if I used DirectX matrix operations? (However, I have a laptop with an integrated graphics card.)

Answer

timday · May 31, 2011

If you're using Eigen's MatrixXd types, those are dynamically sized, so each matrix allocates its storage on the heap. You should get much better results from using the fixed-size types, e.g. Matrix4d, Vector4d.

Also, make sure you're compiling such that the code can get vectorized; see the relevant Eigen documentation.

Re your thought on using the Direct3D extensions library (D3DXMATRIX etc.): it's OK (if a bit old-fashioned) for graphics geometry (4x4 transforms etc.), but it's certainly not GPU accelerated (just good old SSE, I think). Also note that it's single precision (float) only, and you seem to be set on using doubles. Personally, I'd much prefer to use Eigen unless I was actually coding a Direct3D app.