Measure size of object using reference object in photo

Light Yagmi · Feb 22, 2016 · Viewed 10.5k times

I'd like to calculate the size of an object in a photo that includes both a target object and a reference one.

I think what I'd like to do is what this app achieves (I don't know how precise it is): https://itunes.apple.com/us/app/photo-meter-picture-measuring/id579961082?mt=8

I've already found that, in general, this is called photogrammetry and seems to be an active research field.

How would you find the height of objects given an image?
https://physics.stackexchange.com/questions/151121/can-i-calculate-the-size-of-a-real-object-by-just-looking-at-the-picture-taken-b

But I cannot find:

  • the basic way to measure an object in a photo with a reference object, or
  • a way to implement it, or a standard open-source implementation of it.

Update

  • I cannot use the distance of the object and the reference from the camera.
  • The reference and target are on the (approximately) same plane.

Answer

Alessandro Jacopson · Apr 3, 2016

Due to your assumption that "The reference and target are on the (approximately) same plane", you can apply the method "Algorithm 1: planar measurements" described in

Antonio Criminisi. "Single-View Metrology: Algorithms and Applications (Invited Paper)". In: Pattern Recognition. Ed. by Luc Van Gool. Vol. 2449. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2002, pp. 224-239.

The method allows you to measure the distance between two points that lie in the same plane.

Basically

P=H*p (1)

where p is a point in your image expressed in homogeneous coordinates, P is the corresponding point in the 3D world plane, also expressed in homogeneous coordinates, H is a 3x3 matrix called the homography matrix, and * is matrix-vector multiplication.

    h11 h12 h13
H = h21 h22 h23
    h31 h32 h33

The unit of measure of p is pixels: for example, if p is a point at row r and column c, it is expressed as [r,c,1]. The unit of measure of P is your world unit, for example meters; you can assume that your 3D world plane is the plane Z=0, so P is expressed as the homogeneous vector [X,Y,1].

So a slightly modified version of "Algorithm 1: planar measurements" is the following:

  1. Given an image of a planar surface, estimate the image-to-world homography matrix H. Assume that the 9 elements of H are dimensionless.

  2. In the image, select two points p1=[r1,c1,1] and p2=[r2,c2,1] belonging to the reference object.

  3. Back-project each image point into the world plane via (1) to obtain the two world points P1 and P2: do the matrix-vector multiplication and then divide the resulting vector by its third component in order to get a homogeneous vector. For example, P1=[X1,Y1,1] is P1=[(r1*h11 + c1*h12 + h13)/(r1*h31 + c1*h32 + h33), (r1*h21 + c1*h22 + h23)/(r1*h31 + c1*h32 + h33), 1]. Assume for the moment that the nine elements of H are dimensionless; that means the unit of measure of X1, Y1, X2, Y2 is pixels.

  4. Compute the distance R between P1 and P2, that is R=sqrt(pow(X1-X2,2)+pow(Y1-Y2,2)); R is still expressed in pixels. Now, since P1 and P2 are on the reference object, you know the distance between them in meters; let's call that distance, expressed in meters, M.

  5. Compute the scale factor s as s=M/R; the dimension of s is meters per pixel.

  6. Multiply the first two rows of H by s and call G the new matrix you get. (Scaling all nine elements would have no effect, because the common factor would cancel in the division by the third component.) Now the first two rows of G are expressed in meters per pixel.

  7. Now, in the image select two points p3 and p4 belonging to the target object.

  8. Back-project p3 and p4 via G in order to get P3 and P4: P3=G*p3 and P4=G*p4. Again, divide each vector by its third element. P3=[X3,Y3,1] and P4=[X4,Y4,1], and now X3, Y3, X4 and Y4 are expressed in meters.

  9. Compute the desired target distance D between P3 and P4, that is D=sqrt(pow(X3-X4,2)+pow(Y3-Y4,2)). D is now expressed in meters.

The appendix of the above-mentioned paper explains how to compute H, or you can use, for example, OpenCV's cv::findHomography: basically, you need at least four correspondences between points in the real world and points in your image.

Another source of information on how to estimate H is in

JOHNSON, Micah K.; FARID, Hany. Metric measurements on a plane from a single image. Dept. Comput. Sci., Dartmouth College, Tech. Rep. TR2006-579, 2006.

If you also need to estimate the accuracy of your measurements, you can find the details in

A. Criminisi. Accurate Visual Metrology from Single and Multiple Uncalibrated Images. Distinguished Dissertation Series. Springer-Verlag London Ltd., Sep 2001. ISBN: 1852334681.

An example in C++ with OpenCV:

#include <sstream>
#include <vector>

#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/calib3d/calib3d.hpp"


void to_homogeneous(const std::vector< cv::Point2f >& non_homogeneous, std::vector< cv::Point3f >& homogeneous )
{
    homogeneous.resize(non_homogeneous.size());
    for ( size_t i = 0; i < non_homogeneous.size(); i++ ) {
        homogeneous[i].x = non_homogeneous[i].x;
        homogeneous[i].y = non_homogeneous[i].y;
        homogeneous[i].z = 1.0;
    }
}

void from_homogeneous(const std::vector< cv::Point3f >& homogeneous, std::vector< cv::Point2f >& non_homogeneous )
{
    non_homogeneous.resize(homogeneous.size());
    for ( size_t i = 0; i < non_homogeneous.size(); i++ ) {
        non_homogeneous[i].x = homogeneous[i].x / homogeneous[i].z;
        non_homogeneous[i].y = homogeneous[i].y / homogeneous[i].z;
    }
}

void draw_cross(cv::Mat &img, const cv::Point center, int arm_length, const cv::Scalar &color, int thickness = 5 )
{
    cv::Point N(center - cv::Point(0, arm_length));
    cv::Point S(center + cv::Point(0, arm_length));
    cv::Point E(center + cv::Point(arm_length, 0));
    cv::Point W(center - cv::Point(arm_length, 0));
    cv::line(img, N, S, color, thickness);
    cv::line(img, E, W, color, thickness);
}

double measure_distance(const cv::Point2f& p1, const cv::Point2f& p2, const cv::Matx33f& GG)
{
    std::vector< cv::Point2f > ticks(2);
    ticks[0] = p1;
    ticks[1] = p2;
    std::vector< cv::Point3f > ticks_h;
    to_homogeneous(ticks, ticks_h);

    std::vector< cv::Point3f > world_ticks_h(2);
    for ( size_t i = 0; i < ticks_h.size(); i++ ) {
        world_ticks_h[i] = GG * ticks_h[i];
    }
    std::vector< cv::Point2f > world_ticks_back;
    from_homogeneous(world_ticks_h, world_ticks_back);

    return cv::norm(world_ticks_back[0] - world_ticks_back[1]);
}

int main(int, char**)
{
    cv::Mat img = cv::imread("single-view-metrology.JPG");
    std::vector< cv::Point2f > world_tenth_of_mm;
    std::vector< cv::Point2f > img_px;

    // Here I manually picked the pixel coordinates of the corners of the A4 sheet.
    cv::Point2f TL(711, 64);
    cv::Point2f BL(317, 1429);
    cv::Point2f TR(1970, 175);
    cv::Point2f BR(1863, 1561);

    // This is the standard size of the A4 sheet:
    const int A4_w_mm = 210;
    const int A4_h_mm = 297;
    const int scale = 10; // world coordinates are expressed in tenths of a millimeter

    // Here I create the correspondences between the world point and the
    // image points.
    img_px.push_back(TL);
    world_tenth_of_mm.push_back(cv::Point2f(0.0, 0.0));

    img_px.push_back(TR);
    world_tenth_of_mm.push_back(cv::Point2f(A4_w_mm * scale, 0.0));

    img_px.push_back(BL);
    world_tenth_of_mm.push_back(cv::Point2f(0.0, A4_h_mm * scale));

    img_px.push_back(BR);
    world_tenth_of_mm.push_back(cv::Point2f(A4_w_mm * scale, A4_h_mm * scale));

    // Here I estimate the homography that brings the world to the image.
    cv::Mat H = cv::findHomography(world_tenth_of_mm, img_px);

    // To back-project the image points into the world I need the inverse of the homography.
    cv::Mat G = H.inv();

    // I can rectify the image.
    cv::Mat warped;
    cv::warpPerspective(img, warped, G, cv::Size(2600, 2200 * 297 / 210));

    {
        // Here I manually picked the pixel coordinates of ticks '0' and '1' in the slide rule,
        // in the world the distance between them is 10mm.
        cv::Point2f tick_0(2017, 1159);
        cv::Point2f tick_1(1949, 1143);
        // I measure the distance and I write it on the image.
        std::ostringstream oss;
        oss << measure_distance(tick_0, tick_1, G) / scale;
        cv::line(img, tick_0, tick_1, CV_RGB(0, 255, 0));
        cv::putText(img, oss.str(), (tick_0 + tick_1) / 2, cv::FONT_HERSHEY_PLAIN, 3, CV_RGB(0, 255, 0), 3);
    }

    {
        // Here I manually picked the pixel coordinates of ticks '11' and '12' in the slide rule,
        // in the world the distance between them is 10mm.
        cv::Point2f tick_11(1277, 988);
        cv::Point2f tick_12(1211, 973);
        // I measure the distance and I write it on the image.
        std::ostringstream oss;
        oss << measure_distance(tick_11, tick_12, G) / scale;
        cv::line(img, tick_11, tick_12, CV_RGB(0, 255, 0));
        cv::putText(img, oss.str(), (tick_11 + tick_12) / 2, cv::FONT_HERSHEY_PLAIN, 3, CV_RGB(0, 255, 0), 3);
    }

    // I draw the points used in the estimate of the homography.
    draw_cross(img, TL, 40, CV_RGB(255, 0, 0));
    draw_cross(img, TR, 40, CV_RGB(255, 0, 0));
    draw_cross(img, BL, 40, CV_RGB(255, 0, 0));
    draw_cross(img, BR, 40, CV_RGB(255, 0, 0));

    cv::namedWindow( "Input image", cv::WINDOW_NORMAL );
    cv::imshow( "Input image", img );
    cv::imwrite("img.png", img);

    cv::namedWindow( "Rectified image", cv::WINDOW_NORMAL );
    cv::imshow( "Rectified image", warped );
    cv::imwrite("warped.png", warped);

    cv::waitKey(0);

    return 0;
}

Input image; in this case your reference object is the A4 sheet and the target object is the slide rule: (image)

Input image with measures; the red crosses are the points used to estimate the homography: (image)

The rectified image: (image)