OpenCV does not provide a standalone RANSAC function, at least not in a form that you can simply call and be done with (e.g. cv::ransac(...)). Instead, every function that can use RANSAC has a flag that enables it. This is not always enough if you want to do something else with the inliers RANSAC finds after you have estimated a homography or fundamental matrix: for example, plotting the points in Octave or a similar tool, or running additional algorithms on the filtered set of matches.
After matching two images one gets a vector of matches, along with two sets of keypoints (one per image) that were used in the matching process. Using the matches and keypoints we create two vectors of points (e.g. of cv::Point2f) and pass these to findHomography(). From two other posts I learned how the inliers are marked: through a mask that we pass to that function. Each row of the mask corresponds to an inlier or an outlier. However, I am unable to figure out how the row index relates to my two sets of points. Looking at OpenCV's source code didn't get me far. findFundamentalMat() (similar to findHomography() in its signature and its handling of the mask) uses compressPoints(), which seems to somehow combine the two input sets (source and destination points) into one. To determine the nature of the mask, I tested with two sets of matched points (cv::KeyPoint converted to cv::Point2f, a standard procedure). Each set contains 300 points, so we have 600 points in total. The returned mask contains 300 rows (the values are not important for the topic at hand).
EDIT: While writing this I discovered the answer (see below), but decided to post the question anyway in case someone needs this information quickly and in compact form. Note that we still need one of OpenCV's functions that support RANSAC. So if you have a set of points but no intention of computing a homography or fundamental matrix, this is obviously not the way. I was unable to find anything in OpenCV's API that avoids this obstacle, so in that case you need an external library.
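If pulling in an external library is not an option either, the basic RANSAC loop is short enough to hand-roll. The following is only a minimal sketch under my own assumptions, not anything taken from OpenCV: it fits a 2D line instead of a homography, and the names Point2 and ransacLineMask are hypothetical. It returns a mask with one entry per input point (1 = inlier, 0 = outlier), which mirrors the mask convention discussed here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point2 { double x, y; };  // hypothetical stand-in for cv::Point2f

// Hypothetical helper, NOT an OpenCV function: RANSAC for a 2D line.
// Returns one mask entry per input point (1 = inlier, 0 = outlier).
std::vector<unsigned char> ransacLineMask(const std::vector<Point2>& pts,
                                          double threshold,
                                          int iterations,
                                          unsigned seed = 42)
{
    assert(pts.size() >= 2);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);

    std::vector<unsigned char> bestMask(pts.size(), 0);
    std::size_t bestCount = 0;

    for (int it = 0; it < iterations; ++it) {
        // 1. Draw a minimal sample: two distinct points define a line.
        const std::size_t i = pick(rng);
        const std::size_t j = pick(rng);
        if (i == j) continue;

        // 2. Line through the sample in implicit form a*x + b*y + c = 0.
        const double a = pts[j].y - pts[i].y;
        const double b = pts[i].x - pts[j].x;
        const double norm = std::hypot(a, b);
        if (norm == 0.0) continue;  // coincident points define no line
        const double c = -(a * pts[i].x + b * pts[i].y);

        // 3. Score the model: mark points within `threshold` of the line.
        std::vector<unsigned char> mask(pts.size(), 0);
        std::size_t count = 0;
        for (std::size_t k = 0; k < pts.size(); ++k) {
            const double dist = std::fabs(a * pts[k].x + b * pts[k].y + c) / norm;
            if (dist <= threshold) { mask[k] = 1; ++count; }
        }

        // 4. Keep the largest consensus set seen so far.
        if (count > bestCount) { bestCount = count; bestMask = mask; }
    }
    return bestMask;
}
```

A real implementation would usually refit the model on the final consensus set and pick the iteration count from the expected outlier ratio; both are left out to keep the sketch short.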
The solution is actually quite simple. As we know, each row in the mask tells us whether we have an inlier or an outlier. However, we have two sets of points as input, so how does a row containing a single value represent two points? The answer came to me while thinking about how those two sets of points are passed to findHomography() (in my case I was computing the homography between two images). Both sets contain the same number of points because they are extracted from the matches between our pair of images. This means that the row index in the mask is the index of the point in each of the two sets, and also the index into the vector of matches for the two images. I have manually looked up a small subset of matched points based on this and the results are as expected. It is important that you don't alter the order of your matches or of the 2D points you extracted from them using the keypoints referenced in each cv::DMatch. Below you can see a simple example for a single pair of inliers.
// extract points from keypoints based on matches; the order of the matches is preserved
for (size_t i = 0; i < matchesObjectScene.size(); ++i)
{
    pointsObject.push_back(keypointsObject.at(matchesObjectScene.at(i).queryIdx).pt);
    pointsScene.push_back(keypointsScene.at(matchesObjectScene.at(i).trainIdx).pt);
}
// compute homography using RANSAC (CV_RANSAC is the OpenCV 2.x constant; OpenCV 3 and later name it cv::RANSAC)
cv::Mat mask;
cv::Mat H = cv::findHomography(pointsObject, pointsScene, CV_RANSAC, ransacThreshold, mask);
In the example above, if we print the pair of points at the row of some inlier
int maskRow = 10;
std::cout << "POINTS: object(" << pointsObject.at(maskRow).x << "," << pointsObject.at(maskRow).y << ") - scene(" << pointsScene.at(maskRow).x << "," << pointsScene.at(maskRow).y << ")" << std::endl;
and then again, this time looking the points up through the matches and keypoints,
std::cout << "POINTS (via match-set): object(" << keypointsObject.at(matchesObjectScene.at(maskRow).queryIdx).pt.x << "," << keypointsObject.at(matchesObjectScene.at(maskRow).queryIdx).pt.y << ") - scene(" << keypointsScene.at(matchesObjectScene.at(maskRow).trainIdx).pt.x << "," << keypointsScene.at(matchesObjectScene.at(maskRow).trainIdx).pt.y << ")" << std::endl;
we actually get the same output:
POINTS: object(462,199) - scene(485,49)
POINTS (via match-set): object(462,199) - scene(485,49)
To find out whether a pair is actually an inlier, we simply check whether its row in the mask contains a zero (outlier) or a non-zero value (inlier):
if (mask.at<uchar>(maskRow) != 0)
{
    // inlier: store the match, keypoints or points somewhere you can access them later
}
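To collect all inliers rather than check a single row, the same test can be applied in a loop. Below is a minimal sketch of that idea. The helper name selectInliers is my own, and it assumes the mask has been copied into (or was requested as) a std::vector<unsigned char> with one entry per match; with a cv::Mat mask you would read row i with mask.at<uchar>(i) instead. Because the matches and both point vectors share the same indexing, the same helper works for any of them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: keep only the elements whose mask entry is non-zero.
// `items` can be the vector of matches or either of the two point vectors,
// provided its order is the one used to build the findHomography() inputs.
template <typename T>
std::vector<T> selectInliers(const std::vector<T>& items,
                             const std::vector<unsigned char>& mask)
{
    assert(items.size() == mask.size());  // one mask row per matched pair

    std::vector<T> inliers;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (mask[i] != 0) {               // non-zero marks an inlier
            inliers.push_back(items[i]);
        }
    }
    return inliers;
}
```

Calling it once on the matches and once on each point vector yields three filtered vectors that are still aligned index by index, so they can be passed on to further processing or plotted directly.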