How to convert a fingerprint to a unique ID for searching in a database?

ss.5 · May 14, 2016 · Viewed 14.1k times

I have a Secugen fingerprint reader. My application can control the device, scan fingerprints, and save them to a MySQL database.

Now that the fingerprints are saved, I want to look up a user by fingerprint. I can't search with a newly scanned fingerprint because the stored fingerprints are saved as BLOBs/images. So I need to convert a fingerprint to a unique ID that I can use for searching in the database.

I have a MySQL database with 9,000,000 users. Currently I can retrieve any user's information by username:

SELECT USERS FROM members WHERE username=username_var

But I can't use WHERE this way with a fingerprint template in place of the username, because the fingerprint changes with every scan, so it can't be compared for equality the way a username can.

All fingerprint SDKs have functions that can help with this, but they are not fast: a search takes 7 minutes, which is far too long.

I don't know what to do or how to do it. I hope you understand my problem.

Answer

LSerni · May 14, 2016

Summary

Either your SDK provides a way of transforming a fingerprint taken from its sensor into a string suitable for relaxed/approximate pattern matching using regular expressions, or into a binary bit vector of fixed size suitable for binary matching; or you need to find a library to do this conversion yourself. All other cases, while feasible in theory, simply aren't practical.

You cannot do anything with just the images.

And in this case, the Secugen SDK only allows access to the image (for diagnostic purposes, I imagine) and it needs to run the check itself (and you want the 1:N kit for that; the 1:1 kit won't do). If you still want to pursue this, I'll include a suggestion at the bottom. Not really recommended, mind you.


Boring details

Fingerprints look too much like one another to be amenable to standard image search. Even worse, the same fingerprint from the same person will never look the same in any two readings. Different pressure, speed, direction, ambient temperature, and sensor and skin moisture levels will lead to different images. Basically, you can do nothing until and unless you convert your fingerprint to a "feature vector".

Instead, there are functions (and your SDK should have them) to convert a fingerprint image to a list of special feature points (intersections, whorls and so on, called minutiae), along with their relative positioning and other parameters. The level up to which this happens depends on the SDK and library in use: there is more than one method. Being targeted at a very specific sensor helps, but then the methods differ in what they offer (e.g. robustness, invariance to slight fingerprint rotations, and so on). See this paper for an example as well as references to other methods.

Some kits do not allow this (do not supply the feature vector to the user) and only provide the means of comparing two fingerprint images, usually aligning them using PCA and then running a direct minutia matching. This works very well for a few images, but run time for finding an image in a database can be ruinous, so much that specialized hardware exists for the task (google 'Automatic Fingerprint Identification Systems').

Once you do have the feature vector, you need to convert it to a SQL storable object, which can be a string or a series of columns in a tuple. How to do this depends on how the vector is constructed. The nature of this object will dictate what kind of search you'll be able to run. This translation can be done in a number of ways and it definitely isn't something you should try on your own.
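As a rough illustration of what "SQL storable" could mean, here is a minimal Python sketch. It assumes, hypothetically, that the SDK hands back a fixed-size bit vector (the second option mentioned in the summary). The vector is packed into bytes that would fit a binary column, and two vectors are compared by Hamming distance. The names pack_bits and hamming_distance and the 16-bit size are made up for the example and are not part of any SDK.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first."""
    if len(bits) % 8 != 0:
        raise ValueError("bit vector length must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | (1 if bit else 0)
        out.append(byte)
    return bytes(out)


def hamming_distance(a, b):
    """Count the differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same size")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two hypothetical 16-bit vectors from two readings of the same finger.
enrolled = pack_bits([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1])
probe = pack_bits([1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1])
print(hamming_distance(enrolled, probe))  # the vectors differ in 2 bits
```

A real vector would be far longer, and how its bits are derived is exactly the part that belongs to the SDK or a dedicated library.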

This is also because even after vectorizing the fingerprints, you will still have no exact match. Not even between two fingerprints from the same person taken within the same minute. You will have instead a number of positive matches and a number of negative matches, and will need to establish a confidence threshold for both ("it's him", "it's not him", "can't say"). You will also need to decide whether and to what extent to tolerate false positives ("Yes, it's him!" - but it's not) and false negatives ("Nay, it's not him" - and it was). On a door lock you want no false positives, but can tolerate a false negative (you just slide your finger again). In a criminal investigation you can't allow a false negative to let the culprit slip, and you can accept a dozen false positives (you'll check their alibis later...) but not a hundred (you can't check out one hundred persons, and some of them won't have an alibi - no way you can arrest them all).
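The three-way decision can be sketched in Python as follows. This assumes the matcher returns a similarity score between 0 and 1, which is an assumption about the SDK. The threshold values and the names classify, door_lock and investigation are invented for illustration only.

```python
from enum import Enum


class Verdict(Enum):
    MATCH = "it's him"
    NO_MATCH = "it's not him"
    UNDECIDED = "can't say"


def classify(score, accept_at, reject_at):
    """Map a similarity score in [0, 1] to a three-way verdict."""
    if reject_at > accept_at:
        raise ValueError("reject_at must not exceed accept_at")
    if score >= accept_at:
        return Verdict.MATCH
    if score <= reject_at:
        return Verdict.NO_MATCH
    return Verdict.UNDECIDED


# A door lock wants no false positives, so it demands a very high score.
door_lock = dict(accept_at=0.99, reject_at=0.90)
# An investigation would rather accept false positives than miss the culprit.
investigation = dict(accept_at=0.80, reject_at=0.30)

for score in (0.95, 0.85, 0.20):
    print(score, classify(score, **door_lock), classify(score, **investigation))
```

The same score of 0.85 is rejected by the door lock and accepted by the investigation. The thresholds encode the application's tolerance for each kind of error and say nothing about the fingerprint itself.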

And for large databases you will always have to run a first pass whereby you restrict the search to those fingerprints that do have a reasonable invariant feature match (for instance, "absolute number of whorls between 75% and 125% of the sample"). This is necessary to reduce the number of tuples you will then subject to further non-invariant analysis, which is much more expensive and definitely cannot be done with standard MySQL functions.
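Here is a minimal sketch of such a first pass, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The prints table, the whorl_count column and the sample data are hypothetical. A real system would store whatever invariant feature its extraction library provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prints (user_id INTEGER PRIMARY KEY, whorl_count INTEGER)"
)
# An index on the invariant feature keeps the range scan cheap.
conn.execute("CREATE INDEX idx_whorls ON prints (whorl_count)")
conn.executemany(
    "INSERT INTO prints (user_id, whorl_count) VALUES (?, ?)",
    [(1, 12), (2, 40), (3, 44), (4, 51), (5, 90)],
)


def candidates(conn, sample_count, low=0.75, high=1.25):
    """Return user ids whose stored count lies within the tolerance band."""
    rows = conn.execute(
        "SELECT user_id FROM prints "
        "WHERE whorl_count BETWEEN ? AND ? ORDER BY user_id",
        (sample_count * low, sample_count * high),
    )
    return [row[0] for row in rows]


# A sample with 40 features keeps only stored counts between 30 and 50.
print(candidates(conn, 40))
```

Only the surviving candidates would then go to the expensive matcher. The filter narrows the search and decides nothing by itself.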

A different approach is to transform a fingerprint into a coded string representation, such that search may be done using a reasonable lexicographical approach available in a mainstream database (e.g. regular expression plus Levenshtein distance). Your SDK either supports this string conversion, or it doesn't; it involves one, possibly several transforms in the feature and spectral domains. The trustworthiness of the method depends on how many features can be crammed into the string (the more signatures you have, the more precise the match needs to be, the more features you need, the longer the string).
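To make the string approach concrete, here is a small Python sketch of Levenshtein matching over coded strings. The codes themselves are invented. How a real fingerprint would be turned into such a string is the hard part and is not shown. The names levenshtein and closest are illustrative.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character of a
                current[j - 1] + 1,      # insert a character of b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def closest(probe, enrolled, max_distance):
    """Return (user, distance) pairs within max_distance, nearest first."""
    scored = [(user, levenshtein(probe, code)) for user, code in enrolled.items()]
    return sorted(
        [(user, dist) for user, dist in scored if dist <= max_distance],
        key=lambda pair: pair[1],
    )


# Hypothetical 8-character codes, one per enrolled user.
enrolled = {"alice": "WLLAWRTA", "bob": "RRTTAAWW", "carol": "WLLAWRTT"}
print(closest("WLLAWRTA", enrolled, max_distance=2))
```

Computing the distance against every row does not scale to millions of users, which is why a cheap first pass would still be needed before this step.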

You may be able to use some external library that does the encoding and checking for you using a suitable algorithm.

Even a simple thing such as "returning the closest match or define whether there is a match at all?" heavily depends on how the fingerprint is manipulated before storing. That's why usually SDKs will supply a high-level interface to match a fingerprint, and they'll do the heavy lifting themselves. Sometimes this heavy lifting is not translatable to a database at all (or not without tremendous difficulties); for example if the "enrollment" is actually the training of a neural network, and not the insertion of a feature vector into a database.

Dirty hack

You have nine million users (who are you, the FBI?) and permission to get nine million fingerprints. And you have this one SDK. And matching nine million images is out of the question. But for the reasons stated above, you can only ask the SDK the question, "is this image in your database?" and receive a list of, say, three names with "Yes at 99%, yes at 92%, yes at 90%".

You can perhaps do this: run a very high level, very rough binning on the image, based on something really macroscopic. I don't know, maybe the number of ridge minutiae. You will have to do this by examining the image; possibly OpenCV might help you. You will get a number ranging from 1 to N, and this number will be unreliable, with an error of say 2%.

The key here is that you must be sure the SDK will never match a fingerprint that belongs in block X with one that you binned into block Y.

Then you can build fifty databases with one fiftieth of the users each, supposing (rather: hoping) that your parameter distribution is reasonably flat, and not a steep Gaussian. When analyzing a fingerprint, you can copy into the SDK database directory one of the fifty databases, the one corresponding to the X value from the fingerprint you have, and in which you will have stored only those users with the same X value. Due to uncertainty, some users will be in two databases, but this way you reduce the problem by a factor of fifty (or maybe forty-five).
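The binning with overlap can be sketched in Python like this. It assumes the macroscopic value is a number in a known range (0 to 500 here) with a 2% relative error. The range, the function names bins_for and build_bins, and the user data are all hypothetical.

```python
from collections import defaultdict


def bins_for(value, lo, hi, n_bins, rel_error=0.02):
    """Return every bin index the true value might fall into.

    The measured value is assumed to be off by at most rel_error, so a
    value near a bin boundary is assigned to both neighbouring bins.
    """
    width = (hi - lo) / n_bins

    def index(v):
        # Clamp so out-of-range measurements land in the first or last bin.
        return min(n_bins - 1, max(0, int((v - lo) // width)))

    first = index(value * (1 - rel_error))
    last = index(value * (1 + rel_error))
    return list(range(first, last + 1))


def build_bins(users, lo, hi, n_bins):
    """Group user ids by bin; boundary users appear in more than one bin."""
    databases = defaultdict(set)
    for user_id, value in users.items():
        for b in bins_for(value, lo, hi, n_bins):
            databases[b].add(user_id)
    return databases


users = {"u1": 104, "u2": 109, "u3": 33}
databases = build_bins(users, lo=0, hi=500, n_bins=50)
print(dict(databases))

# At lookup time, load only the bin of the probe's measured value.
probe_bin = bins_for(110, lo=0, hi=500, n_bins=50, rel_error=0.0)[0]
print(probe_bin, databases[probe_bin])
```

Because both the enrolled reading and the probe reading are noisy, the margin would need to cover their combined error, and that has to be checked against real data before trusting the bins.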

You still need to call the SDK matching function, since you have no other way of classifying the incoming fingerprint; but you can perhaps reduce the run time to manageable proportions.