Are there any known regular expressions out there to validate credit card track 1 and track 2 data?
EDIT:
From Wikipedia:
The information on track 1 on financial cards is contained in several formats: A, which is reserved for proprietary use of the card issuer, B, which is described below, C-M, which are reserved for use by ANSI Subcommittee X3B10 and N-Z, which are available for use by individual card issuers:
Track 1, Format B:
Track 2: This format was developed by the banking industry (ABA). This track is written with a 5-bit scheme (4 data bits + 1 parity), which allows for sixteen possible characters, which are the numbers 0-9, plus the six characters : ; < = > ? . The selection of six punctuation symbols may seem odd, but in fact the sixteen codes simply map to the ASCII range 0x30 through 0x3f, which defines ten digit characters plus those six symbols. The data format is as follows:
Here is a REGEX that works for me to pick both Track 1 and Track 2. Use this with the regex option "Dot does NOT match newline".
^%(?<FC>.)(?<PAN>[\d]{1,19}+)\^(?<NM>.{2,26})\^(?<ED>[\d]{0,4}|\^)(?<SC>[\d]{0,3}|\^)(?<DD>.*)\?|;(?<PAN>[\d]{1,19}+)=(?<ED>[\d]{0,4}|=)(?<SC>[\d]{0,3}|=)(?<DD>.*)\?\Z
I tested with this data (my reader returns both a Track 1 and a Track 2 record, in that order, for the same card; the numbers and name are changed below):
%B5581123456781323^SMITH/JOHN^16071021473810559010203?
;5581123456781323=160710212423468?
The above REGEX uses NAMED CAPTURE GROUPS (the "?<NAME>" that starts out each (group)) and I see the result (with RegexBuddy) as follows; the two numbers at the end of each line are the start offset and the length:
Match 1: %B5581123456781323^SMITH/JOHN^16071021473810559010203? 0 54
Group "FC": B 1 1
Group "PAN": 5581123456781323 2 16
Group "NM": SMITH/JOHN 19 10
Group "ED": 1607 30 4
Group "SC": 102 34 3
Group "DD": 1473810559010203 37 16
Match 2: ;5581123456781323=160710212423468? 56 34
Group "FC" did not participate in the match
Group "PAN": 5581123456781323 57 16
Group "NM" did not participate in the match
Group "ED": 1607 74 4
Group "SC": 102 78 3
Group "DD": 12423468 81 8
Note that the second match does NOT populate FC (format code) or NM (name), since Track 2 does not carry those fields.
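For readers who want to reproduce the group listing outside RegexBuddy, here is a minimal Python sketch. It is an adaptation, not the exact regex above: Python's `re` module spells named groups `(?P<NAME>...)`, rejects the same group name appearing twice in one pattern, and only accepts possessive quantifiers such as `{1,19}+` from Python 3.11 on, so the two tracks are split into separate patterns with plain greedy quantifiers. The names `TRACK1` and `TRACK2` are my own.

```python
import re

# Python adaptations of the Track 1 and Track 2 patterns (assumed equivalents;
# the possessive "+" is dropped and each track gets its own pattern).
TRACK1 = re.compile(
    r"^%(?P<FC>.)(?P<PAN>\d{1,19})\^(?P<NM>.{2,26})\^"
    r"(?P<ED>\d{0,4}|\^)(?P<SC>\d{0,3}|\^)(?P<DD>.*)\?$"
)
TRACK2 = re.compile(
    r"^;(?P<PAN>\d{1,19})=(?P<ED>\d{0,4}|=)(?P<SC>\d{0,3}|=)(?P<DD>.*)\?$"
)

# The sample records from this answer, one per track.
track1 = TRACK1.match("%B5581123456781323^SMITH/JOHN^16071021473810559010203?")
track2 = TRACK2.match(";5581123456781323=160710212423468?")

# groupdict() maps each group name to the text it captured.
print(track1.groupdict())
print(track2.groupdict())
```

With the sample data, the Track 1 dictionary holds FC, PAN, NM, ED, SC and DD, while the Track 2 dictionary holds only PAN, ED, SC and DD.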
If your regex engine does not support NAMED GROUPS, just remove the "?<NAME>" part of each capturing group. Then use the position of each group to identify it.
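A small Python sketch of the positional approach, using the Track 2 pattern with the group names stripped. The variable names are illustrative only, and the possessive "+" is dropped so the pattern also runs on Python versions before 3.11.

```python
import re

# Track 2 pattern with plain (unnamed) capturing groups.
# Group order: 1 = PAN, 2 = expiration date, 3 = service code, 4 = discretionary data.
TRACK2_POSITIONAL = re.compile(r"^;(\d{1,19})=(\d{0,4}|=)(\d{0,3}|=)(.*)\?$")

match = TRACK2_POSITIONAL.match(";5581123456781323=160710212423468?")

# groups() returns the captures in the order the groups open in the pattern.
pan, expiry, service_code, discretionary = match.groups()
print(pan, expiry, service_code, discretionary)
```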
Also, my single SWIPE contains BOTH track 1 and track 2 (in that order: track 1, a CRLF, and then track 2). According to the Wikipedia link in the original question, cards can have up to 3 tracks; a reader might return both tracks 1 and 2, or only one of them, and only rarely track 3.
For this reason, I think it's a safe bet to use a REGEX that looks for both track 1 and track 2 and if you get both, you can ignore track 2 (since track 1 has more data) or whatever you wish.
Because both tracks are present in my swipes, the REGEX engine will return 2 matches with my REGEX above (assuming no read error and a reader that supports both tracks). This does not bother me; I simply plan to use the first match and ignore the second.
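The "use the first match" idea could look roughly like this in Python. This is a sketch under assumptions: the swipe arrives as one string with the tracks separated by line breaks, and the helper name `first_track` is hypothetical. The patterns are Python adaptations (one per track, `(?P<NAME>...)` syntax, no possessive quantifier) rather than the single combined regex.

```python
import re

# One adapted pattern per track, tried in order for every line of the swipe.
TRACK_PATTERNS = [
    re.compile(
        r"^%(?P<FC>.)(?P<PAN>\d{1,19})\^(?P<NM>.{2,26})\^"
        r"(?P<ED>\d{0,4}|\^)(?P<SC>\d{0,3}|\^)(?P<DD>.*)\?$"
    ),
    re.compile(
        r"^;(?P<PAN>\d{1,19})=(?P<ED>\d{0,4}|=)(?P<SC>\d{0,3}|=)(?P<DD>.*)\?$"
    ),
]


def first_track(swipe):
    """Return the fields of the first line that parses as a track, else None."""
    for line in swipe.splitlines():  # splitlines() also handles CRLF
        for pattern in TRACK_PATTERNS:
            match = pattern.match(line)
            if match:
                return match.groupdict()
    return None


swipe = (
    "%B5581123456781323^SMITH/JOHN^16071021473810559010203?\r\n"
    ";5581123456781323=160710212423468?"
)
fields = first_track(swipe)
print(fields["PAN"], fields.get("NM"))
```

If the reader delivers only track 2, the same call still returns a result, just without the FC and NM keys.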
If you're interested only in track 1, use this regex:
^%(?<FC>.)(?<PAN>[\d]{1,19}+)\^(?<NM>.{2,26})\^(?<ED>[\d]{0,4}|\^)(?<SC>[\d]{0,3}|\^)(?<DD>.*)\?\Z
If you're interested only in track 2, use the regex:
^;(?<PAN>[\d]{1,19}+)=(?<ED>[\d]{0,4}|=)(?<SC>[\d]{0,3}|=)(?<DD>.*)\?\Z
But I see no harm in checking for both and then using the first one you get, or perhaps comparing track 1 to track 2 as an additional error-checking step.
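As a rough illustration of that cross-check, the Python sketch below compares the fields both tracks share (PAN, expiration date, service code). The function name `tracks_agree` is hypothetical, the patterns are trimmed adaptations that capture only those three fields, and whether a mismatch should reject the swipe is left to the application.

```python
import re

# Trimmed patterns: only the fields common to both tracks are captured,
# in the same order (PAN, expiration date, service code).
TRACK1_COMMON = re.compile(r"^%.(\d{1,19})\^.{2,26}\^(\d{0,4}|\^)(\d{0,3}|\^).*\?$")
TRACK2_COMMON = re.compile(r"^;(\d{1,19})=(\d{0,4}|=)(\d{0,3}|=).*\?$")


def tracks_agree(track1_line, track2_line):
    """True only if both lines parse and their shared fields are identical."""
    m1 = TRACK1_COMMON.match(track1_line)
    m2 = TRACK2_COMMON.match(track2_line)
    if m1 is None or m2 is None:
        return False
    return m1.groups() == m2.groups()


t1 = "%B5581123456781323^SMITH/JOHN^16071021473810559010203?"
t2 = ";5581123456781323=160710212423468?"
print(tracks_agree(t1, t2))
```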
Sorry to answer what seems to be answered!