I have a number of data files created by many different programs. Is there a way to determine which database, and which version of it, was used to create each data file?
For example, I'd like to identify which files were created by Microsoft Access, dBASE, FileMaker, FoxPro, SQLite or others.
I really just want to somehow quickly scan the files, and display information about them, including source Database and Version.
For reference, I'm using Delphi 2009.
First of all, check the file extension. Take a look at the corresponding Wikipedia article, or other sites.
Then you can guess the file format from its so-called "signature".
This is usually the content of the first few bytes of the file, which is often enough to identify the format.
There is an up-to-date list on Gary Kessler's very nice website (http://www.garykessler.net/library/file_sigs.html).
For instance, here is how our framework identifies the MIME type from the file content, on the server side:
function GetMimeContentType(Content: Pointer; Len: integer;
  const FileName: TFileName=''): RawUTF8;
begin // see http://www.garykessler.net/library/file_sigs.html for magic numbers
  result := '';
  if (Content<>nil) and (Len>4) then
    case PCardinal(Content)^ of
    $04034B50: Result := 'application/zip'; // 50 4B 03 04
    $46445025: Result := 'application/pdf'; // 25 50 44 46 2D 31 2E
    $21726152: Result := 'application/x-rar-compressed'; // 52 61 72 21 1A 07 00
    $AFBC7A37: Result := 'application/x-7z-compressed'; // 37 7A BC AF 27 1C
    $75B22630: Result := 'audio/x-ms-wma'; // 30 26 B2 75 8E 66
    $9AC6CDD7: Result := 'video/x-ms-wmv'; // D7 CD C6 9A 00 00
    $474E5089: Result := 'image/png'; // 89 50 4E 47 0D 0A 1A 0A
    $38464947: Result := 'image/gif'; // 47 49 46 38
    $002A4949, $2A004D4D, $2B004D4D:
      Result := 'image/tiff'; // 49 49 2A 00 or 4D 4D 00 2A or 4D 4D 00 2B
    $E011CFD0: // Microsoft Office applications D0 CF 11 E0 = DOCFILE
      if Len>600 then
      case PWordArray(Content)^[256] of // at offset 512
        $A5EC: Result := 'application/msword'; // EC A5 C1 00
        $FFFD: // FD FF FF
          case PByteArray(Content)^[516] of
            $0E,$1C,$43: Result := 'application/vnd.ms-powerpoint';
            $10,$1F,$20,$22,$23,$28,$29: Result := 'application/vnd.ms-excel';
          end;
      end;
    else
      case PCardinal(Content)^ and $00ffffff of
        $685A42: Result := 'application/bzip2'; // 42 5A 68
        $088B1F: Result := 'application/gzip'; // 1F 8B 08
        $492049: Result := 'image/tiff'; // 49 20 49
        $FFD8FF: Result := 'image/jpeg'; // FF D8 FF DB/E0/E1/E2/E3/E8
        else
          case PWord(Content)^ of
            $4D42: Result := 'image/bmp'; // 42 4D
          end;
      end;
    end;
  if (Result='') and (FileName<>'') then begin
    case GetFileNameExtIndex(FileName,'png,gif,tiff,tif,jpg,jpeg,bmp,doc,docx') of
    0: Result := 'image/png';
    1: Result := 'image/gif';
    2,3: Result := 'image/tiff';
    4,5: Result := 'image/jpeg';
    6: Result := 'image/bmp';
    7,8: Result := 'application/msword';
    else begin
      Result := RawUTF8(ExtractFileExt(FileName));
      if Result<>'' then begin
        Result[1] := '/';
        Result := 'application'+LowerCase(Result);
      end;
    end;
    end;
  end;
  if Result='' then
    Result := 'application/octet-stream';
end;
You can write a similar function for database files, using the signatures from Gary Kessler's list.
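As a rough sketch of what such a function could look like for database files (this is not part of the framework above: `GuessDatabaseFormat` and its nested helper `HasSig` are hypothetical names made up for this example). It only checks a handful of well-known headers: the 16-byte SQLite 3 header string, the "Standard Jet DB" / "Standard ACE DB" text at offset 4 of Access files, and the version byte at offset 0 of DBF files. The DBF check relies on a single byte, so treat it as a weak hint and combine it with the file extension. FileMaker is left out here; look up its signature in the list before adding it.

```pascal
uses
  SysUtils;

// Hypothetical helper: guess the database engine from the first bytes of a file.
// Returns '' when nothing matches.
function GuessDatabaseFormat(Content: PAnsiChar; Len: Integer): string;

  // True if the buffer holds Sig at the given zero-based offset
  function HasSig(Offset: Integer; const Sig: AnsiString): Boolean;
  begin
    Result := (Offset + Length(Sig) <= Len) and
      CompareMem(Content + Offset, PAnsiChar(Sig), Length(Sig));
  end;

begin
  Result := '';
  if (Content = nil) or (Len <= 0) then
    Exit;
  if HasSig(0, 'SQLite format 3'#0) then
    Result := 'SQLite 3'
  else if HasSig(4, 'Standard Jet DB') then
    Result := 'Microsoft Access (Jet, .mdb)'
  else if HasSig(4, 'Standard ACE DB') then
    Result := 'Microsoft Access (ACE, .accdb)'
  else
    // DBF files carry no text signature, only a version byte at offset 0,
    // so these matches are much less reliable than the ones above
    case Ord(Content[0]) of
      $03: Result := 'dBASE III / FoxBASE+ (no memo)';
      $30: Result := 'Visual FoxPro';
      $83: Result := 'dBASE III+ (with memo)';
      $8B: Result := 'dBASE IV (with memo)';
      $F5: Result := 'FoxPro 2.x (with memo)';
    end;
end;
```

To use it, read the first 32 bytes or so of each file into a buffer (for example with a `TFileStream`) and pass the buffer and the number of bytes actually read. Finer version information, such as the exact Access release, lives deeper in each format's header and has to be looked up in that format's documentation.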