I am creating a program to help my customers recover passwords placed on Office documents such as Word and Excel files. The program works, but it is MUCH slower than similar products that you can download for free. I would like to use my own program because I feel that many of the free ones aren't completely safe and lack some of the controls I would like to have.
More to the point: I need help figuring out why my program is so much slower. I created an Excel document with a simple 3-letter password, "TFX". The program I downloaded finds the password almost as fast as I can let go of the mouse button after clicking 'go'. My program takes 10 minutes. Here's the 3-character loop:
private string ThreeCharPass(string file, Microsoft.Office.Interop.Excel.Application exApp, char[] combarr)
{
    // Try every 3-character combination of the characters in combarr.
    for (int three = 0; three < combarr.Length; three++)
    {
        for (int two = 0; two < combarr.Length; two++)
        {
            for (int one = 0; one < combarr.Length; one++)
            {
                string pass = new string(new[] { combarr[three], combarr[two], combarr[one] });
                try
                {
                    // Open throws when the password is wrong, so reaching the return means it was accepted.
                    exApp.Workbooks.Open(file, false, true, Type.Missing, pass, Type.Missing, true, Type.Missing, Type.Missing, false, false, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
                    return pass;
                }
                catch (System.Runtime.InteropServices.COMException)
                {
                    // Wrong password: move on to the next candidate.
                }
            }
        }
    }
    return string.Empty; // no 3-character combination was accepted
}
The array 'combarr' is an array of chars containing all the possible characters in the password. It's generated earlier in the program based on user-selected options. I'm thinking the issue has to be in the way I'm looping through the array to create the password combinations, because this 3-character password method alone takes more than 5 minutes where other 'professional' programs take seconds. Any feedback would be greatly appreciated!!
There are some minor optimizations that you could make to your code, but the most likely culprit is the exApp.Workbooks.Open call. I think that call is very slow, since every attempt goes through COM interop into Excel, but you can confirm that with a profiler.
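If you don't have a profiler at hand, a rough timing with Stopwatch gives the same answer. This is only a sketch: AttemptTimer and AverageMilliseconds are names I made up, and the attempt delegate stands in for whatever you do per password (for example, your Workbooks.Open call wrapped in its try/catch).

```csharp
using System;
using System.Diagnostics;

public static class AttemptTimer
{
    // Runs `attempt` once per sample password and returns the
    // average wall-clock time per call in milliseconds.
    public static double AverageMilliseconds(Func<string, bool> attempt, string[] samples)
    {
        var watch = Stopwatch.StartNew();
        foreach (string candidate in samples)
        {
            attempt(candidate); // result ignored, only the cost matters here
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / samples.Length;
    }
}
```

Multiply the average by combarr.Length cubed to estimate the full 3-character run. If that estimate is close to your 10 minutes, the loops themselves are not the problem.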
What other tools do is read the actual document structure (the DOC or DOCX format, for example) and check whether the password is correct the same way Word would. I don't know the exact details, but there is very likely something stored in the file that lets Word tell whether the password was correct. An example could be a unique string that, when decrypted correctly, has the expected value, or a checksum that adds up. When you know the specification of the format, you can do that test yourself, saving you the expensive interop call.
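To make the idea concrete, here is a minimal sketch of a verifier-style check. The scheme is invented for illustration and is NOT the real Office layout: ToyHeader, Protect, IsCorrect and the single-pass SHA-256 key derivation are all assumptions of mine. The real field names, hash functions and iteration counts have to come from the format specification.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical header layout, invented for this sketch.
public sealed class ToyHeader
{
    public byte[] Salt;                  // random salt stored in the file
    public byte[] EncryptedVerifier;     // 16 random bytes, encrypted with the password key
    public byte[] EncryptedVerifierHash; // SHA-256 of those bytes, encrypted with the same key
}

public static class ToyPasswordCheck
{
    // Toy key derivation: first 16 bytes of SHA-256(salt + password).
    static byte[] DeriveKey(string password, byte[] salt)
    {
        byte[] input = salt.Concat(Encoding.Unicode.GetBytes(password)).ToArray();
        using (var sha = SHA256.Create())
            return sha.ComputeHash(input).Take(16).ToArray();
    }

    // AES-128 in ECB mode without padding; input length must be a multiple of 16.
    static byte[] Crypt(byte[] data, byte[] key, bool encrypt)
    {
        using (var aes = Aes.Create())
        {
            aes.Mode = CipherMode.ECB;
            aes.Padding = PaddingMode.None;
            aes.Key = key;
            using (var transform = encrypt ? aes.CreateEncryptor() : aes.CreateDecryptor())
                return transform.TransformFinalBlock(data, 0, data.Length);
        }
    }

    // What a (hypothetical) writer would store when protecting a file.
    public static ToyHeader Protect(string password)
    {
        byte[] salt = new byte[16], verifier = new byte[16];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
            rng.GetBytes(verifier);
        }
        byte[] key = DeriveKey(password, salt);
        byte[] hash;
        using (var sha = SHA256.Create()) hash = sha.ComputeHash(verifier);
        return new ToyHeader
        {
            Salt = salt,
            EncryptedVerifier = Crypt(verifier, key, true),
            EncryptedVerifierHash = Crypt(hash, key, true)
        };
    }

    // The check itself: decrypt both fields with the candidate's key and
    // see whether the hash of the verifier matches the stored hash.
    public static bool IsCorrect(ToyHeader header, string candidate)
    {
        byte[] key = DeriveKey(candidate, header.Salt);
        byte[] verifier = Crypt(header.EncryptedVerifier, key, false);
        byte[] expected = Crypt(header.EncryptedVerifierHash, key, false);
        using (var sha = SHA256.Create())
            return sha.ComputeHash(verifier).SequenceEqual(expected);
    }
}
```

The point is that IsCorrect only touches a few dozen bytes read once from the file, so each attempt costs a couple of hash and cipher operations instead of a round trip into Excel.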
This page has detailed information about many of the Microsoft Office formats. It is a lot of work to implement parts of those specifications, but it will surely speed things up. Only once you've removed the interop call should you look at more efficient loops, multithreading, and other strategies.
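As a rough idea of what the loop and multithreading side could look like afterwards, here is a sketch that replaces the nested loops with a single index and spreads the work with Parallel.For. CandidateSearch, FromIndex and Find are hypothetical names, and isCorrect is assumed to be a fast, thread-safe check you wrote yourself. It is not meant for the Excel interop call.

```csharp
using System;
using System.Threading.Tasks;

public static class CandidateSearch
{
    // Treats `index` as a base-N number (N = charset.Length) and maps it to a
    // candidate of the given length. The last character varies fastest,
    // which is the same order the nested loops produce.
    public static string FromIndex(long index, char[] charset, int length)
    {
        var buffer = new char[length];
        for (int pos = length - 1; pos >= 0; pos--)
        {
            buffer[pos] = charset[index % charset.Length];
            index /= charset.Length;
        }
        return new string(buffer);
    }

    // Tries every candidate of the given length in parallel.
    // Returns the first accepted candidate, or string.Empty if none is.
    public static string Find(char[] charset, int length, Func<string, bool> isCorrect)
    {
        long total = 1;
        for (int i = 0; i < length; i++) total *= charset.Length;

        string found = null;
        Parallel.For(0L, total, (i, state) =>
        {
            string candidate = FromIndex(i, charset, length);
            if (isCorrect(candidate))
            {
                found = candidate;
                state.Stop(); // ask the other workers to stop early
            }
        });
        return found ?? string.Empty;
    }
}
```

Because the length is a parameter, one method covers what would otherwise be separate one-, two- and three-character methods, for example CandidateSearch.Find(combarr, 3, isCorrect).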
Note that the Office formats are proprietary, so not all information may be available, complete, up-to-date or reliable.