How can I determine if a file is binary or text in C#?

Pablo Retyk · May 26, 2009 · Viewed 44.9k times

I need to determine, with roughly 80% accuracy, whether a file is binary or text. Is there any way to do it in C#, even a quick and dirty/ugly one?

Answer

zvolkov · May 26, 2009

One approach is to model the files as Markov chains. Scan a few sample files of both kinds and, for each byte value from 0 to 255, gather stats (basically the probability) of each value that can follow it. This gives you a 256x256 profile (64K entries) that you can compare your runtime files against, within some % threshold.

Supposedly, this is how browsers' Auto-Detect Encoding feature works.
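
A minimal sketch of the idea above, under some assumptions that are not in the original answer: the names ByteTransitionProfile and FileKindGuesser are hypothetical, add-one smoothing is used so unseen transitions do not get zero probability, only the first 8 KB of a file is read, and instead of a % threshold the file is scored against one profile per kind and the better fit wins.

```csharp
using System;
using System.IO;

// Hypothetical helper: a first-order byte-transition (Markov) profile.
public sealed class ByteTransitionProfile
{
    // _logProb[prev, next] = log P(next byte | previous byte)
    private readonly double[,] _logProb = new double[256, 256];

    // Build a profile from sample data of one kind (text or binary).
    public static ByteTransitionProfile Train(params byte[][] samples)
    {
        var counts = new long[256, 256];
        foreach (byte[] data in samples)
            for (int i = 1; i < data.Length; i++)
                counts[data[i - 1], data[i]]++;

        var profile = new ByteTransitionProfile();
        for (int prev = 0; prev < 256; prev++)
        {
            long rowTotal = 0;
            for (int next = 0; next < 256; next++)
                rowTotal += counts[prev, next];

            // Add-one smoothing (an assumption of this sketch): transitions
            // never seen in training keep a small, nonzero probability.
            for (int next = 0; next < 256; next++)
                profile._logProb[prev, next] =
                    Math.Log((counts[prev, next] + 1.0) / (rowTotal + 256.0));
        }
        return profile;
    }

    // Average log-probability per transition; higher means a better fit.
    public double Score(byte[] data)
    {
        if (data.Length < 2) return 0.0;
        double sum = 0.0;
        for (int i = 1; i < data.Length; i++)
            sum += _logProb[data[i - 1], data[i]];
        return sum / (data.Length - 1);
    }
}

public static class FileKindGuesser
{
    // Reads at most maxBytes from the start of the file.
    public static byte[] ReadPrefix(string path, int maxBytes = 8192)
    {
        using (var stream = File.OpenRead(path))
        {
            var buffer = new byte[maxBytes];
            int total = 0, read;
            while (total < maxBytes &&
                   (read = stream.Read(buffer, total, maxBytes - total)) > 0)
                total += read;
            Array.Resize(ref buffer, total);
            return buffer;
        }
    }

    // Guesses "text" when the text profile explains the data better.
    public static bool LooksLikeText(byte[] data,
        ByteTransitionProfile textProfile, ByteTransitionProfile binaryProfile)
    {
        return textProfile.Score(data) > binaryProfile.Score(data);
    }
}
```

To use it, train one profile on the prefixes of a few known text files and another on a few known binaries, then call FileKindGuesser.LooksLikeText(FileKindGuesser.ReadPrefix(path), textProfile, binaryProfile). How accurate this is depends entirely on how representative the sample files are.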