How to identify the file type even though the file-extension has been changed?

Maximin · Mar 22, 2013 · Viewed 10.3k times

Files are usually categorized by their extension. So my question is: how can I identify a file's type even if its extension has been changed?

For example, I have a video file named myVideo.mp4 and I rename it to myVideo.txt. If I double-click it, the default text editor opens the file and does not show the actual content. But if I open myVideo.txt in a video player, the video plays without any problem.

I was thinking of developing an application that determines a file's type without checking its extension and suggests software for opening it. I would like to develop the application in Java.

Answer

Adi · Mar 22, 2013

One of the best libraries for this is Apache Tika. It doesn't just read the file's header; it can also perform content analysis to detect the file type. Using Tika is very simple. Here's an example of detecting a file's type:

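import java.io.IOException;
import java.net.URL;
import org.apache.tika.Tika; // Tika facade class

public class TestTika {

    // new URL(...) and detect(URL) can both throw IOException, so main declares it
    public static void main(String[] args) throws IOException {
        Tika tika = new Tika();
        // detect(URL) opens the resource and returns its detected MIME type as a String
        String fileType = tika.detect(new URL("http://example.com/someFile.jpg"));
        System.out.println(fileType);
    }

}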
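
To make "reading the file's header" concrete for the renamed myVideo.txt case from the question, here is a minimal sketch in plain Java that compares a file's first bytes against a few well-known signatures ("magic numbers"). It is not how Tika is implemented, and the class name MagicSniffer, its methods, and the short signature list are hypothetical choices for illustration. A real detector needs far more signatures, and a header check alone cannot tell apart formats that share a container.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: guesses a MIME type from leading bytes only.
public class MagicSniffer {

    // Returns a guessed MIME type, or "application/octet-stream"
    // when none of the few signatures below matches.
    static String sniff(byte[] head) {
        if (startsWith(head, 0, 0x89, 'P', 'N', 'G')) return "image/png";
        if (startsWith(head, 0, 0xFF, 0xD8, 0xFF)) return "image/jpeg";
        if (startsWith(head, 0, '%', 'P', 'D', 'F')) return "application/pdf";
        if (startsWith(head, 0, 'P', 'K', 0x03, 0x04)) return "application/zip";
        // ISO base media files carry "ftyp" at offset 4. Reporting video/mp4
        // is an approximation: related formats (e.g. M4A, MOV) also match.
        if (startsWith(head, 4, 'f', 't', 'y', 'p')) return "video/mp4";
        return "application/octet-stream";
    }

    // True if data contains the signature bytes starting at the given offset.
    static boolean startsWith(byte[] data, int offset, int... signature) {
        if (data.length < offset + signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if ((data[offset + i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }

    // Reads up to the first 12 bytes of the file; the file name is never used.
    static String sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[12];
            int n = in.readNBytes(head, 0, head.length); // requires Java 9+
            return sniff(Arrays.copyOf(head, n));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MagicSniffer <file>");
            return;
        }
        System.out.println(sniff(Paths.get(args[0])));
    }
}
```

Running it as `java MagicSniffer myVideo.txt` inspects the content rather than the name, so a renamed MP4 should still be matched by its "ftyp" marker. Mapping the resulting MIME type to a suggested application would be a separate lookup step.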