I want to compare two voice files and find the differences between them. Suppose I have the original file of a music track and another file in which I read that track out myself. I want to compare these two files and find the differences between their phonemes. The words themselves aren't important to me; what matters is the similarity of the phonemes, and I'd like to get something like a percentage of similarity. I'd prefer to implement this in Python.
You could try audio fingerprinting using fpcalc in Chromaprint.
Chromaprint is the core component of the AcoustID project, and fpcalc is its command-line tool that computes the fingerprints. fpcalc should be placed in the same directory as the Python script.
pyacoustid provides Python bindings for Chromaprint acoustic fingerprinting and the AcoustID API:
https://pypi.python.org/pypi/pyacoustid
Below is an example article with Python demo code:
Comparing Non-Identical Audio Files for Duplicate Content with Cross-Correlated Fingerprints http://www.randombytes.org/audio_comparison.html
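To give a rough idea of how a percentage could be derived from two fingerprints, here is a minimal sketch. It is not the code from the article above. It assumes that `fpcalc -raw <file>` prints a `FINGERPRINT=` line of comma-separated 32-bit integers, and that the binary sits next to the script as `./fpcalc`. The function names `raw_fingerprint`, `bit_similarity`, and `best_similarity` are made up for this illustration. The score is simply the fraction of matching bits, tried at a few alignments.

```python
import subprocess


def raw_fingerprint(path, fpcalc="./fpcalc"):
    """Run `fpcalc -raw` on an audio file and return its fingerprint as a list of ints.

    Assumes the output contains a line like FINGERPRINT=123,456,...
    """
    out = subprocess.run(
        [fpcalc, "-raw", path], capture_output=True, text=True, check=True
    ).stdout
    for line in out.splitlines():
        if line.startswith("FINGERPRINT="):
            return [int(x) for x in line[len("FINGERPRINT="):].split(",")]
    raise ValueError("no FINGERPRINT line in fpcalc output")


def bit_similarity(a, b):
    """Fraction of equal bits over the overlapping part of two fingerprints."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    # XOR marks the differing bits; the mask keeps each value to 32 bits.
    differing = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (32.0 * n)


def best_similarity(a, b, max_offset=20):
    """Slide one fingerprint against the other and return (best score, offset)."""
    best = (0.0, 0)
    for offset in range(-max_offset, max_offset + 1):
        if offset >= 0:
            score = bit_similarity(a[offset:], b)
        else:
            score = bit_similarity(a, b[-offset:])
        if score > best[0]:
            best = (score, offset)
    return best


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        fp_a = raw_fingerprint(sys.argv[1])
        fp_b = raw_fingerprint(sys.argv[2])
        score, offset = best_similarity(fp_a, fp_b)
        print("similarity: %.1f%% at offset %d" % (score * 100, offset))
```

Run it as `python compare.py original.mp3 mine.wav` (the script and file names are placeholders). Two caveats when reading the number: unrelated audio tends to score around 50%, because random bits agree half of the time, and very short overlaps at large offsets can give misleadingly high scores. Also note that Chromaprint fingerprints are built from chroma (pitch) features, so this measures musical similarity rather than phonemes as such.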
How does Chromaprint work?