Translate Hindi language to English using Python

Nitin Patil · Mar 3, 2016 · Viewed 10.2k times

I'm working on a project that recognizes speech in a language (English, Hindi, Marathi, etc.) given an origin language code, and translates it into another language given a target language code.

Everything is written in Python.

I use the Google API to recognize the speech and convert it to text, then the Microsoft API to translate that text into the target language.

But I'm getting the following error:

Traceback (most recent call last):
  File "pitranslate.py", line 60, in <module>
    translation_result = requests.get(translation_url + urllib.urlencode(translation_args), headers = headers)
  File "/usr/lib/python2.7/urllib.py", line 1332, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

My input: क्या कर रहे हो

Here is the complete code:

import json
import requests
import urllib
import subprocess
import argparse
import speech_recognition as sr
from subprocess import call

parser = argparse.ArgumentParser(description='This is a demo script by DaveConroy.com.')
parser.add_argument('-o','--origin_language', help='Origin Language',required=True)
parser.add_argument('-d','--destination_language', help='Destination Language', required=True)
#parser.add_argument('-t','--text_to_translate', help='Text to Translate', required=True)
args = parser.parse_args()

## show values ##
print ("Origin: %s" % args.origin_language )
print ("Destination: %s" % args.destination_language )
#print ("Text: %s" % args.text_to_translate )

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")

    audio = r.listen(source)
args.text_to_translate = r.recognize_google(audio, language=args.origin_language) 
text = args.text_to_translate
#text=r.recognize_google(audio)
print text
origin_language=args.origin_language
destination_language=args.destination_language


def speakOriginText(phrase):
    googleSpeechURL = "http://translate.google.com/translate_tts?tl="+ origin_language +"&q=" + phrase
    subprocess.call(["mplayer",googleSpeechURL], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

def speakDestinationText(phrase):
    googleSpeechURL = "http://translate.google.com/translate_tts?tl=" + destination_language +"&q=" + phrase
    print googleSpeechURL
    subprocess.call(["mplayer",googleSpeechURL], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

args = {
        'client_id': 'create and enter your client id',
        'client_secret': 'create id and enter here',#your azure secret here
        'scope': 'http://api.microsofttranslator.com',
        'grant_type': 'client_credentials'
    }

oauth_url = 'https://datamarket.accesscontrol.windows.net/v2/OAuth2-13'
oauth_junk = json.loads(requests.post(oauth_url,data=urllib.urlencode(args)).content)
translation_args = {
        'text': text,
        'to': destination_language,
        'from': origin_language
        }

headers={'Authorization': 'Bearer '+oauth_junk['access_token']}
translation_url = 'http://api.microsofttranslator.com/V2/Ajax.svc/Translate?'
translation_result = requests.get(translation_url+urllib.urlencode(translation_args),headers=headers)
translation=translation_result.text[2:-1]

speakOriginText('Translating ' + translation_args["text"])
speakDestinationText(translation)

How can I overcome this error?

Answer

Jack Sparrow · Mar 8, 2017

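This error comes from mixing byte strings and Unicode text in Python 2, so you have to convert explicitly using UTF-8. For example, if you have text in another language as a UTF-8 byte string, such as My_input = "क्या कर रहे हो", use decode to turn it into a unicode object before working with it:

# -*- coding: utf-8 -*-
My_input = "क्या कर रहे हो"              # UTF-8 byte string in Python 2
My_input = My_input.decode("utf-8")   # now a unicode object

In the same way, .encode("utf-8") converts a unicode object back into a UTF-8 byte string. The traceback shows urllib.urlencode calling str(v) on the value, and in Python 2 that fails for a unicode object containing non-ASCII characters, so the text should be encoded to UTF-8 before it is passed to urlencode.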
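Applied to the question's code, the conversion could look like the minimal sketch below. It assumes the recognized text arrives as a Unicode string (as the traceback suggests) and encodes it to UTF-8 bytes before calling urlencode. The helper name build_query is hypothetical and not part of the original script; the try/except import is only there so the same sketch runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3


def build_query(text, to_lang, from_lang):
    """Build the translation query string (hypothetical helper).

    The text is encoded to UTF-8 bytes first, so urlencode never has to
    fall back on the ASCII codec for non-ASCII characters.
    """
    if not isinstance(text, bytes):
        text = text.encode("utf-8")
    # A list of pairs keeps the parameter order predictable.
    return urlencode([("text", text), ("to", to_lang), ("from", from_lang)])


query = build_query(u"क्या कर रहे हो", "en", "hi")
print(query)
```

In the original script, the equivalent change would be to pass text.encode("utf-8") as the 'text' value of translation_args. On Python 3, urlencode accepts Unicode strings directly and encodes them as UTF-8 by default, so the explicit encode is only strictly needed on Python 2.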