TextBlob - HTTPError: HTTP Error 429: Too Many Requests

Outcast · May 17, 2019 · Viewed 7.5k times

I have a dataframe in which one column holds a list of strings in each row.

On average, each list has 150 words of about 6 characters each.

Each of the 700 rows of the dataframe represents a document, and each string is a word of that document; in other words, I have tokenised the documents into words.

I want to detect the language of each of these documents, and to do this I first try to detect the language of each word in the document.

To do so, I run the following:

from textblob import TextBlob

def lang_detect(document):
    """Count the detected language of each word (4+ characters) in a tokenised document."""
    lang_count = {}
    for word in document:
        # Skip very short words
        if len(word) >= 4:
            # Each call sends one HTTP request to the detection service
            lang_result = TextBlob(word).detect_language()
            # Increment the counter for this language, starting from 0
            lang_count[lang_result] = lang_count.get(lang_result, 0) + 1
    return lang_count

df_per_doc['languages_count'] = df_per_doc['complete_text'].apply(lambda x: lang_detect(x))

When I run this, I get the following error:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-42-772df3809bcb> in <module>
     25 
---> 27 df_per_doc['languages_count'] = df_per_doc['complete_text'].apply(lambda x: lang_detect(x))
     28 
     29 
.
.
.

    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 429: Too Many Requests

The traceback is much longer; I have omitted the middle part of it.

Now I get the same error even if I try this for only two documents/rows.

Is there any way to get a response from TextBlob for more words and documents?

Answer

aykcandem · May 22, 2020

I had the same issue when I was trying to translate tweets. Once I exceeded the rate limit, it started returning the HTTP 429 (Too Many Requests) error.

So, for anyone else who wants to work with TextBlob, it is worth checking the rate limits first. Google provides information about its limits: https://cloud.google.com/translate/quotas?hl=en

If you exceed the rate limits, you have to wait until the quotas reset at midnight Pacific Time, so it might take up to 24 hours before requests succeed again.
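To get a feel for how quickly a per-word loop can run into a quota, it may help to count the requests before sending any. The sketch below is only an illustration: `count_requests` is a hypothetical helper and the sample documents are made up. It mirrors the `len(word) >= 4` filter from the question and makes no network calls.

```python
def count_requests(documents, min_length=4):
    """Count the per-word detection calls a loop like the one in the question would make."""
    return sum(
        1
        for document in documents
        for word in document
        if len(word) >= min_length
    )


# Made-up sample data: two tokenised documents
documents = [
    ["this", "is", "a", "short", "document"],
    ["noch", "ein", "kurzes", "Dokument"],
]

per_word = count_requests(documents)  # one request per word of 4+ characters
per_document = len(documents)         # one request per document if its words are joined first

print(per_word, per_document)
```

With the figures from the question (700 documents of roughly 150 words each), the per-word approach could send up to about 700 × 150 = 105,000 requests, whereas joining each document's words into a single string and detecting once per document should need only 700.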

Alternatively, you can introduce a delay between your requests so that you do not overload the API server.

For example, to translate a list of TextBlob sentences:

import time
...
for sentence in list_of_sentences:
    sentence.translate()
    time.sleep(1)  # pause for 1 second between requests
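A fixed delay can still fail if the server keeps answering with 429. One possible variation, not part of the original answer, is to retry with exponential backoff. The sketch below assumes the error surfaces as `urllib.error.HTTPError`, as in the traceback in the question; `call_with_backoff` and `flaky` are hypothetical names used only for illustration, and `flaky` stands in for a real API call so that the example runs offline.

```python
import time
from urllib.error import HTTPError


def call_with_backoff(func, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on HTTP 429, wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_retries):
        try:
            return func()
        except HTTPError as err:
            # Re-raise errors other than 429, or the 429 itself once retries run out
            if err.code != 429 or attempt == max_retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Stand-in for a real API call: fails twice with 429, then succeeds
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise HTTPError("http://example.invalid", 429, "Too Many Requests", None, None)
    return "ok"


waits = []
# Record the waits in a list instead of actually sleeping
result = call_with_backoff(flaky, sleep=waits.append)
print(result, waits)
```

In the loop above, this could be used as `call_with_backoff(lambda: sentence.translate())`. Backoff only helps with short-term rate limits; if a daily quota is exhausted, the retries will also fail until the quota resets.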