torchtext ImportError in Colab

Aditya Shrivastava · Jan 5, 2021

I am trying to run this tutorial in Colab.

However, when I try to import a bunch of modules:

import io
import torch
from torchtext.utils import download_from_url
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

I get an ImportError for extract_archive (when it is included in the torchtext.utils import, as in the first traceback) and for build_vocab_from_iterator:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-5-a24e72502dbc> in <module>()
      1 import io
      2 import torch
----> 3 from torchtext.utils import download_from_url, extract_archive
      4 from torchtext.data.utils import get_tokenizer
      5 from torchtext.vocab import build_vocab_from_iterator

ImportError: cannot import name 'extract_archive'


ImportError                               Traceback (most recent call last)
<ipython-input-4-02a401fd241b> in <module>()
      3 from torchtext.utils import download_from_url
      4 from torchtext.data.utils import get_tokenizer
----> 5 from torchtext.vocab import build_vocab_from_iterator
      6 
      7 url = 'https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip'

ImportError: cannot import name 'build_vocab_from_iterator'

Please help me with this one.

Answer

korakot · Jan 5, 2021

You need to upgrade torchtext first:

!pip install -U torchtext==0.8.0

At the time of writing, torchtext 0.8.0 works with torch 1.7.0, so there is no need to upgrade torch or torchvision.
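Before pinning a version, it can help to see what the runtime actually has installed. The following is a minimal sketch and not part of the original answer: the helper names `installed_version`, `version_tuple`, and `needs_upgrade` are hypothetical, the 0.8.0 threshold is simply the version named above, and it assumes Python 3.8+ (for `importlib.metadata`) and mostly numeric version strings, optionally with a local suffix such as `+cu101`.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.8.0' or '1.7.0+cu101' into a comparable tuple of ints."""
    public = version.split("+")[0]  # drop a local suffix such as '+cu101'
    parts = []
    for piece in public.split("."):
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        if not leading:  # stop at non-numeric pieces such as 'dev1'
            break
        parts.append(int(leading))
    return tuple(parts)


def needs_upgrade(installed, minimum):
    """True if the package is missing or older than `minimum`."""
    if installed is None:
        return True
    return version_tuple(installed) < version_tuple(minimum)


# Report what the current runtime has; None means "not installed".
for name in ("torch", "torchtext"):
    print(name, installed_version(name))

# Assumption: 0.8.0 is the minimum, as suggested in the answer above.
print("upgrade torchtext:", needs_upgrade(installed_version("torchtext"), "0.8.0"))
```

If the last line prints True, running the pip command above and then restarting the runtime should make the newer torchtext visible to the notebook.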

Update (Sep 2021)

torchtext on Colab is now already at 0.10.0, so you don't need to upgrade anything.
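To confirm that a given runtime provides the names from the question without triggering the traceback, a check along these lines may be useful. It is only a sketch: `missing_names` is a hypothetical helper, and it uses `hasattr` on the imported module as an approximation of what `from ... import ...` would accept.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that `module_name` does not provide."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Broad on purpose: a broken install may raise something other
        # than ImportError. Treat the whole module as unavailable.
        return list(names)
    return [name for name in names if not hasattr(module, name)]


# Names used by the imports in the question.
wanted = {
    "torchtext.utils": ["download_from_url", "extract_archive"],
    "torchtext.data.utils": ["get_tokenizer"],
    "torchtext.vocab": ["build_vocab_from_iterator"],
}

for module_name, names in wanted.items():
    print(module_name, "missing:", missing_names(module_name, names))
```

An empty list for every module suggests the installed torchtext is recent enough for those imports; any name that is listed points to a version that does not provide it.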