I'm working on a codebase that uses Spacy. I installed spacy using:
sudo pip3 install spacy
and then
sudo python3 -m spacy download en
At the end of this last command, I got a message:
Linking successful
/home/rayabhik/.local/lib/python3.5/site-packages/en_core_web_sm -->
/home/rayabhik/.local/lib/python3.5/site-packages/spacy/data/en
You can now load the model via spacy.load('en')
Now, when I try running my code, on the line:
from spacy.en import English
it gives me the following error:
ImportError: No module named 'spacy.en'
I've looked on Stackexchange and the closest is: Import error with spacy: "No module named en" which does not solve my problem.
Any help would be appreciated. Thanks.
Edit: I might have solved this by doing the following:
Python 3.5.2 (default, Sep 14 2017, 22:51:06)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
>>> spacy.load('en')
<spacy.lang.en.English object at 0x7ff414e1e0b8>
and then using:
from spacy.lang.en import English
I'm still keeping this open in case there are any other answers.
Yes, I can confirm that your solution is correct. The version of spaCy you downloaded from pip is v2.0, which includes a lot of new features, but also a few changes to the API. One of them is that all language data has been moved to a submodule, spacy.lang, to keep things cleaner and better organised. So instead of using spacy.en, you now import from spacy.lang.en.
- from spacy.en import English
+ from spacy.lang.en import English
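If the same code has to run against both older and newer spaCy installs, one option is to try the new import path first and fall back to the old one. The following is only a minimal sketch: the helper name import_english is made up for illustration and is not part of spaCy, and it returns None when neither path can be imported.

```python
import importlib


def import_english():
    """Return spaCy's English class from whichever module path exists.

    Hypothetical helper, not part of spaCy. Tries the v2.0+ location first,
    then the pre-v2.0 one, and returns None if neither can be imported.
    """
    for module_name in ("spacy.lang.en", "spacy.en"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # This path does not exist in the installed version (or spaCy
            # is not installed at all), so try the next candidate.
            continue
        return getattr(module, "English", None)
    return None


English = import_english()
if English is None:
    print("Could not import English from spacy.lang.en or spacy.en")
else:
    print("English imported from", English.__module__)
```

If you only need to support v2.0 and later, the plain import from spacy.lang.en is simpler and easier to read.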
However, it's also worth mentioning that what you download when you run spacy download en is not the same as spacy.lang.en. The language data shipped with spaCy includes the static data like tokenization rules, stop words or lemmatization tables. The en package that you can download is a shortcut for the statistical model en_core_web_sm. It includes the language data, as well as binary weights to enable spaCy to make predictions for part-of-speech tags, dependencies and named entities.
Instead of just downloading en, I'd actually recommend using the full model name, which makes it much more obvious what's going on:
python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
When you call spacy.load, spaCy does the following:
1. Find the installed model named "en_core_web_sm" (a package or shortcut link).
2. Read its meta.json and check which language it's using (in this case, spacy.lang.en), and how its processing pipeline should look (in this case, tagger, parser and ner).
See this section in the docs for more details.
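To make the meta.json lookup a bit more concrete, here is a rough sketch of the kind of information spaCy takes from it. This is not spaCy's actual loading code: the describe_model helper and the trimmed-down SAMPLE_META string are made up for illustration, and a real meta.json contains more fields than the three assumed here.

```python
import json

# Hypothetical, trimmed-down meta.json content, for illustration only.
SAMPLE_META = """
{
    "lang": "en",
    "name": "core_web_sm",
    "pipeline": ["tagger", "parser", "ner"]
}
"""


def describe_model(meta_text):
    """Summarise what a model's meta.json says about language and pipeline.

    Illustrative helper, not a spaCy function.
    """
    meta = json.loads(meta_text)
    lang = meta["lang"]
    return {
        # Full package name is the language code plus the model name.
        "package": "{}_{}".format(lang, meta["name"]),
        # The language class lives in the matching spacy.lang submodule.
        "language_module": "spacy.lang.{}".format(lang),
        # Pipeline components to set up, in order.
        "pipeline": list(meta.get("pipeline", [])),
    }


info = describe_model(SAMPLE_META)
print(info["package"])
print(info["language_module"])
print(info["pipeline"])
```

On a real install you can inspect the same information on a loaded model via nlp.meta and nlp.pipe_names.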