How to extract countries from a text?

Markus · Feb 4, 2018 · Viewed 9.5k times

I use Python 3 (I also have Python 2 installed) and I want to extract countries or cities from a short text. For example, text = "I live in Spain" or text = "United States (New York), United Kingdom (London)".

The expected output for countries:

  1. Spain
  2. [United States, United Kingdom]

I tried to install the geography package, but pip install geography fails with this error:

Collecting geography
  Could not find a version that satisfies the requirement geography (from versions: )
No matching distribution found for geography

It looks like geography only works with Python 2.

I also have geopandas, but I don't know how to extract the required info from text using geopandas.

Answer

matyas · Feb 4, 2018

You could use pycountry for this task (it also works with Python 3):

pip install pycountry

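import pycountry

text = "United States (New York), United Kingdom (London)"

# Check every country in pycountry's database and print those
# whose name appears as a substring of the text.
for country in pycountry.countries:
    if country.name in text:
        print(country.name)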
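
One caveat with plain substring matching is that a short name can match inside a longer one (for example "Niger" inside "Nigeria"). Below is a minimal sketch of a whole-word variant. The function name find_countries and the SAMPLE_NAMES list are hypothetical and used here only for illustration; with pycountry installed, the list could presumably be built as [c.name for c in pycountry.countries] instead.

```python
import re

def find_countries(text, country_names):
    """Return the names found in text as whole words, in order of appearance."""
    hits = []
    for name in country_names:
        # The lookarounds reject matches that sit inside a longer word,
        # so "Niger" does not match within "Nigeria".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        match = re.search(pattern, text)
        if match:
            hits.append((match.start(), name))
    # Sort by position so results follow the order used in the text.
    return [name for _, name in sorted(hits)]

# Hypothetical sample list for illustration only (not pycountry's data).
SAMPLE_NAMES = ["Spain", "United States", "United Kingdom", "Niger", "Nigeria"]

print(find_countries("I live in Spain", SAMPLE_NAMES))
print(find_countries("United States (New York), United Kingdom (London)", SAMPLE_NAMES))
```

This still only finds names spelled exactly as they appear in the list, so a text that says "Russia" would not match an entry named "Russian Federation".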