Which geopandas datasets (maps) are available?

Martin Thoma · Jul 31, 2018

I just created a very simple geopandas example (see below). It works, but I need to be able to show a custom part of the world as the background: sometimes Germany, sometimes only Berlin. (I also want to aggregate my data by areas that I define as polygons in a geopandas file, but I'll ask about that in another question.)

How can I get a different "base map" than

world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))

for visualizations?

Example

# 3rd party modules
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
from shapely.geometry import Point
# plotting needs 'descartes'

df = pd.DataFrame({'city': ['Berlin', 'Paris', 'Munich'],
                   'latitude': [52.518611111111, 48.856666666667, 48.137222222222],
                   'longitude': [13.408333333333, 2.3516666666667, 11.575555555556]})

# Build point geometries from (longitude, latitude) pairs in WGS 84
gdf = gpd.GeoDataFrame(df.drop(['latitude', 'longitude'], axis=1),
                       crs='EPSG:4326',
                       geometry=[Point(xy)
                                 for xy in zip(df.longitude, df.latitude)])
print(gdf)

# Draw the world map as the base layer, then the cities on top of it
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
base = world.plot(color='white', edgecolor='black')
gdf.plot(ax=base, marker='o', color='red', markersize=5)

plt.show()

Answer

Martin Thoma · Aug 1, 2018

As written in the geopandas.datasets.get_path(...) documentation, the bundled datasets can be listed with

>>> geopandas.datasets.available
['naturalearth_lowres', 'naturalearth_cities', 'nybb']

where

  • naturalearth_lowres: contours of countries
  • naturalearth_cities: positions of cities
  • nybb: New York City borough boundaries
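
None of these datasets covers just one country, so one way to get a "custom part of the world" is to cut a bounding box out of a larger layer. The following is only a sketch: it uses the city points from the question, and the bounding box for Germany is a rough, hand-picked assumption, not an official extent. The same .cx indexer should work on a polygon layer such as the world map as well.

```python
import geopandas as gpd
import pandas as pd

df = pd.DataFrame({'city': ['Berlin', 'Paris', 'Munich'],
                   'latitude': [52.518611111111, 48.856666666667, 48.137222222222],
                   'longitude': [13.408333333333, 2.3516666666667, 11.575555555556]})
gdf = gpd.GeoDataFrame(df[['city']],
                       geometry=gpd.points_from_xy(df.longitude, df.latitude),
                       crs='EPSG:4326')

# Rough bounding box around Germany in degrees (assumed values)
lon_min, lon_max = 5.9, 15.1
lat_min, lat_max = 47.2, 55.1

# .cx selects the geometries that intersect the box [x-slice, y-slice]
in_germany = gdf.cx[lon_min:lon_max, lat_min:lat_max]
print(in_germany)
```

A bounding box is cruder than a real border polygon, but it needs no extra data and is often enough for choosing what to show.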

Other data sources

Searching for "germany shapefile" gave an arcgis.com URL which used the "Bundesamt für Kartographie und Geodäsie" as a source. The result of plotting vg2500_geo84/vg2500_krs.shp:

[Image: plot of vg2500_krs.shp]

Source:

© Bundesamt für Kartographie und Geodäsie, Frankfurt am Main, 2011. Reproduction, distribution, and making publicly available, also in part, are permitted provided the source is cited.

I also had to call base.set_aspect(1.4); otherwise the map looked distorted. I found the value 1.4 by trial and error.
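
A possible explanation for the aspect value, assuming the file stores plain longitude/latitude degrees (the "geo84" in the name suggests WGS 84): away from the equator one degree of longitude is shorter than one degree of latitude, roughly by a factor of cos(latitude). The sketch below computes the matching aspect ratio; the function name and the mid-latitude of 51° for Germany are my own assumptions.

```python
import math


def degree_aspect(lat_deg):
    """Aspect ratio that makes a lon/lat plot look undistorted near lat_deg."""
    return 1.0 / math.cos(math.radians(lat_deg))


# Assumed rough mid-latitude of Germany, in degrees
germany_mid_lat = 51.0
print(degree_aspect(germany_mid_lat))
```

This gives a value in the same ballpark as the 1.4 found by hand. Reprojecting the data to a projected CRS before plotting would be another way to avoid the manual aspect setting.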

Another source for such data, specifically for Berlin, is daten.berlin.de.

When geopandas reads the shapefile, the result is a GeoDataFrame with the columns

['USE', 'RS', 'RS_ALT', 'GEN', 'SHAPE_LENG', 'SHAPE_AREA', 'geometry']

with:

  • USE is 4 for all elements
  • RS is a string like '16077' or '01003'
  • RS_ALT is a string like '160770000000' or '010030000000'
  • GEN is a string like 'Saale-Holzland-Kreis' or 'Erlangen'
  • SHAPE_LENG is a float like 202986.1998816 or 248309.91235015
  • SHAPE_AREA is a float like 1.91013141e+08 or 1.47727769e+09
  • geometry is a shapely geometry, mostly POLYGON
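
These columns can be used to build coarser maps. The sketch below merges districts into larger areas, assuming the first two digits of RS identify the federal state, as in the German regional key scheme. The data is a made-up stand-in with the same column names: the squares, the GEN labels and the pairing of keys to shapes are placeholders, not values from the real file.

```python
import geopandas as gpd
from shapely.geometry import box

# Stand-in for the district layer; geometries and names are placeholders
districts = gpd.GeoDataFrame(
    {'RS': ['16077', '16001', '01003'],
     'GEN': ['district A', 'district B', 'district C']},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 5, 6, 6)],
    crs='EPSG:4326',
)

# Assumption: the first two characters of RS are the state key
districts['state_key'] = districts['RS'].str[:2]

# dissolve() unions all geometries that share the same state_key
states = districts.dissolve(by='state_key')
print(states[['geometry']])
```

With the real shapefile, gpd.read_file(...) would replace the hand-built GeoDataFrame, and the result could be plotted as a base map in the same way as the world layer.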