R: trying to find latitude/longitude data for cities in Europe and getting a geocode error message

jonas · Jan 5, 2014 · Viewed 28.2k times

I recently posted a question about plotting the positions of cities in Europe as points on a map. See R, get longitude/latitude data for cities and add it to my dataframe

The cities xlsx file contains about 20,000 cities in Europe.

I got an error message when trying to find the latitude/longitude data using geocode. I have inserted part of the code below:

cities <- read.xlsx("EU_city.xlsx",1)

# get frequencies
freq <- as.data.frame(table(cities))
library(plotrix)
freq$Freq <- rescale(freq$Freq, c(1,10)) # c(scale_min, scale_max)

# get cities latitude/longitude - kindly provided by google:
library(ggmap)
lonlat <- geocode(unique(cities)) 
cities <- cbind(freq, lonlat)

error message:

Error: is.character(location) is not TRUE

I guess some of the cities in my data frame are not found by the geocode call. Is there a way to ignore a city in the data frame if geocode does not match it?

Update after a suggestion:

I tried geocode(as.character(cities))

Then my frame looks like this:

> cities <- cbind(freq, lonlat)
> cities
                       cities  Freq lon lat
1                      ARNHEM  1.00  NA  NA
2                      ATHENS  3.25  NA  NA
3                        BAAR  1.00  NA  NA
4                BAD  VILBEL   1.00  NA  NA
5                   BILTHOVEN  1.00  NA  NA
6                      BOCHUM 10.00  NA  NA
7                       BREDA  3.25  NA  NA
8              CAMBRIDGESHIRE  3.25  NA  NA
9                   DORDRECHT  1.00  NA  NA
10                GAOETERSLOH  1.00  NA  NA
11              GELSENKIRCHEN  1.00  NA  NA
12                       GOES  1.00  NA  NA
13                  GRONINGEN  3.25  NA  NA
14  GUMMERSBACH-DIERINGHAUSEN  1.00  NA  NA
15                  HALSTEREN  1.00  NA  NA
16                   HANNOVER  1.00  NA  NA
17                 HARDERWIJK  1.00  NA  NA
18                    HEERLEN  3.25  NA  NA
19                  HILVERSUM  1.00  NA  NA

I got no lon/lat data at all, only NA.

Answer

Ben Bolker · Jan 5, 2014

You have to geocode just the cities column (it's a little confusing that you have a data frame called cities, and within it a column called cities). When in doubt, try breaking things down into smaller chunks.
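The difference between the data frame and its column can be checked without contacting the geocoding service at all. The following is a minimal sketch, assuming a small made-up data frame as a stand-in for the spreadsheet (the rows are illustrative and `city_names` is a hypothetical variable name):

```r
# Hypothetical stand-in for the data read from the spreadsheet
cities <- data.frame(cities = c("ARNHEM", "ATHENS", "BAAR", "ATHENS"),
                     stringsAsFactors = TRUE)

# unique() on a data frame returns a data frame, not a character vector,
# which is what triggers "is.character(location) is not TRUE"
is.character(unique(cities))            # FALSE

# as.character() on a whole data frame deparses each column into one string,
# so the result has one element per column, not one per city
length(as.character(cities))            # 1

# Extracting the column first gives one string per distinct city
city_names <- as.character(unique(cities$cities))
is.character(city_names)                # TRUE
length(city_names)                      # 3
```

This probably also explains the all-NA result in the update: geocode most likely received a single deparsed string per column instead of a city name.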

For example, try them one at a time ...

cities <- c("ARNHEM","ATHENS","BAAR","CAMBRIDGESHIRE")
library(ggmap)
geocode(cities[1])
##       lon     lat
## 1 5.89873 51.9851
geocode(cities[2])
## just checking ...
geocode("ATHENS GEORGIA")
##         lon   lat
## 1 -83.38333 33.95

Now try the vector all at once:

geocode(cities)
##          lon      lat
## 1  5.8987296 51.98510
## 2 23.7293097 37.98372
## 3  8.5286332 47.19585
## 4  0.0965375 52.27619

Now try with a data frame:

mydat <- read.csv(textConnection("
cities,Freq,lon,lat
ARNHEM,1.00,NA,NA
ATHENS,3.25,NA,NA
BAAR,1.00,NA,NA
BAD VILBEL,1.00,NA,NA
BILTHOVEN,1.00,NA,NA
BOGUS_PLACE,2,NA,NA"))


geocodes <- geocode(as.character(mydat$cities))
mydat <- data.frame(mydat[,1:2],geocodes)

##               cities Freq        lon      lat
## 1             ARNHEM 1.00   5.898730 51.98510
## 2             ATHENS 3.25  23.729310 37.98372
## 3               BAAR 1.00   8.528633 47.19585
## 4         BAD VILBEL 1.00   8.739480 50.18234
## 5          BILTHOVEN 1.00   5.210381 52.13653
## 6        BOGUS_PLACE 2.00 -92.201158 44.49091

I don't know what the result for BOGUS_PLACE means ...!!
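On the original question of ignoring cities that are not matched: one possible approach is to join the coordinates to the frequency table by city name and then drop the rows whose coordinates are NA. Joining by name matters because table() orders its result by sorted factor levels while unique() keeps order of first appearance, so a plain cbind() can pair a city with the wrong coordinates. The sketch below uses hand-written data frames instead of a live geocode call. The `lonlat` table and the `located` name are assumptions for illustration, and the NA row stands in for a place the service failed to match. A spurious match such as the BOGUS_PLACE result above would not be caught this way.

```r
# Hypothetical frequency table and geocoding result; coordinates for the
# real cities are copied from the output above, the NA row is made up
freq <- data.frame(cities = c("ARNHEM", "ATHENS", "BOGUS_PLACE"),
                   Freq   = c(1.00, 3.25, 2.00))
lonlat <- data.frame(cities = c("BOGUS_PLACE", "ATHENS", "ARNHEM"),
                     lon    = c(NA, 23.729310, 5.898730),
                     lat    = c(NA, 37.98372, 51.98510))

# merge() matches rows by city name, so the two tables need not share an order
located <- merge(freq, lonlat, by = "cities")

# Keep only the rows where both coordinates are present
located <- located[complete.cases(located[, c("lon", "lat")]), ]
located
```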