I recently posted a question about plotting the positions of cities in Europe as points on a map. See R, get longitude/latitude data for cities and add it to my dataframe
The cities xlsx file contains about 20000 cities in Europe.
I got an error message when trying to find the latitude/longitude data using geocode. I have inserted part of the code below:
cities <- read.xlsx("EU_city.xlsx",1)
# get frequencies
freq <- as.data.frame(table(cities))
library(plotrix)
freq$Freq <- rescale(freq$Freq, c(1,10)) # c(scale_min, scale_max)
# get cities latitude/longitude - kindly provided by google:
library(ggmap)
lonlat <- geocode(unique(cities))
cities <- cbind(freq, lonlat)
error message:
Error: is.character(location) is not TRUE
I guess the data (cities) in my data frame is not found in the geocode call. Is there a way to ignore a city in the data frame if it is not matched by geocode?
Update after a suggestion:
I tried geocode(as.character(cities)).
Then my data frame looks like this:
> cities <- cbind(freq, lonlat)
> cities
cities Freq lon lat
1 ARNHEM 1.00 NA NA
2 ATHENS 3.25 NA NA
3 BAAR 1.00 NA NA
4 BAD VILBEL 1.00 NA NA
5 BILTHOVEN 1.00 NA NA
6 BOCHUM 10.00 NA NA
7 BREDA 3.25 NA NA
8 CAMBRIDGESHIRE 3.25 NA NA
9 DORDRECHT 1.00 NA NA
10 GAOETERSLOH 1.00 NA NA
11 GELSENKIRCHEN 1.00 NA NA
12 GOES 1.00 NA NA
13 GRONINGEN 3.25 NA NA
14 GUMMERSBACH-DIERINGHAUSEN 1.00 NA NA
15 HALSTEREN 1.00 NA NA
16 HANNOVER 1.00 NA NA
17 HARDERWIJK 1.00 NA NA
18 HEERLEN 3.25 NA NA
19 HILVERSUM 1.00 NA NA
I got no lon/lat data at all, only NA.
You have to geocode just the cities column (it's a little confusing that you have a data frame called cities, and within it a column called cities). When in doubt, try breaking things down into smaller chunks.
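To see why the original call fails, and why wrapping the whole data frame in as.character() gives only NA, it can help to look at what is actually being handed to geocode(). The sketch below uses a small made-up data frame (the name cities_df is hypothetical, not from the question) and needs no network access:

```r
# Hypothetical stand-in for the data frame read from the spreadsheet
cities_df <- data.frame(cities = c("ARNHEM", "ATHENS", "BAAR", "ATHENS"),
                        stringsAsFactors = FALSE)

# A data frame is not a character vector, which is what triggers
# "is.character(location) is not TRUE"
is.character(cities_df)

# as.character() on a data frame yields one deparsed string per column,
# not one string per city, so there is nothing sensible to look up
whole <- as.character(cities_df)
length(whole)

# Extracting the column gives one string per row
city_names <- as.character(cities_df$cities)
is.character(city_names)
length(unique(city_names))
```

Only the last form, a plain character vector of city names, is something geocode() can work with.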
For example, try them one at a time ...
cities <- c("ARNHEM","ATHENS","BAAR","CAMBRIDGESHIRE")
library(ggmap)
geocode(cities[1])
## lon lat
## 1 5.89873 51.9851
geocode(cities[2])
## just checking ...
geocode("ATHENS GEORGIA")
## lon lat
## 1 -83.38333 33.95
Now try the vector all at once:
geocode(cities)
## lon lat
## 1 5.8987296 51.98510
## 2 23.7293097 37.98372
## 3 8.5286332 47.19585
## 4 0.0965375 52.27619
Now try with a data frame:
mydat <- read.csv(textConnection("
cities,Freq,lon,lat
ARNHEM,1.00,NA,NA
ATHENS,3.25,NA,NA
BAAR,1.00,NA,NA
BAD VILBEL,1.00,NA,NA
BILTHOVEN,1.00,NA,NA
BOGUS_PLACE,2,NA,NA"))
geocodes <- geocode(as.character(mydat$cities))
mydat <- data.frame(mydat[,1:2],geocodes)
## cities Freq lon lat
## 1 ARNHEM 1.00 5.898730 51.98510
## 2 ATHENS 3.25 23.729310 37.98372
## 3 BAAR 1.00 8.528633 47.19585
## 4 BAD VILBEL 1.00 8.739480 50.18234
## 5 BILTHOVEN 1.00 5.210381 52.13653
## 6 BOGUS_PLACE 2.00 -92.201158 44.49091
I don't know what the result for BOGUS_PLACE means ...!!
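As for the original question of ignoring cities that cannot be matched: a failed lookup shows up as NA in lon and lat (as in the output in the question), so one option is to drop those rows after geocoding. Here is a minimal sketch that uses a hand-made geocodes data frame in place of a real geocode() call so that it runs offline; the NO_SUCH_CITY row, its NA result, and the object names are made up for illustration:

```r
# Frequency table as in the question (values invented for the example)
freq <- data.frame(cities = c("ARNHEM", "ATHENS", "NO_SUCH_CITY"),
                   Freq   = c(1, 3.25, 2),
                   stringsAsFactors = FALSE)

# Stand-in for: geocodes <- geocode(as.character(freq$cities))
# The last row imitates a lookup that returned nothing.
geocodes <- data.frame(lon = c(5.898730, 23.729310, NA),
                       lat = c(51.98510, 37.98372, NA))

# Combine, then keep only rows where both coordinates are present
result  <- cbind(freq, geocodes)
matched <- result[complete.cases(result[, c("lon", "lat")]), ]
matched
```

Geocoding freq$cities rather than unique() of the raw column also keeps the rows aligned for cbind(), since table() sorts its levels while unique() keeps order of first appearance. Note that this filter cannot catch a bogus name that the service resolves to some real location, as happened with BOGUS_PLACE above.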