roshualine

Reputation: 81

Downloading Geonames

I am interested in downloading the lake GeoNames records for Canada. The maximum number of rows that can be downloaded per day is 1000. When I run the code below, some records are missed and some overlap. Is there a way to get the total number of lake GeoNames records available and download each record only once, without any overlap?

library(geonames)

GN_lake <- GNsearch(featureCode = 'LK', country = 'CA', startRow = 1, maxRows = 1000)

GN_lake <- GNsearch(featureCode = 'LK', country = 'CA', startRow = 1000, maxRows = 1000)

Upvotes: 1

Views: 478

Answers (1)

hrbrmstr

Reputation: 78832

Why not just work with the CA database locally?

library(httr)
library(tidyverse)

# Get CA database
httr::GET(
  url = "http://download.geonames.org/export/dump/CA.zip",
  httr::write_disk("CA.zip"),
  httr::progress()
) -> res

# unzip it
unzip("CA.zip")

read.csv( # readr::read_tsv doesn't like this file at least when I read it
  file = "CA.txt",
  header = FALSE,
  sep = "\t",
  col.names = c(
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country", "cc2",
    "admin1_code1", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date"
  ),
  stringsAsFactors = FALSE
) %>% as_tibble() -> ca_geo # as_tibble() replaces the deprecated tbl_df()

filter(ca_geo, feature_code == "LK")
## # A tibble: 104,663 x 19
##    geonameid name          asciiname     alternatenames latitude longitude
##        <int> <chr>         <chr>         <chr>             <dbl>     <dbl>
##  1   5881640 101 Mile Lake 101 Mile Lake ""                 51.7    -121. 
##  2   5881642 103 Mile Lake 103 Mile Lake ""                 51.7    -121. 
##  3   5881644 105 Mile Lake 105 Mile Lake ""                 51.7    -121. 
##  4   5881647 108 Mile Lake 108 Mile Lake ""                 51.7    -121. 
##  5   5881660 130 Mile Lake 130 Mile Lake ""                 51.9    -122. 
##  6   5881666 16 1/2 Mile … 16 1/2 Mile … ""                 52.7    -118. 
##  7   5881668 180 Lake      180 Lake      ""                 57.4    -130. 
##  8   5881673 {1}útsaw Lake {1}utsaw Lake ""                 62.7    -137. 
##  9   5881680 24 Mile Lake  24 Mile Lake  ""                 46.5     -82.0
## 10   5881683 28 Mile Lake  28 Mile Lake  ""                 54.8    -124. 
## # ... with 104,653 more rows, and 13 more variables: feature_class <chr>,
## #   feature_code <chr>, country <chr>, cc2 <chr>, admin1_code1 <int>,
## #   admin2_code <chr>, admin3_code <int>, admin4_code <chr>,
## #   population <int>, elevation <int>, dem <int>, timezone <chr>,
## #   modification_date <chr>
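
To address the "total number" and "no overlap" parts of the question directly, here is a minimal base R sketch. It assumes that `geonameid` uniquely identifies a record; the helper name `unique_lakes` and the toy data frame are made up for illustration, and in practice you would pass the `ca_geo` tibble built above.

```r
# Hypothetical helper: keep only lake records and drop repeated geonameids.
unique_lakes <- function(geo) {
  # which() drops rows where feature_code is NA
  lakes <- geo[which(geo$feature_code == "LK"), , drop = FALSE]
  lakes[!duplicated(lakes$geonameid), , drop = FALSE]
}

# Toy stand-in for ca_geo (made-up rows, only the columns used here).
toy_geo <- data.frame(
  geonameid    = c(1L, 2L, 3L, 3L, 4L),
  name         = c("Lake A", "Lake B", "Lake C", "Lake C", "Town D"),
  feature_code = c("LK", "LK", "LK", "LK", "PPL"),
  stringsAsFactors = FALSE
)

lakes <- unique_lakes(toy_geo)

nrow(lakes)                     # total number of distinct lake records
anyDuplicated(lakes$geonameid)  # 0 means no record appears twice
```

The same helper should also work on pages combined with `rbind()` from several `GNsearch()` calls, provided the result keeps the `geonameid` and `feature_code` columns under those names (an assumption worth checking against what the API returns).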

Upvotes: 2
