Reputation: 13
I threw in some more reproducible code but the error is persisting.
Sorry in advance for the lack of a better question; the sample code is breaking before I have a chance to dive into it. I've also never dealt with "./{rest of url}" before. Is that the problem? I'm working with this https://programminghistorian.org/en/lessons/geospatial-data-analysis
and I'm getting this error "Error: Cannot open "./data/County1990ussm/"; The file doesn't seem to exist."
I've verified I'm in the intended working directory (one level above the data folder)
"We start by loading in the selected data. The data for this tutorial can be dowloaded here - https://programminghistorian.org/assets/geospatial-data-analysis/data.zip - . Once downloaded place all the files in a folder labeled data inside your working directory in R. We are going to create a variable and read in our data from our variable directory to it. Once run, the County_Aggregate_Data variable will contain the data and geographic information that we will analyze:"
library(sf)
library(tmap)
library(plotly)
setwd("path")
aFile <- "https://programminghistorian.org/assets/geospatial-data-analysis/data.zip"
# check to see whether file exists before downloading and unzipping it
if (!file.exists("data.zip")) {
  download.file(aFile, "data.zip", mode = "wb")
  unzip("data.zip")
}
print(list.files("./data"))
County_Aggregate_Data <- st_read("./data/County1990ussm/")
OUTPUT
[1] "County1990_Data"          "County1990ussm"
[3] "DP_TableDescriptions.xls" "ExtendedZIP5.csv"
[5] "GeocodedAddresses.csv"    "Religion"
Error: Cannot open "./data/County1990ussm/"; The file doesn't seem to exist.
Upvotes: 0
Views: 3464
Reputation: 10855
If the original poster downloaded the zip file into the ./data directory and then unzipped it there with default settings, the unzip creates another data subdirectory inside it (./data/data), so the files are one level deeper than the code expects.
Without seeing the contents of the original poster's data directory we can't tell whether this is the case. In any event, we can demonstrate how to download the file, unzip it and load one of its component files into R in a single script.
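One way to check for a nested directory is to search the working directory for the folder by name. The sketch below is only an illustration: `find_dir()` is a hypothetical helper, not part of the original post or of any package, and the demonstration builds a throwaway folder tree that mimics the suspected layout rather than touching real data.

```r
# Hypothetical helper (an assumption, not from the original post):
# return every directory under `root` whose name equals `target`.
find_dir <- function(target, root = ".") {
  dirs <- list.dirs(root, full.names = TRUE, recursive = TRUE)
  dirs[basename(dirs) == target]
}

# Demonstration: mimic the suspected nested layout in a temporary folder
demo_root <- file.path(tempdir(), "nested_demo")
dir.create(file.path(demo_root, "data", "data", "County1990ussm"),
           recursive = TRUE, showWarnings = FALSE)

hits <- find_dir("County1990ussm", demo_root)
print(hits)
```

Running `find_dir("County1990ussm")` from the real working directory would show whether the folder sits at ./data/County1990ussm or one level deeper.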
Here is a script that downloads the zip file from the website into the current R working directory, unzips it to ./data, and reads the data. We pass the mode="wb" argument to download.file() to tell R to use a binary download instead of a text download; a zip archive is a binary file, and a text-mode download can corrupt it on Windows.
aFile <- "https://programminghistorian.org/assets/geospatial-data-analysis/data.zip"
# check to see whether file exists before downloading and unzipping it
if (!file.exists("data.zip")) {
  download.file(aFile, "data.zip", mode = "wb")
  unzip("data.zip")
}
Having downloaded and unzipped the file, we can verify that its contents have been extracted to ./data with list.files().
# confirm that data is in the right directory
list.files("./data")
> list.files("./data")
[1] "County1990_Data" "County1990ussm"
[3] "DP_TableDescriptions.xls" "ExtendedZIP5.csv"
[5] "GeocodedAddresses.csv" "Religion"
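To go one step further, one can check that the folder actually contains a shapefile and, if the folder is missing, report the absolute path that the relative "./" path resolves to. This is a minimal sketch: `find_shapefiles()` is a hypothetical helper introduced here for illustration, and the demonstration uses an empty placeholder file in a temporary folder, not the tutorial data.

```r
# Hypothetical helper (an assumption, not from the original post):
# list the .shp files in `dir`, or stop with the resolved absolute path
# so it is clear where R actually looked.
find_shapefiles <- function(dir) {
  if (!dir.exists(dir)) {
    stop("Directory not found: ", normalizePath(dir, mustWork = FALSE))
  }
  list.files(dir, pattern = "\\.shp$", full.names = TRUE, ignore.case = TRUE)
}

# Demonstration with a throwaway directory and an empty placeholder file
demo_dir <- file.path(tempdir(), "County_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "placeholder.shp"))

shp <- find_shapefiles(demo_dir)
print(basename(shp))
```

In the real project, `find_shapefiles("./data/County1990ussm")` should return at least one path. If st_read() still rejects the directory, passing it that .shp path directly is one thing to try, although whether that resolves the original poster's error is not confirmed here.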
Now that we've confirmed the presence of County1990ussm, we can load the shapefile it contains into memory using the code from the original post.
library(sf)
library(tmap)
library(plotly)
County_Aggregate_Data <- st_read("./data/County1990ussm/")
head(County_Aggregate_Data)
...and the output:
> head(County_Aggregate_Data)
Simple feature collection with 6 features and 20 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -1224327 ymin: -932167.3 xmax: 1843060 ymax: 1066589
Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic
  DECADE NHGISNAM NHGISST NHGISCTY ICPSRST ICPSRCTY ICPSRNAM       STATENAM
1   1990     York     420     1330      14     1330     YORK   Pennsylvania
2   1990  Sherman     200     1810      32     1810  SHERMAN         Kansas
3   1990   Onslow     370     1330      47     1330   ONSLOW North Carolina
4   1990 Gallatin     300     0310      64      310 GALLATIN        Montana
5   1990    Ocean     340     0290      12      290    OCEAN     New Jersey
6   1990   Uvalde     480     4630      49     4630   UVALDE          Texas
  ICPSRSTI ICPSRCTYI ICPSRFIP STATE COUNTY  PID X_CENTROID Y_CENTROID  GISJOIN
1       14      1330        0   420   1330  936  1621651.3   436217.9 G4201330
2       32      1810        0   200   1810 1078  -488123.2   222198.8 G2001810
3       47      1330        0   370   1330 1114  1675585.4  -145184.9 G3701330
4       64       310        0   300   0310 1350 -1179798.5   996189.2 G3000310
5       12       290        0   340   0290 2426  1823185.7   479312.4 G3400290
6       49      4630        0   480   4630 2979  -365338.2  -901445.2 G4804630
  GISJOIN2 SHAPE_AREA SHAPE_LEN                       geometry
1  4201330 2357546914  252994.0 MULTIPOLYGON (((1617516 458...
2  2001810 2735057979  209726.2 MULTIPOLYGON (((-460646 244...
3  3701330 1993173332  723453.4 MULTIPOLYGON (((1680399 -17...
4  3000310 6559312252  558780.0 MULTIPOLYGON (((-1160194 10...
5  3400290 1704201820  649824.4 MULTIPOLYGON (((1833940 506...
6  4804630 4036829106  254728.5 MULTIPOLYGON (((-348764 -87...
>
Another benefit of this approach is that the analysis is reproducible: because the script references the source data file by URL, anyone can run the analysis without already having the data file, as long as the Programming Historian website remains available.
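One caveat with the script as written is that it unzips only when data.zip is absent, so a leftover data.zip without a data folder is never extracted. A possible variation, sketched below, checks the two steps separately. `fetch_data()` and its argument names are hypothetical, and the sketch assumes, as the listing above suggests, that the archive's top-level folder is data.

```r
# Hypothetical helper (an assumption, not from the original post):
# download the archive only if it is missing, and unzip it only if the
# data folder is missing. Returns the absolute path of the data folder.
fetch_data <- function(url, zip_path = "data.zip", data_dir = "data") {
  if (!file.exists(zip_path)) {
    download.file(url, zip_path, mode = "wb")  # binary mode for zip files
  }
  if (!dir.exists(data_dir)) {
    # Assumes the archive extracts into a top-level data/ folder
    unzip(zip_path, exdir = dirname(data_dir))
  }
  invisible(normalizePath(data_dir, mustWork = FALSE))
}
```

Called as `fetch_data(aFile)` from the working directory, it should leave the files in ./data whether or not an earlier run was interrupted.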
Upvotes: 2