Reputation: 1089
I'm trying to download and parse the Data worksheet in the file ie_data.xls from Professor Robert Shiller's home page (http://www.econ.yale.edu/~shiller/data.htm). I download the file from http://www.econ.yale.edu/~shiller/data/ie_data.xls, and then run the following script:
library(tidyverse)
ie_data <- read_excel("ie_data.xls", sheet = "Data", col_names = TRUE,
col_types = "numeric", na = "", skip = 7) %>%
select(Date,E) %>%
drop_na()
A bunch of warnings are generated, but more bothersome is the output
> names(ie_data)
[1] "Date" "E"
> ie_data
# A tibble: 1,791 x 2
Date E
<dbl> <dbl>
1 1871. 0.4
2 1871. 0.4
3 1871. 0.4
4 1871. 0.4
5 1871. 0.4
6 1871. 0.4
7 1871. 0.4
8 1871. 0.4
9 1871. 0.4
10 1871. 0.4
# ... with 1,781 more rows
Warning message:
`...` is not empty.
We detected these problematic arguments:
* `needs_dots`
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
The contents of both columns should have two decimal places (1871.01 represents January 1871, 1871.02 represents February 1871 and so on, and the second column is earnings per share rounded to the nearest penny), but everything after the decimal point is gone in the first column at the head of the dataframe! Even more mysterious is its tail:
> tail(ie_data)
# A tibble: 6 x 2
Date E
<dbl> <dbl>
1 2019. 135.
2 2019. 137.
3 2019. 139.
4 2020. 132.
5 2020. 124.
6 2020. 116.
Warning message:
`...` is not empty.
We detected these problematic arguments:
* `needs_dots`
These dots only exist to allow future extensions and should be empty.
Did you misspecify an argument?
Now both columns have lost their fractional part! What change do I need to make to my code in order to read these columns correctly?
Sincerely and with many thanks in advance
Thomas Philips
Upvotes: 0
Views: 1034
Reputation: 314
try it with col_types = "text"
Don't really know why numeric will get you trimmed numbers but i seem to get it working with text (provided you later convert to a rounded number)
Upvotes: 1
Reputation: 7385
You can do the following to see more significant digits in your console when printing your data with ie_data
. This doesn't affect your data, only the way it is shown when printed to your console.
options(pillar.sigfig = 10)
ie_data
Which will show:
Date E
<dbl> <dbl>
1 1871.01 0.4
2 1871.02 0.4
3 1871.03 0.4
4 1871.04 0.4
5 1871.05 0.4
6 1871.06 0.4
7 1871.07 0.4
8 1871.08 0.4
9 1871.09 0.4
10 1871.1 0.4
# ... with 1,781 more rows
If you use the following:
options(pillar.sigfig = 1)
ie_data
You will get:
# A tibble: 1,791 x 2
Date E
<dbl> <dbl>
1 1871. 0.4
2 1871. 0.4
3 1871. 0.4
4 1871. 0.4
5 1871. 0.4
6 1871. 0.4
7 1871. 0.4
8 1871. 0.4
9 1871. 0.4
10 1871. 0.4
# ... with 1,781 more rows
Upvotes: 1