Reputation: 200
I am trying to download data from Eurostat using the eurostat package in R.
The dataset can be downloaded by specifying its "code_id", which in this case is "edat_lfse_33".
However, I got stuck: when I run the code below, my computer crashes or returns a memory allocation error.
library(eurostat)
library(dplyr)
library(ggplot2)
library(stringr)
data=get_eurostat("edat_lfse_33")
That returns a huge tibble, as follows:
# A tibble: 2,914,673 x 8
unit sex isced11 duration age geo time values
<fct> <fct> <fct> <fct> <fct> <fct> <dbl> <dbl>
1 PC F ED0-2 TOTAL Y15-34 AT 2018 49.9
2 PC F ED0-2 TOTAL Y15-34 AT1 2018 48.4
(..)
Then, once I try to join this tibble with the output of get_eurostat_geospatial, my PC crashes.
mapdata <- get_eurostat_geospatial(nuts_level = 2, resolution='60',
year=2016,
output_class = 'df') %>%
right_join(data)%>%
mutate(cat = cut_to_classes(values, n=2, decimals = 1))
Could someone help me out?
Upvotes: 1
Views: 89
Reputation: 19349
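You can't sensibly join two data tables when both of them contain duplicates in the join column: every row for a given geo in one table gets paired with every row for that geo in the other, and with almost 3 million rows in data the result no longer fits in memory. The mapdata is OK, because it contains the map data in the correct format for ggplot (several polygon points per region), but the data table needs to be summarized to one row per geo before it can be joined.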
data2 <- data %>%
group_by(geo) %>%
summarise(Values=mean(values, na.rm=TRUE))
library(ggplot2)
data2 %>% right_join(mapdata, by="geo") %>%
#mutate(cat = cut_to_classes(Values, n=2, decimals = 1)) %>%
ggplot(aes(long, lat, group=group)) +
geom_polygon(aes(fill=Values))
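To see why the row count explodes, and why reducing to one row per geo fixes it, here is a minimal sketch on toy data. The tables values_tbl and map_tbl and their numbers are made up for illustration; they only mimic the shape of the Eurostat tibble and of the fortified map data frame. Averaging over everything is one way to get one row per region; filtering to a single sex/year slice (which slice you want is an assumption on my part) is another.

```r
library(dplyr)

# Toy stand-in for the Eurostat tibble: several rows per region (geo).
values_tbl <- tibble(
  geo    = rep(c("AT11", "AT12"), each = 4),
  sex    = rep(c("F", "M"), times = 4),
  time   = rep(c(2017, 2018, 2017, 2018), each = 2),
  values = c(49.9, 48.4, 51.2, 50.1, 60.3, 59.8, 61.0, 62.5)
)

# Toy stand-in for the map data frame: several polygon points per region.
map_tbl <- tibble(
  geo  = rep(c("AT11", "AT12"), each = 3),
  long = c(16.0, 16.5, 16.2, 15.0, 15.6, 15.3),
  lat  = c(47.5, 47.6, 47.9, 48.0, 48.1, 48.4)
)

# Naive join: each of the 4 value rows is paired with each of the 3 map rows
# of the same region (recent dplyr versions warn about this many-to-many join).
naive <- right_join(map_tbl, values_tbl, by = "geo")
nrow(naive)

# Option 1: collapse to one value per region, as in the answer above.
per_region <- values_tbl %>%
  group_by(geo) %>%
  summarise(Values = mean(values, na.rm = TRUE))

# Option 2: keep a single slice instead of averaging across sex and year.
one_slice <- values_tbl %>%
  filter(sex == "F", time == 2018) %>%
  select(geo, Values = values)

# With one row per geo, the join keeps exactly the rows of the map table.
joined <- right_join(per_region, map_tbl, by = "geo")
nrow(joined) == nrow(map_tbl)
```

On the real data the same multiplication happens with roughly 2.9 million rows on one side, which is where the memory error comes from.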
Upvotes: 2