Reputation: 41
I tried to convert a character variable to an integer variable using the as.integer function. However, when the code is executed, the values come back as NA. The code is as follows:
library(tidyverse)
coal_data <- read.csv("http://594442.youcanlearnit.net/coal.csv", skip = 2)
coal_data %>% glimpse()
colnames(coal_data)[1] <- "region"
coal_long <- gather(coal_data, 'year', 'coal_consumption', -region)
coal_long %>% glimpse()
coal_long %>% separate(year, into = c("x", "year"), sep = "X")%>%
select(-x)%>% glimpse()
class(coal_long$year)
coal_long$year <- as.integer(coal_long$year)
The output was as follows:
coal_long %>% glimpse()
Rows: 6,960
Columns: 3
$ region <fct> "North America", "Bermuda", "Canada", "Greenland", "Mexico",...
$ year <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ coal_consumption <chr> "16.45179", "0", "0.96156", "0.00005", "0.10239", "0", "15.3...
The expected output is the year column in integer form. Many thanks in advance for looking into this.
Upvotes: 0
Views: 539
Reputation: 3923
May as well make coal_consumption a double while you're at it...
library(tidyverse)
coal_data <- read.csv("http://594442.youcanlearnit.net/coal.csv", skip = 2, na.strings = "--")
colnames(coal_data)[1] <- "region"
coal_long <- gather(coal_data, 'year', 'coal_consumption', -region)
coal_long %>% glimpse()
#> Rows: 6,960
#> Columns: 3
#> $ region <chr> "North America", "Bermuda", "Canada", "Greenland", "…
#> $ year <chr> "X1980", "X1980", "X1980", "X1980", "X1980", "X1980"…
#> $ coal_consumption <dbl> 16.45179, 0.00000, 0.96156, 0.00005, 0.10239, 0.0000…
coal_long <- coal_long %>% separate(year, into = c("x", "year"), sep = "X") %>%
select(-x) %>% glimpse()
#> Rows: 6,960
#> Columns: 3
#> $ region <chr> "North America", "Bermuda", "Canada", "Greenland", "…
#> $ year <chr> "1980", "1980", "1980", "1980", "1980", "1980", "198…
#> $ coal_consumption <dbl> 16.45179, 0.00000, 0.96156, 0.00005, 0.10239, 0.0000…
class(coal_long$year)
#> [1] "character"
coal_long$year <- as.integer(str_remove(coal_long$year, "X"))
glimpse(coal_long)
#> Rows: 6,960
#> Columns: 3
#> $ region <chr> "North America", "Bermuda", "Canada", "Greenland", "…
#> $ year <int> 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980…
#> $ coal_consumption <dbl> 16.45179, 0.00000, 0.96156, 0.00005, 0.10239, 0.0000…
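The na.strings = "--" argument in the read.csv call above is what lets coal_consumption come in as a double instead of a character column. A minimal sketch of that effect on a tiny inline CSV (the rows and values here are made up for illustration, not taken from coal.csv):

```r
# Hypothetical two-row CSV with "--" marking a missing value
csv_text <- "region,X1980,X1981
Bermuda,0,--
Canada,0.96156,0.99047"

# Without na.strings, the "--" entry stops the column from being read as numeric
raw <- read.csv(text = csv_text)
is.numeric(raw$X1981)    # FALSE

# With na.strings = "--", the entry becomes NA and the column is numeric
clean <- read.csv(text = csv_text, na.strings = "--")
is.numeric(clean$X1981)  # TRUE
clean$X1981[1]           # NA
```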
Upvotes: 3
Reputation: 461
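You need to re-assign coal_long after removing the X in the year column. In your code the result of separate() is only piped to glimpse() and never stored, so coal_long$year still holds values like "X1980", which as.integer turns into NA.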
coal_long <- coal_long %>%
separate(year, into = c("x", "year"), sep = "X") %>%
select(-x) %>%
glimpse()
coal_long$year <- as.integer(coal_long$year)
coal_long %>% glimpse()
Rows: 6,960
Columns: 3
$ region <fct> "North America", "Bermuda", "Canada", "Greenland", "Mexico", "Saint Pierre and Miquelon", "United States", "Cent…
$ year <int> 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980…
$ coal_consumption <chr> "16.45179", "0", "0.96156", "0.00005", "0.10239", "0", "15.38779", "0.42011", "0", "0", "0.03476", "--", "0", "0…
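To see why the re-assignment matters, here is a small sketch on a toy data frame (the object name df and its two rows are assumptions for illustration only). A pipeline returns a new data frame and leaves the original untouched unless the result is stored:

```r
library(tidyr)
library(dplyr)

# Toy stand-in for coal_long (illustrative values only)
df <- data.frame(region = c("Bermuda", "Canada"),
                 year = c("X1980", "X1981"),
                 stringsAsFactors = FALSE)

# Result is computed and discarded; df itself is not modified
df %>% separate(year, into = c("x", "year"), sep = "X") %>% select(-x)
before <- df$year            # still "X1980" "X1981"

# Storing the result keeps the cleaned column
df <- df %>% separate(year, into = c("x", "year"), sep = "X") %>% select(-x)
after <- as.integer(df$year) # 1980 1981
```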
Upvotes: 2
Reputation: 2253
You need to remove the letters from coal_long$year before you convert to an integer. Try something like this:
coal_long$year # X1980 X1981 X1982 X1983, etc.
as.integer(str_remove(coal_long$year, "X"))
Here's a more generic approach that extracts all digits from the string before converting.
as.integer(str_extract(coal_long$year, "\\d+"))
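As a quick check of both approaches, the sketch below runs them on a short toy vector (year_raw is a made-up stand-in for coal_long$year) and also shows the direct conversion that produces the NA values:

```r
library(stringr)

# Toy vector standing in for coal_long$year (illustrative values)
year_raw <- c("X1980", "X1981", "X2009")

# Direct conversion: "X1980" is not a valid number, so every element becomes NA
direct <- suppressWarnings(as.integer(year_raw))

# Approach 1: drop the literal "X" prefix, then convert
year_removed <- as.integer(str_remove(year_raw, "X"))

# Approach 2: pull out the first run of digits, then convert
year_extracted <- as.integer(str_extract(year_raw, "\\d+"))

identical(year_removed, year_extracted)  # TRUE
```

The digit-extraction version does not depend on the prefix being exactly "X", which is why it is the more generic of the two.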
Upvotes: 1