Reputation: 53
I have a problem with R and I don't know how to solve it. I am working with a database of 47228 observations and 68 columns of an excel sheet. The case is that I have 7 columns in which in the excel 47719 observations of 47728 (99.99%) appear as NA and the other cases as dates, and surprisingly when I import the base with the read_excel function of the readxl package, R import it as 100% as NA. In other words, in those 9 observations that are dates, R doesn't show the reliable content.
The problem is that I need to be able to analyze the database including that data and I cannot, because it does not appear.
As it is a problem that arises with the import, I am trying to explain to R that those 7 columns process them as dates, but I can't do it. The read_excel function has an argument which is col_types but I have to fill it with a vector that specifies the type of the 68 columns, and what I am looking for is to simply specify those 8 columns. Does anyone know how to do it? I saw that there is a function called read_table from the readr package that allows on import to specify the class of only the variables that interest me, with the argument cols_only =, but for some reason when I execute that function R aborts the session with a warning "R encountered a fatal error."
Does anyone know what this could be? Greetings and thank you very much. Sorry for my english.
#Code with which R imports those 7 columns as 100% NA
library(readxl)
base_neg<- read_excel("Data/Bases/Canonica/Listado 2012_2014 NEGATIVAS.xls")
#Code with which I can supposedly achieve what I want but R collapses, specifying only 1 of the 7 variables needed.
library(readr)
base_neg<- read_table("Data/Bases/Canonica/Listado 2012_2014 NEGATIVAS.xls",
col_types = cols_only(fecha_colpo_nav = "D"))```
Upvotes: 0
Views: 754
Reputation: 10637
Start by importing everything as characters. Then, you can use readr::type_convert
to parse some columns afterwards. The argument .default
will be applied to all other columns.
library(readxl)
library(readr)
data <- read_excel("file.xlsx", col_types = "text")
data
# A tibble: 2 x 4
# col1 col2 col3 col4
# <chr> <chr> <chr> <chr>
# 1 Foo 44542 1 TRUE
# 2 bar 44541 2 FALSE
# set the type to all other columns to character
type_convert(data, col_types = cols(col4 = col_logical(), .default = col_character()))
# A tibble: 2 x 4
# col1 col2 col3 col4
# <chr> <chr> <chr> <lgl>
# 1 Foo 44542 1 TRUE
# 2 bar 44541 2 FALSE
# guess the type of all other columns
# but enforce col4 to be logical
type_convert(data, col_types = cols(col4 = col_logical()))
# A tibble: 2 x 4
# col1 col2 col3 col4
# <chr> <dbl> <dbl> <lgl>
# 1 Foo 44542 1 TRUE
# 2 bar 44541 2 FALSE
Upvotes: 1