JCHEN
JCHEN

Reputation: 11

unnest() error: "can't be converted from numeric to character"

I used map()and read_xlsx() to extract some data (from specific range in excel file) from 36 excel xlsx files into one file. Basically, I tweaked the codes from the blog.(https://serialmentor.com/blog/2016/6/13/reading-and-combining-many-tidy-data-files-in-R, part:Keeping auxilliary information about the files read).

It worked very well. I can get a nested dataframe. But the unnest() step did not work. I found out the reason is that the unnest() can not handle mixture of the different types. One column is number type in most lists, but the column is character type in one list. so unnest() could not work.

Is it possible I can let unnest() to convert all numbers in every file to characters so the unnest will work? if no way we can let unnest do it, are there other methods to unnest the data?

the system is windows, R version 3.5 other attached packages version:

[1] readxl_1.3.0    forcats_0.4.0   stringr_1.4.0   dplyr_0.8.0.1   
purrr_0.3.1    
[6] readr_1.3.1     tidyr_0.8.3     tibble_2.0.1    ggplot2_3.1.0   
tidyverse_1.2.1

after map() and read_xlsx():

the code:

file_path <- "files" 
files_name <- dir(path = file_path, pattern = "*.xlsx", all.files =FALSE)
data <- data_frame(filename = files_name) %>% # create a data frame
 # holding the file names
  mutate(file_contents = map(filename,          # read files into
                         ~ read_xlsx(file.path(file_path, .), range = 
"A56:J60")) # a new data column
 ) '

the data is:

> data
># A tibble: 36 x 2
>  filename                   file_contents    
 >  <chr>                      <list>           
>1 JC04Feb19Run003.xlsx       <tibble [4 x 10]>
>2 JC04Feb19Run004.xlsx       <tibble [4 x 10]>
>3 JC06Feb19Run005.xlsx       <tibble [4 x 10]>
>4 JC06Feb19Run006.xlsx       <tibble [4 x 10]>
>5 JC07Feb2019Run007.xlsx     <tibble [4 x 10]>
>6 JC07Feb2019Run008.xlsx     <tibble [4 x 10]>
>7 JC11Feb2019Run009.xlsx     <tibble [4 x 10]>
>8 JC12Feb2019Run011.xlsx     <tibble [4 x 10]>
>9 JC12Feb2019Run012.xlsx     <tibble [4 x 10]>
>10 JC13Feb2019Run015 (2).xlsx <tibble [4 x 10]>
># ... with 26 more rows

after unnest(data)

> unnest(data)
Error: Column `CON` can't be coverted from numeric to character

after str(data):

..$ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':  4 obs. of  10 
variables:
.. ..$ Sample   : chr  "LQC-1" NA "LQC-2" NA
.. ..$ Wells    : chr  "H3" "H4" "C9" "C10"
.. ..$ NOM pg/mL: num  50 NA 50 NA
.. ..$ SIGNAL   : num  256 257 314 213
.. ..$ CON      : num  98.8 99.1 112.5 87.7
.. ..$ XCON     : num  99 NA 100 NA
.. ..$ SDCON    : num  0.18 NA 17.5 NA
.. ..$ CVCON    : num  0.2 NA 17.5 NA
.. ..$ %REC     : num  198 NA 200 NA
.. ..$ STATUS   : chr  "FAIL" NA "FAIL" NA
..$ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':  4 obs. of  10 
variables:
.. ..$ Sample   : chr  "LQC-1" NA "LQC-2" NA
.. ..$ Wells    : chr  "H3" "H4" "C9" "C10"
 . ..$ NOM pg/mL: num  50 NA 50 NA
.. ..$ SIGNAL   : num  878 314 9 222
.. ..$ CON      : chr  "132.69999999999999" "48.68" "Range?" 
"34.979999999999997"
.. ..$ XCON     : num  90.7 NA 35 NA
.. ..$ SDCON    : num  59.4 NA 0 NA
.. ..$ CVCON    : num  65.5 NA 0 NA
.. ..$ %REC     : num  181 NA 70 NA
.. ..$ STATUS   : chr  "FAIL" NA "PASS" NA

it showed that the CON Column is num, in 36 columns, while in one column is chr.

Upvotes: 1

Views: 1701

Answers (1)

CuriousBeing
CuriousBeing

Reputation: 1632

Use colClasses = 'character' in read_xlsx and this should work.

or

col_types = 'text' 

Upvotes: 1

Related Questions