JRR

readr read_fwf strange parsing error: embedded null

I am trying to use readr::read_fwf to read in a .txt file. I know all of the column widths, but I receive the following parsing warning, which I do not know how to resolve:

 fwf_widths <- c(12, 2, 6, ...) 
 fwf_names <- c("V1", "V2", "V3", ...)
 col_types <- c("ccc...")

 df <- read_fwf(file = file, fwf_widths(fwf_widths, fwf_names), 
                         col_types = col_types)
Warning: 1 parsing failure.
   row         col expected        actual                 file
372722 description          embedded null /path/to/my/file.txt

I've tried adding trim_ws = TRUE, which does not get rid of the warning. I looked at the actual contents of df[372722, ], and the description column appears to contain the correct contents. Can someone please help me interpret what "embedded null" means and how I can deal with this issue?

Answers (1)

Allan Cameron

One of the bytes in your fixed-width file is a zero-value (null) byte, which is illegal in an R character string; that is what "embedded null" refers to. If you simply remove the byte, every subsequent field in that record shifts by one position and the alignment is destroyed, so you need to replace it instead. The following function overwrites every zero byte with a replacement character, a space by default.

Please back up your .fwf file before using this, because the function overwrites the file in place.

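replace_null <- function(path_to_file, file_size = file.size(path_to_file), replace_with = ' ')
{
  # The replacement must be exactly one byte, otherwise the column alignment shifts
  stopifnot(length(charToRaw(replace_with)) == 1L)
  # Read the whole file as raw bytes; a fixed size limit would truncate larger files
  file_data <- readBin(path_to_file, "raw", file_size)
  # Overwrite every zero byte with the replacement byte
  file_data[file_data == as.raw(0)] <- charToRaw(replace_with)
  writeBin(file_data, path_to_file)
}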

Now you just need to run

replace_null(file_path)

and then your own code should work. If it doesn't, the file is probably corrupted in some other way.
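To check whether the file really contains zero bytes, and to confirm they are gone afterwards, something like the sketch below may help. It is only an illustration: find_null_bytes is a hypothetical helper name, and the demonstration runs on a small temporary file rather than on your data.

```r
# Hypothetical helper: return the 1-based byte offsets of zero bytes in a file
find_null_bytes <- function(path_to_file) {
  bytes <- readBin(path_to_file, what = "raw", n = file.size(path_to_file))
  which(bytes == as.raw(0))
}

# Demonstration file (an assumption for illustration): "abc", a zero byte, "def\n"
demo_path <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("abc"), as.raw(0), charToRaw("def\n")), demo_path)

# Locate the zero bytes before changing anything
null_offsets <- find_null_bytes(demo_path)

# Keep a copy of the file before any in-place edit
backup_ok <- file.copy(demo_path, paste0(demo_path, ".bak"))

# Replace each zero byte with a space so the file length stays the same
bytes <- readBin(demo_path, what = "raw", n = file.size(demo_path))
bytes[bytes == as.raw(0)] <- charToRaw(" ")
writeBin(bytes, demo_path)

# An empty result here means no zero bytes remain
remaining <- find_null_bytes(demo_path)
```

On a real file, the offsets returned before the replacement show where the problem bytes sit, which can be compared against the row reported in the readr warning.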
