Juliana Gómez

Reputation: 299

Reading txt file into R, which is separated by "not" sign (¬)

I'm trying to read a large file into R that is separated by the "not" sign (¬). Normally I replace this symbol with semicolons in TextEdit and save the result as a csv file, but this file is too large, and my computer keeps crashing when I try. I have tried the following options:

my_data <- read.delim("myfile.txt", header = TRUE, stringsAsFactors = FALSE, quote = "", sep = "\t")

which results in a dataframe with a single row. This makes sense, since my file is separated by the not sign rather than by tabs. However, when I try to change sep to ¬ or \¬, I get the following message:

Error in scan(file, what = "", sep = sep, quote = quote, nlines = 1, quiet = TRUE,  : 
  invalid 'sep' value: must be one byte

I have also tried with

my_data <- read.csv2(file.choose("myfile.txt"))

and

my_data <- read.table("myfile.txt", sep="\¬", quote="", comment.char="")

getting similar results. I have searched for questions similar to mine, but this kind of separator is not commonly used.

Upvotes: 2

Views: 614

Answers (1)

r2evans

Reputation: 160687

You can try reading the file through a pipe that translates the separator on the fly.

Setup:

writeLines("a¬b¬c\n1¬2¬3\n", "quux.csv")

The work:

read.csv(pipe("tr '¬' ','  < quux.csv"))
#   a b c
# 1 1 2 3

If commas don't work for you, this works equally well with other replacement chars:

read.table(pipe("tr '¬' '\t'  < quux.csv"), header = TRUE)
#   a b c
# 1 1 2 3

The tr utility is available on all Linux distributions, should be available on macOS, and is included in Rtools for Windows (as well as git-bash, if you have that).
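
One caveat, as far as I know: some tr implementations (GNU tr in particular) work byte by byte, and ¬ is two bytes in UTF-8, so each separator may come out as two replacement characters. If that happens, a sed substitution is one possible workaround, since it replaces the whole byte sequence at once. The sketch below assumes a UTF-8 file, a UTF-8 R session, and a Unix-like shell with sed on the PATH; the temporary file and the names f, cmd and sed_data are made up for illustration.

```r
# Hypothetical sample file, written as raw UTF-8 bytes ("\u00ac" is the not sign)
f <- tempfile(fileext = ".csv")
writeLines(enc2utf8(c("a\u00acb\u00acc", "1\u00ac2\u00ac3")), f, useBytes = TRUE)

# sed swaps every occurrence of the two-byte sequence for a single comma
cmd <- paste("sed", shQuote("s/\u00ac/,/g"), shQuote(f))
sed_data <- read.csv(pipe(cmd))
```

For the real file, replace f with the path to your own data.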

If there is an issue using pipe, you can always use the tr tool to create another file (replacing your text-editor step):

system2("tr", c("¬", ","), stdin="quux.csv", stdout="quux2.csv")
read.csv("quux2.csv")
#   a b c
# 1 1 2 3

Upvotes: 4
