Reputation: 299
I'm trying to read a large file into R whose fields are separated by the "not" sign (¬). Normally I change this symbol to semicolons using Text Edit and save the result as a csv file, but this file is too large, and my computer keeps crashing when I try to do so. I have tried the following options:
my_data <- read.delim("myfile.txt", header = TRUE, stringsAsFactors = FALSE, quote = "", sep = "\t")
which results in a dataframe with a single row. This makes sense, I know, since my file is not separated by tabs, but by the not sign. However, when I try to change sep to ¬ or \¬, I get the following message:
Error in scan(file, what = "", sep = sep, quote = quote, nlines = 1, quiet = TRUE, :
invalid 'sep' value: must be one byte
I have also tried with
my_data <- read.csv2(file.choose("myfile.txt"))
and
my_data <- read.table("myfile.txt", sep="\¬", quote="", comment.char="")
getting similar results. I have searched for questions similar to mine, but this kind of separator is not commonly used.
Upvotes: 2
Views: 614
Reputation: 160687
You can try to read in a piped translation of it.
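For context, the "must be one byte" error is consistent with how the not sign is stored: it is a single character, but in UTF-8 it takes two bytes. A quick check, as a sketch that assumes the file is UTF-8 encoded (the variable name sep_char is only for illustration):

```r
# The not sign written as a Unicode escape, so it does not depend on
# the encoding of this script (R stores it as UTF-8).
sep_char <- "\u00ac"

nchar(sep_char, type = "chars")  # 1: one character
nchar(sep_char, type = "bytes")  # 2: two bytes in UTF-8
charToRaw(sep_char)              # c2 ac
```

Since sep has to fit in one byte, swapping the separator for a one-byte character before parsing is the simplest way around it.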
Setup:
writeLines("a¬b¬c\n1¬2¬3\n", "quux.csv")
The work:
read.csv(pipe("tr '¬' ',' < quux.csv"))
# a b c
# 1 1 2 3
If commas don't work for you, this works equally well with other replacement chars:
read.table(pipe("tr '¬' '\t' < quux.csv"), header = TRUE)
# a b c
# 1 1 2 3
The tr utility is available on all Linuxes, it should be available on macOS, and it is included in Rtools for Windows (as well as git-bash, if you have that).
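If you are not sure whether R can see tr on your machine, Sys.which() reports the path it would use, or an empty string when nothing is found. A small sketch (the names tr_path and has_tr are only for illustration):

```r
# Look up tr on the PATH that R sees; "" means it was not found.
tr_path <- unname(Sys.which("tr"))
has_tr <- nzchar(tr_path)

if (!has_tr) {
  message("tr was not found on the PATH; check your Rtools or git-bash setup.")
}
```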
If there is an issue using pipe, you can always use the tr tool to create another file (replacing your text-editor step):
system2("tr", c("¬", ","), stdin="quux.csv", stdout="quux2.csv")
read.csv("quux2.csv")
# a b c
# 1 1 2 3
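If tr is not an option at all, the same translation can be done in R itself, a chunk of lines at a time so the whole file never has to sit in memory as text. This is only a sketch: translate_sep is a hypothetical helper (not a base R function), the chunk size is arbitrary, and it assumes the file is UTF-8 encoded. It matches bytes rather than characters, so it should not depend on your locale or on how a particular tr handles multibyte characters.

```r
# Hypothetical helper: copy infile to outfile, replacing every occurrence
# of `from` with `to`, reading chunk_lines lines at a time.
translate_sep <- function(infile, outfile, from = "\u00ac", to = ",",
                          chunk_lines = 100000L) {
  con_in <- file(infile, open = "r")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(outfile, open = "w")
  on.exit(close(con_out), add = TRUE)
  repeat {
    lines <- readLines(con_in, n = chunk_lines, warn = FALSE)
    if (length(lines) == 0L) break
    # fixed + useBytes: plain byte-wise substitution, no regex, no re-encoding
    fixed_lines <- gsub(from, to, lines, fixed = TRUE, useBytes = TRUE)
    writeLines(fixed_lines, con_out, useBytes = TRUE)
  }
  invisible(outfile)
}

# Usage on a small sample, written to temporary files
src <- tempfile(fileext = ".txt")
dst <- tempfile(fileext = ".csv")
writeLines("a\u00acb\u00acc\n1\u00ac2\u00ac3", src, useBytes = TRUE)
translate_sep(src, dst)
my_data <- read.csv(dst)
```

As with the tr versions, pick a replacement character that does not already occur in your data; a tab ("\t") with read.delim is a reasonable alternative to a comma.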
Upvotes: 4