InfiniteLoop

Reputation: 431

Issues importing a csv in R

I'm trying to teach myself R (just started). I decided to import 2 csv files to practice a join on them.

One file imported just fine, but the other one produces the warnings shown further down.

Here is the csv file link:

https://data.world/jonathankkizer/occupation-computerization

I used the following statement:

occupationforjoin <- read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", header=TRUE, sep=",")

Warning messages:
1: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", : line 1 appears to contain embedded nulls
2: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", : line 2 appears to contain embedded nulls
3: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", : line 3 appears to contain embedded nulls
4: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", : line 4 appears to contain embedded nulls
5: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", : line 5 appears to contain embedded nulls
6: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : EOF within quoted string
7: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : embedded nul(s) found in input

I found on StackOverflow that it could be due to encoding, so I used the suggested solution and executed the statement:

occupationforjoin <- read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", header=TRUE, sep=",", fileEncoding="UTF-16LE")

It gave me a different error message:

Error in read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", : more columns than column names

I also tried using the read.csv function to no avail.

How do I fix this problem and import the data set successfully? None of the solutions that I found online (e.g., passing skipNul = TRUE or comment.char = "") helped.

Update: Here's a paste of the data set if you don't want to download the csv file from data.world: https://pastebin.com/SPEtWT6f

Upvotes: 1

Views: 2414

Answers (3)

InfiniteLoop

Reputation: 431

I finally found the solution! I was going nuts; even my instructor didn't know how to fix it!

This statement works:

o<-read.csv("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/Occ.txt", header=T, sep="\t", fileEncoding="UTF-16LE")

Like I said in my original question: I tried using fileEncoding="UTF-16LE" and it didn't help. After asking the question, I tried using sep="\t", and it didn't help. But using both of them did the trick!
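For anyone who wants to see why both arguments are needed, here is a minimal sketch that reproduces the situation on a tiny file. The sample rows and the names tmp, sample_text and demo are made up for illustration; they are not taken from the real data set. The idea is that UTF-16LE stores each ASCII character as two bytes, the second being 00, which is presumably what read.table reported as "embedded nulls", and that the fields are separated by tabs rather than commas.

```r
# Hypothetical sample data: two tab-separated columns, three lines.
sample_text <- paste0(
  "Occupation\tProbability\n",
  "Recreational Therapists\t0.0028\n",
  "Telemarketers\t0.99\n"
)

# Convert the text to raw UTF-16LE bytes and write them unchanged to a temp file.
tmp <- tempfile(fileext = ".txt")
utf16_bytes <- iconv(sample_text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(utf16_bytes, tmp)

# Inspect the first bytes: every other byte is 00 for plain ASCII text.
first_bytes <- readBin(tmp, what = "raw", n = 8)
print(first_bytes)

# Declaring the encoding AND the tab separator gives the expected two columns.
demo <- read.csv(tmp, header = TRUE, sep = "\t", fileEncoding = "UTF-16LE")
str(demo)

# With the encoding but the default comma separator, each row is parsed
# as a single field, so the columns are not split.
wrong_sep <- read.csv(tmp, header = TRUE, fileEncoding = "UTF-16LE")
print(ncol(wrong_sep))
```

If your own file shows the same alternating 00 bytes at the start, fileEncoding="UTF-16LE" is a reasonable first guess; a file beginning with the bytes ff fe has a byte order mark, which also points to UTF-16LE.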

Upvotes: 3

Zeeshan

Reputation: 1238

Use dataframe = read.csv("name_of_file.csv")

or

dataframe = read.csv(file.choose())

Hope this will work.

Upvotes: 1

Jeff Li

Reputation: 21

Try the read_csv() function from the readr package.

Upvotes: 2
