Markus

Reputation: 1

read.csv error, can't separate the first row

I tried to import the CSV file contained in the zip archive from https://archive.ics.uci.edu/dataset/697/predict+students+dropout+and+academic+success

using

df.raw=read.csv("data.csv",sep=";")

but it keeps returning the error

Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names

I've tried

df.raw=read.csv("data.csv",sep=";",header = FALSE)

but it just returns the first row as

"Marital status;Application mode;Application order;Course;"Daytime/evening attendance\t";Previous qualification;Previous qualification (grade);Nacionality;Mother's qualification;Father's qualification;Mother's occupation;Father's occupation;Admission grade;Displaced;Educational special needs;Debtor;Tuition fees up to date;Gender;Scholarship holder;Age at enrollment;International;Curricular units 1st sem (credited);Curricular units 1st sem (enrolled);Curricular units 1st sem (evaluations);Curricular units 1st sem (approved);Curricular units 1st sem (grade);Curricular units 1st sem (without evaluations);Curricular units 2nd sem (credited);Curricular units 2nd sem (enrolled);Curricular units 2nd sem (evaluations);Curricular units 2nd sem (approved);Curricular units 2nd sem (grade);Curricular units 2nd sem (without evaluations);Unemployment rate;Inflation rate;GDP;Target;"

as a single cell of data.

Upvotes: 0

Views: 56

Answers (1)

Keyvan

Reputation: 23

I checked your issue, but unfortunately I'm unable to reproduce the error: your data and code run smoothly on my side (using R version 4.2.1).

However, I suspect the problem comes from the extra quotes around the "Daytime/evening attendance\t" column name in the CSV file, which may lead your version of R to mishandle the column names. Since I can't reproduce the issue, the best I can suggest is to remove or change those quotes directly in the CSV file.
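If you would rather not edit the file by hand, the same clean-up can be done in R. This is only a minimal sketch, since I can't reproduce the error: it reads the header line as raw text, strips the double quotes and the tab character, and then reads the data rows separately with quoting disabled. The helper name `read_header_then_data` is made up for illustration, and the sketch assumes the data rows are plain semicolon-separated values with no quoted fields.

```r
# Hypothetical helper (not part of base R): parse the header by hand,
# then read the data rows without relying on read.csv's header handling.
read_header_then_data <- function(path, sep = ";") {
  # Read the first line as raw text so quoting rules cannot affect it.
  header_line <- readLines(path, n = 1, warn = FALSE)

  # Remove double quotes and tab characters, then split on the separator.
  header_line <- gsub("[\"\t]", "", header_line)
  col_names <- trimws(strsplit(header_line, sep, fixed = TRUE)[[1]])
  col_names <- col_names[nzchar(col_names)]  # drop empty names, if any

  # Read the data rows only (skip the header), with quoting and
  # comment handling switched off. Assumption: no quoted data fields.
  df <- read.table(path, sep = sep, skip = 1, header = FALSE,
                   quote = "", comment.char = "",
                   stringsAsFactors = FALSE)

  # Only assign the names if they line up with the columns that were read.
  if (length(col_names) == ncol(df)) {
    names(df) <- col_names
  } else {
    warning("Header has ", length(col_names), " names but data has ",
            ncol(df), " columns; keeping default column names.")
  }
  df
}

# Usage with the file name from the question:
# df.raw <- read_header_then_data("data.csv")
# str(df.raw)
```

If the warning fires, the header and the data rows disagree on the number of fields, and `count.fields("data.csv", sep = ";")` can show which lines have an unexpected count.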

Good luck!

Upvotes: 1
