LA_

Reputation: 20409

How to assign column names with fread in R?

I have the following code -

zz3 <- 'data,key
"VA1,VA2,20140524,,0,0,5969,20140523134902,S7,S1147,140,20140523134902,m/t",4503632376496128
"VA2,VA3,20140711,,0,0,8824,20140601095714,S1,S6402,175,20140601095839,m/t",4503643113914368
"VA1,VA3,20140710,,0,0,11678,20140604085203,S1,S1430,250,20140604085329,m/t",4503666467799040
"VA2,VA1,20140724,,0,0,7109,20140523133835,S7,S793,130,20140523133835,m/t",4503679218483200
"VA3,VA1,20140925,,0,0,10592,20140604092548,S7,S109,395,20140604092714,m/t",4503694653521920'

columnClasses <- c("or"="factor", "d"="factor", "ddate"="factor", "rdate"="factor", "changes"="integer", "class"="factor", "price"="integer", "fdate"="factor", "company"="factor", "number"="factor", "dur"="integer", "added"="factor", "source"="factor", "key"="NULL") # skip last column "key"
data <- fread(zz3, header = FALSE, sep = ",", skip = 1, na.strings = c(""), colClasses = columnClasses)

But it returns an error -

Error in fread(zz3, header = FALSE, sep = ",", skip = 1, na.strings = c(""),  : 
  Column name 'or' in colClasses[[1]] not found

I expected colClasses to assign the column names when header = FALSE, but it looks like that is not the case.

How can I fix this? Similar code using read.csv worked fine.

Upvotes: 6

Views: 11410

Answers (2)

Michal

Reputation: 1905

Separate the column names from the column classes, and set the names in a separate step after reading.

column_names <- c("or", "d", "ddate", "rdate", "changes", "class", "price", "fdate", "company", "number", "dur", "added", "source", "key")
column_classes <- c("factor", "factor", "factor", "factor", "integer", "factor", "integer", "factor", "factor", "factor", "integer", "factor", "factor", "NULL")

data <- fread(zz3, header = FALSE, sep = ",", skip = 1, na.strings = c(""), colClasses = column_classes)
setnames(data, column_names)

Upvotes: 4

Colonel Beauvel

Reputation: 31171

It is indeed not the case.

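colClasses lets you define column types when reading with fread. When you pass a named vector, the names must match columns that already exist in the file, which is why the error says 'or' was not found. Suppose you have a file delimited by | with a column named 'key' and you want it read as character; you would run: fread(filePath, sep='|', colClasses=c(key='character')).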
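
As a rough illustration of matching colClasses by name, here is a minimal sketch. The sample text and the column names id and key are made up for this example, and it assumes a data.table version recent enough to support the text argument of fread.

```r
library(data.table)

# Hypothetical pipe-delimited data WITH a header row (not from the question).
txt <- "id|key\n1|0042\n2|0107\n"

# The name in colClasses refers to a column that exists in the header,
# so fread can match it and read that column as character.
dt_named <- fread(text = txt, sep = "|", colClasses = c(key = "character"))

# Inspect the resulting column types.
str(dt_named)
```

Because the header supplies the name key, the lookup succeeds here; with header = FALSE there is no such name to match against.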

If the file has no column names, you can use setnames to assign them to your data.table once it has been read.
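
A minimal sketch of that approach, using made-up three-column data rather than the data from the question. It assumes a reasonably recent data.table, where fread accepts the text argument and also a col.names argument as an alternative to calling setnames afterwards.

```r
library(data.table)

# Hypothetical headerless data (not from the question): three columns.
txt <- "VA1,5969,140\nVA2,8824,175\n"

# Option 1: give the types positionally, then rename by reference.
dt <- fread(text = txt, header = FALSE, sep = ",",
            colClasses = c("character", "integer", "integer"))
setnames(dt, c("or", "price", "dur"))

# Option 2 (assumes a data.table version that supports col.names):
# supply the names in the same call.
dt2 <- fread(text = txt, header = FALSE, sep = ",",
             colClasses = c("character", "integer", "integer"),
             col.names = c("or", "price", "dur"))
```

Both variants should give the same column names; setnames modifies the data.table in place, so no reassignment is needed.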

Upvotes: 5
