paanvaannd

Reputation: 191

Specify multiple data types for colClasses in R

If a data file I want to analyze in R contains multiple data types, how do I use the colClasses argument to specify the data type expected for each individual column? The sample file I am using is: http://www.cyclismo.org/tutorial/R/_static/trees91.csv

For example, when I type

tree <- read.csv("trees91.csv", header=T, sep=",", dec=".", colClasses=c(C,N,REP,LFBCC,STBCC,RTBCC="integer", CHBR="character", "double"), nrows=70)

I get the following error:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : object 'N' not found

There are 28 columns overall, and the columns with differing data types are interspersed throughout the file. For example, the first two columns contain only integer values, the third column contains character values, and so forth. I want to specify which columns contain integer values (C, N, REP, LFBCC, STBCC, and RTBCC), which one contains character values (CHBR), and that all remaining columns contain decimal values.

I realize that in this instance, simply calling read.table without colClasses would handle the job with no appreciable loss in speed, but I am using this file to practice for larger files where colClasses would be useful. I also realize that I could specify only that the CHBR column is of type "character" and leave R to infer all other column types, but my goal is to explicitly declare every column's data type.

Upvotes: 0

Views: 1547

Answers (1)

Flavio Kaminishi

Reputation: 63

You can pass colClasses as a character vector with one class per column, in column order:

colClasses = c("integer", "integer", "character", "character")

or, if you read the file with data.table's fread (read.csv does not accept this list form):

colClasses = list(integer = 1:2, character = 3:4)
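For the case in the question, where most columns should be "double" and only a few are named explicitly, one possible approach with base R is a named character vector, which read.csv matches against the column names. The error in the question comes from the bare names inside c() being evaluated as R objects instead of being used as names. The sketch below is only an illustration: the inline sample data is made up to stand in for trees91.csv, LFBM is a placeholder for any decimal column, and the helper variables (csv_text, col_names, classes, int_cols) are not from the original post.

```r
# Small made-up sample standing in for trees91.csv (values are illustrative only).
csv_text <- 'C,N,CHBR,REP,LFBM,LFBCC
1,1,"A1",1,0.43,51
1,2,"A4",1,0.40,54
2,1,"B2",2,0.51,49'

# Read a single row just to obtain the column names from the header.
col_names <- names(read.csv(text = csv_text, nrows = 1))

# Default every column to "numeric" (double), then override selected columns by name.
classes <- setNames(rep("numeric", length(col_names)), col_names)
int_cols <- c("C", "N", "REP", "LFBCC", "STBCC", "RTBCC")
classes[intersect(int_cols, col_names)] <- "integer"  # skip names absent from the sample
classes["CHBR"] <- "character"

# Names in colClasses are matched against the column names of the file.
tree <- read.csv(text = csv_text, colClasses = classes)
str(tree)
```

With the real file, replacing text = csv_text with "trees91.csv" in both calls should behave the same way, assuming the header contains the column names listed in the question.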

Upvotes: 1
