Reputation: 311
I have a number of data files that I am reading into R as CSVs. I would like to specify the colClasses of certain columns in these data files, but the number of columns in each data frame is not known in advance, because the files contain species abundance data (hence, different numbers of species).
Is there a way that I can set, say, every column after the first 10 to numeric (that is, columns 11 through ncol(df)) using colClasses in read.csv?
This is what I tried, but to no avail:
df <- read.csv("file.csv", header=T, colClasses=c(ncols[10], rep("numeric", ncols)))
Any help would be greatly appreciated.
Thanks, Paul
Upvotes: 3
Views: 2969
Reputation: 193517
I would start by using count.fields to determine how many columns there are in the data. You can do this on just the first line.
Then, from there, you can use rep for your colClasses.
It's fugly, but works. Here's an example:
The first few lines are just to create a dummy csv file in your workspace since you didn't provide a reproducible example.
X <- tempfile()
cat("A,B,C,D,E,F",
"1,2,3,4,5,6",
"6,5,4,3,2,1", sep = "\n", file = X)
This is where the actual answer starts. Replace X with your actual file name in both places below (the commented-out version shows this for "file.csv"). The -2 is because we have two columns that are already accounted for.
Y <- read.csv(X, colClasses = c(
"numeric", "numeric", rep("character", count.fields(textConnection(
readLines(X, n=1)), sep=",")-2)))
# Y <- read.csv("file.csv", colClasses = c(
# "numeric", "numeric", rep(
# "character", count.fields(textConnection(readLines(
# "file.csv", n = 1)), sep = ",")-2)))
str(Y)
# 'data.frame': 2 obs. of 6 variables:
# $ A: num 1 6
# $ B: num 2 5
# $ C: chr "3" "4"
# $ D: chr "4" "3"
# $ E: chr "5" "2"
# $ F: chr "6" "1"
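To adapt this to the layout described in the question (leave the first N columns alone, force everything after them to numeric), something along these lines might work. This is only a sketch: read_abundance, n_keep, and the demo file are hypothetical names made up for illustration. It relies on NA entries in colClasses meaning "let read.csv guess the type as usual".

```r
# Hypothetical helper (not part of the original answer): keep read.csv's
# default type detection for the first `n_keep` columns and force every
# remaining column to numeric.
read_abundance <- function(path, n_keep = 10, sep = ",") {
  # Count the fields on the header line only.
  con <- textConnection(readLines(path, n = 1))
  on.exit(close(con))
  n_cols <- count.fields(con, sep = sep)

  # NA in colClasses means "use the default type conversion".
  classes <- c(rep(NA_character_, min(n_keep, n_cols)),
               rep("numeric", max(n_cols - n_keep, 0)))

  read.csv(path, header = TRUE, sep = sep, colClasses = classes)
}

# Small demo file: two descriptive columns followed by three species columns.
demo_file <- tempfile(fileext = ".csv")
writeLines(c("site,date,sp1,sp2,sp3",
             "A,2020-01-01,1,0,4",
             "B,2020-01-02,2,3,0"), demo_file)

df <- read_abundance(demo_file, n_keep = 2)
sapply(df, class)
```

For the files in the question you would call read_abundance("file.csv") with the default n_keep = 10. Note that counting fields on the header line assumes the header has the same number of fields as the data rows.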
Upvotes: 1