Automatically import column names in R from names file

Question

I've been working with datasets from the UCI Machine Learning Repository. Some of the datasets, like this one, contain a file with the extension .c45-names that looks machine readable.

Is there a way to use this data to automatically name the columns in the data frame, or even better to also use the other metadata like data types or possible values for discrete variables?

Currently, I'm copy/pasting column names into a line of code like this:

names(cars) = c('buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'rating')

It would be nice if there were something more automated. Google searches have been ineffective so far, since there is a similarly named classification algorithm (C4.5) that has been implemented in R.

Adam Quek · Accepted Answer

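# Read the C4.5 names file that accompanies the UCI car dataset
car.c45_names <- readLines("https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.c45-names")
tmp <- car.c45_names[grep(":", car.c45_names)] # keep only the lines containing ":"
colname_car.c45 <- sub(':.*', '', tmp) # drop the ":" and everything after it (thanks to alistaire for the suggestion)
# Equivalent without sub():
# colname_car.c45 <- sapply(tmp, function(x) substring(x, 1, gregexpr(":", x)[[1]] - 1))
# 'cars' is assumed to be the data frame already read from the data file.
# If it has more columns than the names file has ":" lines (e.g. a trailing class column), name the extra columns by hand.
cars <- setNames(cars, colname_car.c45) # same as 'names(cars) <- colname_car.c45'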
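
To also use the listed values for the discrete variables, the same idea can be pushed a bit further. The following is only a rough sketch, not a full C4.5 parser: parse_c45_names and apply_c45_names are hypothetical helper names, and it assumes each attribute line has the form "name: v1, v2, v3.", that comments start with "|", and that an attribute declared as "continuous" should be left untouched. A small inline example stands in for the downloaded file.

```r
# Hypothetical helper: turn the attribute lines of a C4.5 names file
# into a named list (attribute name -> character vector of allowed values).
parse_c45_names <- function(lines) {
  lines <- trimws(sub("\\|.*$", "", lines))              # strip "|" comments
  attr_lines <- lines[grepl(":", lines, fixed = TRUE)]   # attribute definitions
  col_names <- trimws(sub(":.*$", "", attr_lines))       # text before ":"
  spec <- sub("^[^:]*:", "", attr_lines)                 # text after ":"
  spec <- trimws(sub("\\.$", "", spec))                  # drop trailing "."
  values <- lapply(strsplit(spec, ","), trimws)
  names(values) <- col_names
  values
}

# Hypothetical helper: rename the leading columns of df and convert the
# discrete ones to factors whose levels follow the order in the names file.
apply_c45_names <- function(df, spec) {
  n <- min(length(spec), ncol(df))
  names(df)[seq_len(n)] <- names(spec)[seq_len(n)]
  for (nm in names(spec)[seq_len(n)]) {
    vals <- spec[[nm]]
    if (!identical(vals, "continuous")) {                # assumed C4.5 keyword
      df[[nm]] <- factor(df[[nm]], levels = vals)
    }
  }
  df
}

# Inline stand-in for readLines(<names file>) and read.csv(<data file>)
names_lines <- c(
  "| attributes",
  "",
  "buying:   vhigh, high, med, low.",
  "doors:    2, 3, 4, 5more."
)
demo <- data.frame(V1 = c("low", "vhigh"),
                   V2 = c("2", "5more"),
                   V3 = c("unacc", "acc"))

spec <- parse_c45_names(names_lines)
demo <- apply_c45_names(demo, spec)
str(demo)
```

Columns beyond those described in the names file (here V3, standing in for the class column) keep their original names, so they still need to be named separately.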
