Josh Rumbut

Reputation: 2710

Automatically import column names in R from names file

I've been working with datasets from the UCI Machine Learning Repository. Some of the datasets, like this one, contain a file with the extension .c45-names that looks machine readable.

Is there a way to use this file to automatically name the columns in the data frame or, even better, to also use the other metadata, such as data types or the possible values of discrete variables?

Currently, I'm copy/pasting column names into a line of code like this:

names(cars) = c('buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'rating')

It would be nice if there were something more automated. Google searches have been ineffective so far because there is a similarly named classification algorithm that has been implemented in R.

Upvotes: 1

Views: 285

Answers (1)

Adam Quek

Reputation: 7163

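# Read the names file that accompanies the data set
car.c45_names <- readLines("https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.c45-names")
tmp <- car.c45_names[grep(":", car.c45_names)]  # keep only the lines containing ":"
colname_car.c45 <- sub(':.*', '', tmp)  # drop the ":" and everything after it; thanks to alistaire for pointing this out
# More verbose alternative:
# colname_car.c45 <- sapply(tmp, function(x) substring(x, 1, gregexpr(":", x)[[1]] - 1))
# Note: this only picks up columns listed as "name: ..." lines. If cars has a column
# that is not listed that way (for example the class column), append a name for it first.
cars <- setNames(cars, colname_car.c45)  # same as 'names(cars) <- colname_car.c45'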
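
The question also asks about the other metadata (the possible values of the discrete variables). Here is a rough sketch of how that could work, not a tested solution for every names file. It assumes the usual C4.5 layout: "|" starts a comment, the first non-comment line lists the class values, each "name: v1, v2." line describes one attribute, and the class is the last column of the data. parse_c45_names() and apply_c45_names() are hypothetical helpers written for this example, and the names_lines sample and the toy data frame are made up so the code runs without a download.

```r
# Illustrative sample in the assumed C4.5 names layout (not fetched from UCI)
names_lines <- c(
  "| sample names file",
  "",
  "unacc, acc, good, vgood",
  "",
  "buying:   vhigh, high, med, low.",
  "doors:    2, 3, 4, 5more.",
  "safety:   low, med, high."
)

# Hypothetical helper: returns a named list, one entry per column,
# each holding the allowed values listed in the names file.
parse_c45_names <- function(lines, class_name = "class") {
  lines <- trimws(sub("\\|.*$", "", lines))  # strip comments and padding
  lines <- lines[nzchar(lines)]              # drop empty lines
  # "a, b, c." -> c("a", "b", "c")
  split_values <- function(x) trimws(strsplit(sub("\\.$", "", x), ",")[[1]])
  is_attr <- grepl(":", lines, fixed = TRUE)
  spec <- lapply(sub("^[^:]*:", "", lines[is_attr]), split_values)
  names(spec) <- trimws(sub(":.*$", "", lines[is_attr]))
  # Assumption: the first non-attribute line lists the class values and
  # the class is the last column in the data file.
  spec[[class_name]] <- split_values(lines[!is_attr][1])
  spec
}

# Hypothetical helper: names the columns and turns discrete ones into factors.
apply_c45_names <- function(df, spec) {
  stopifnot(ncol(df) == length(spec))
  names(df) <- names(spec)
  for (col in names(spec)) {
    # "continuous" marks a numeric attribute in C4.5 files; leave those alone
    if (!identical(spec[[col]], "continuous")) {
      df[[col]] <- factor(df[[col]], levels = spec[[col]])
    }
  }
  df
}

spec <- parse_c45_names(names_lines, class_name = "rating")

# Toy data standing in for the real data frame
toy <- data.frame(V1 = c("low", "vhigh"), V2 = c("2", "5more"),
                  V3 = c("high", "low"), V4 = c("acc", "unacc"),
                  stringsAsFactors = FALSE)
toy <- apply_c45_names(toy, spec)
str(toy)
```

With the real file you would pass the result of readLines() instead of names_lines and the real cars data frame instead of toy. Any value in the data that is not listed in the names file becomes NA when the column is converted, so checking anyNA(cars) afterwards is a cheap sanity check.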

Upvotes: 1
