Standardize column names: search a set of candidate column names and use whichever one is available for the new standardized column

Question

I am preparing and standardizing data. Part of that is standardizing column names. I use data.table.

What I want is this: I want to create a new standardized column name (self-defined) and set up my code so that it searches the original data for a specified vector of column names and, if it finds any of these columns, uses that one to fill in the standardized column.

I appreciate this might not be clear, so here is an example. In the code below, I want to create a new standardized column named WEIGHT. I want to search the column names of dat for any of these (wt, WT, WTBL) and, if it finds one of them, use it for the new column WEIGHT.

library(data.table)
# mtcars ships with base R (datasets package), so no extra package is needed
dat <- as.data.table(mtcars)

dat[, WEIGHT := wt]  # this is what we do normally, but I want to make it semi-automatic: search a vector of column names and use whichever one is available to fill the WEIGHT column.

dat[, WEIGHT := colnames(dat%in%c('wt','WT','WTBL'))]  # this is wrong, and this is where I need help!

thelatemail · Accepted Answer

There's probably a simpler construction of this, but here's an attempt. mget() tries to grab each candidate column in the order given, returning NULL for any that is not found.

Then the first non-NULL value is used to overwrite:

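dat[, WEIGHT := {
    # try each candidate in priority order; missing columns come back as NULL
    m <- mget(c("WTBL","wt","WT"), ifnotfound=list(NULL))
    # keep the first candidate that was actually found
    m[!sapply(m, is.null)][1]
}]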
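
If the same lookup is needed for several standardized names, it could be wrapped in a small function. The following is only a sketch of one possible alternative, not the approach from the answer above: standardize_col is a hypothetical helper name (it is not part of data.table), and it relies on intersect() returning matches in the order of its first argument, so the candidate vector doubles as a priority list.

```r
library(data.table)

# Hypothetical helper (not part of data.table): copy the first candidate
# column that exists in `dt` into a new column called `new_name`.
standardize_col <- function(dt, new_name, candidates) {
  # intersect() keeps the order of its first argument, so `candidates`
  # acts as a priority list
  found <- intersect(candidates, names(dt))
  if (length(found) == 0L) {
    stop("none of the candidate columns were found: ",
         paste(candidates, collapse = ", "))
  }
  # set() adds the column by reference, like `:=`
  set(dt, j = new_name, value = dt[[found[1L]]])
  invisible(dt)
}

dat <- as.data.table(mtcars)
standardize_col(dat, "WEIGHT", c("WTBL", "wt", "WT"))
```

In this sketch the helper stops with an error when none of the candidates is present, which is a design choice assumed here rather than something the question asked for.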
