Houssam Baiz

Reputation: 43

How can I change column names with a loop?

I have a dataset whose column names look like this:

state.abb, state.area, state.division, state.region

I want to rename the columns by dropping the "state." prefix, leaving only "abb", "area", "division", and "region". I wrote the code below, which loops over the columns of df and uses substring(), but it neither changes the names nor raises an error. What is wrong with it?


    for(e in 1:ncol(df)){
      colnames(df[e])<-substring(colnames(df[e]),7)
    }

Upvotes: 2

Views: 580

Answers (2)

rg255

Reputation: 4169

As an alternative solution, you could use gsub() to replace every "state." with an empty string (""). Here it is applied to a plain character vector:

    gsub("state.", "", c("state.abb", "state.area", "state.division", "state.region"))

To rename the columns of the data frame:

    colnames(df) <- gsub("state.", "", colnames(df))
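
One caveat with the pattern above: gsub() treats "state." as a regular expression, so the . matches any character and the match can occur anywhere in the name. Here is a minimal sketch of the difference, using a made-up third column name (interstate_abb is not from the question) to show the side effect:

```r
# Hypothetical names; only the first two come from the question.
old_names <- c("state.abb", "state.area", "interstate_abb")

# Unanchored regex: "." also matches "_", so the third name is mangled.
loose <- gsub("state.", "", old_names)

# Anchored, with the dot escaped: only a leading literal "state." is removed.
strict <- sub("^state\\.", "", old_names)

# fixed = TRUE treats the pattern as a literal string (no regex at all).
literal <- gsub("state.", "", old_names, fixed = TRUE)

print(loose)    # "abb" "area" "interabb"
print(strict)   # "abb" "area" "interstate_abb"
print(literal)  # "abb" "area" "interstate_abb"
```

For the four names in the question all three calls give the same result, so this only matters when other columns could contain the text by accident.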

As a bonus, suppose you want to replace a word or string that occurs in some, but not all, of your column names. Taking the built-in iris dataset as an example, you could replace "Petal" with "P" in the columns whose names contain "Petal" using exactly the same approach:

    colnames(iris) <- gsub("Petal", "P", colnames(iris))

I wouldn't bother with a for loop for this job; a vectorised approach is far easier. But to explain your error: colnames(df[1]) returns the column name of a one-column data frame extracted from your main data frame, so assigning to it does not rename anything in the main data frame itself. For example, iris[1] returns a data frame with one column (see str(iris[1])), and colnames(iris[1]) returns the column name of that extract. A slight change lets you return, and then change, the first element of the vector of column names for iris: colnames(iris)[1].
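
To see why the original loop fails silently rather than with an error, it can help to spell out what R does with a replacement call on a subset. The expansion below is a hand-written approximation of R's replacement-function semantics, run on a small made-up data frame (the values are placeholders):

```r
# Toy data frame with two of the column names from the question.
df <- data.frame(state.abb = "AL", state.area = 1)

# This renames a temporary one-column copy, then assigns that copy back
# into df[1]; assigning into an existing column keeps the old name.
colnames(df[1]) <- "abb"
after_subset <- colnames(df)

# Roughly what R evaluates for the line above:
df <- `[<-`(df, 1, value = `colnames<-`(df[1], "abb"))
after_expanded <- colnames(df)

# Indexing the names vector instead changes df itself.
colnames(df)[1] <- "abb"
after_names <- colnames(df)

print(after_subset)    # "state.abb" "state.area"
print(after_expanded)  # "state.abb" "state.area"
print(after_names)     # "abb" "state.area"
```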

Upvotes: 4

akrun

Reputation: 887048

Here, we can change colnames(df[e]) to colnames(df)[e], so that the assignment targets the names of df itself:

    for(e in seq_along(df)){
      colnames(df)[e] <- substring(colnames(df)[e], 7)
    }

substring is vectorized, so we can do this directly without a for loop:

    colnames(df) <- substring(colnames(df), 7)
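
Note that substring(x, 7) drops the first six characters of every name, whether or not they are "state.". If some columns might lack the prefix, one option is to strip it only where it is present. Here is a small sketch, assuming a hypothetical extra column called population (not in the question):

```r
# "population" is a made-up name used to show the failure mode.
nms <- c("state.abb", "state.area", "population")
prefix <- "state."

# Blind positional cut: also truncates the name without the prefix.
blind <- substring(nms, 7)

# Cut only the names that actually start with the prefix.
has_prefix <- startsWith(nms, prefix)
safe <- nms
safe[has_prefix] <- substring(nms[has_prefix], nchar(prefix) + 1)

print(blind)  # "abb" "area" "tion"
print(safe)   # "abb" "area" "population"
```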

Also, if we are removing a prefix up to and including the ., a more general option that works for a prefix of any length is sub:

    colnames(df) <- sub(".*\\.", "", colnames(df))
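
The .* in that pattern is greedy, so when a name contains more than one dot, everything up to the last dot is removed. If only the first dot-delimited prefix should go, a negated character class is one way to do it. A sketch with illustrative names (the second and third are not from the question):

```r
# Illustrative names: one with two dots, one with none.
nms <- c("state.abb", "state.x77.Population", "region")

# Greedy: strips through the LAST dot.
up_to_last <- sub(".*\\.", "", nms)

# Negated class: strips through the FIRST dot only.
up_to_first <- sub("^[^.]*\\.", "", nms)

print(up_to_last)   # "abb" "Population" "region"
print(up_to_first)  # "abb" "x77.Population" "region"
```

Names without any dot are left untouched by both patterns.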

As an example,

    data(mtcars)
    colnames(mtcars[1]) <- "hello"
    colnames(mtcars[1])
    #[1] "mpg" # no change
    colnames(mtcars)[1] <- "hello"
    colnames(mtcars[1])
    #[1] "hello" # changed

Upvotes: 4
