Antti

Reputation: 1293

Two seemingly equivalent ways of changing column names of an R data.frame - only one of them works

I have a data.frame and I need to add a suffix to some of the variable names. In my case that's all numeric variables after spreading a variable to wide format. Could someone explain to me why the first option does not work but the second does:

df <- data.frame(ID = "id", var1 = 1, var2 = 2, var3 = 3) 

1.

colnames(df[,2:ncol(df)]) <- paste0(names(df[,2:ncol(df)]), "_X")

2.

colnames(df) <- c("ID", paste0(names(df[,2:ncol(df)]), "_X"))

Upvotes: 0

Views: 346

Answers (2)

Roman Luštrik

Reputation: 70643

You are subsetting your df, in essence creating a second data.frame, and renaming that. The change is not reflected in your original data.frame.

colnames(df[,2:ncol(df)]) <- paste0(names(df[, 2:ncol(df)]), "_X")

behaves much like

df2 <- df[,2:ncol(df)]
colnames(df2) <- paste0(names(df[, 2:ncol(df)]), "_X")

> df
  ID var1 var2 var3
1 id    1    2    3
> df2
  var1_X var2_X var3_X
1      1      2      3

The correct way would be

colnames(df)[2:ncol(df)] <- paste0(names(df[, 2:ncol(df)]), "_X")

or using sprintf

colnames(df)[2:ncol(df)] <- sprintf("%s_X", names(df)[2:ncol(df)])
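Since the question mentions that the columns to rename are all the numeric ones, here is a minimal sketch that selects them by type instead of by position. It assumes the example data from the question; `num_cols` is just a name made up for this illustration.

```r
# Example data from the question
df <- data.frame(ID = "id", var1 = 1, var2 = 2, var3 = 3)

# Logical index marking the numeric columns
# (num_cols is a hypothetical helper name, not from the question)
num_cols <- vapply(df, is.numeric, logical(1))

# Assign to a subset of the names vector, not to a subset of the data.frame
names(df)[num_cols] <- paste0(names(df)[num_cols], "_X")

names(df)
# [1] "ID"     "var1_X" "var2_X" "var3_X"
```

Selecting by type avoids hard-coding 2:ncol(df), which would silently rename the wrong columns if the column order changes.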

Upvotes: 1

Roland

Reputation: 132706

Your first command contains syntax errors. We can fix it to:

colnames(df[,2:ncol(df)]) <- paste0(names(df[,2:ncol(df)]), "_X")

That doesn't return an error, but still doesn't work. You assign column names to a subset of a data.frame, but that subset is never stored and the command doesn't change the names of the full data.frame.

You need to assign to a subset of the names:

colnames(df)[2:ncol(df)] <- paste0(names(df)[2:ncol(df)], "_X")
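To make the difference concrete, here is a small sketch that runs both forms on the example data and checks the names afterwards. The step-by-step version is only a longhand illustration of "assign to a subset of the names"; `df_bad`, `idx` and `nms` are names invented for this example.

```r
# Two copies of the example data from the question
df_bad <- data.frame(ID = "id", var1 = 1, var2 = 2, var3 = 3)
df     <- data.frame(ID = "id", var1 = 1, var2 = 2, var3 = 3)
idx    <- 2:ncol(df)

# Renaming a subset of the data.frame: runs without error,
# but the names of the full data.frame stay as they were
colnames(df_bad[, idx]) <- paste0(names(df_bad)[idx], "_X")
names(df_bad)
# [1] "ID"   "var1" "var2" "var3"

# Longhand of colnames(df)[idx] <- ... :
# take the names vector, change part of it, store it back
nms <- colnames(df)
nms[idx] <- paste0(nms[idx], "_X")
colnames(df) <- nms
names(df)
# [1] "ID"     "var1_X" "var2_X" "var3_X"
```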

Upvotes: 2
