icedcoffee

Reputation: 1015

How to remove a mystery character from column header in R?

I have a mystery character in one of the column names of my data frame in R:

df <- structure(list(`ID21` = c("23", "44"),
ID22 = c("53", "23"), `Drug-na�ve_D22` = c("53",
"45")), row.names = 1:2, class = "data.frame")

> df
  ID21 ID22 Drug-na�ve_D22
1   23   53             53
2   44   23             45

What's the best way to remove this character? Would some sort of gsub with a regular expression work? In the desired output below, I've replaced it with the letter i:

> df
  ID21 ID22 Drug-naive_D22
1   23   53             53
2   44   23             45

Upvotes: 0

Views: 45

Answers (2)

Chris Ruehlemann

Reputation: 21400

To match any character outside the printable ASCII range (space through tilde), which covers every non-ASCII character, you can use this pattern:

[^ -~]

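So, for example, to replace the character with i, you can use sub and assign the result back to the column names. Note that sub replaces only the first match in each name; use gsub to replace every match:

names(df) <- sub("[^ -~]", "i", names(df))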
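
As a rough illustration of the pattern above, here is a minimal sketch on a stand-alone character vector. The vector nms is hypothetical, and "\u00ef" is only an assumed stand-in for the mystery character, since the real one in the question is unknown.

```r
# Hypothetical column names; "\u00ef" (the letter i with a diaeresis) is an
# assumed stand-in for the mystery character in the question
nms <- c("ID21", "ID22", "Drug-na\u00efve_D22")

# gsub() replaces every character outside the space-to-tilde range,
# whereas sub() would stop after the first match in each name
cleaned <- gsub("[^ -~]", "i", nms)
print(cleaned)
```

If the name contains bytes that are not valid in its declared encoding, the regex functions may reject the string; checking validUTF8(names(df)) first is one way to find out.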

Upvotes: 1

stlba

Reputation: 767

To remove any non-word characters (anything other than letters, digits and underscore) from your column names. Note that this also removes the hyphen:

names(df) <- gsub("\\W", "", names(df))

If you want to replace the matched characters with something else instead of deleting them, put the replacement in the second argument.
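
A small sketch of how the second argument changes the result. The vector nms is hypothetical and kept ASCII-only, so the only non-word character in it is the hyphen.

```r
# Hypothetical, ASCII-only column names used purely for illustration
nms <- c("ID21", "ID22", "Drug-naive_D22")

# The second argument is the replacement: each non-word character becomes "_"
replaced <- gsub("\\W", "_", nms)
print(replaced)

# An empty replacement deletes the matches instead, hyphen included
removed <- gsub("\\W", "", nms)
print(removed)
```

Because the hyphen counts as a non-word character, a narrower pattern may be preferable when it should be kept.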

Upvotes: 2
