Delete special characters from a text file after the first element in R

Question

I am trying to read a text file using read.table() in R. R does not read anything that follows a #. However, there are pound symbols in the text that have nothing to do with the comments. I want to delete the unwanted # symbols without adding the comments to the data frame.

Fortunately, all of the pound symbols that I want to keep are in the first element of each row. So basically I need to delete all # symbols that are not in the first element of the row.

2018-08-14 00:00:42 102.18.18.2  
2018-08-15 00:00:47 223.45.67.8    
2018-08-15 00:00:48 026.15.65.0    
2018-08-15 00:00:49 924.43.47.0    
2018-08-15 00:00:49 122.45.#67.9

I want to keep the pound symbol in the first line and delete the pound symbol in the last line that is causing problems in the data frame.

Achilles · Accepted Answer

You can do this with a regular-expression feature known as capture groups.

Just open your file in an editor which supports finding text using RegEx, such as VS Code.

In the Find box, write: (.+)(#)

In the replace box, write: $1

Clicking Replace All should remove the # characters that appear in the middle of a line.
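To see what this pattern does before running it on a whole file, the same replacement can be tried in R with sub(), which writes the first capture group as \\1 rather than $1. This is only a sketch: the sample lines are made up for illustration, and the position of the # to keep (the very start of the line) is an assumption, since the posted sample does not show it. Two things are worth noticing: a # at the very start of a line survives because .+ needs at least one character before it, and because .+ is greedy, each pass removes only the last # on a line.

```r
# Illustrative lines; the leading '#' and the double-'#' case are assumptions.
x <- c("#2018-08-14 00:00:42 102.18.18.2",   # '#' at line start: should be kept
       "2018-08-15 00:00:49 122.45.#67.9",   # one stray '#'
       "2018-08-15 00:00:49 1#22.45.#67.9")  # two stray '#'

# One pass: equivalent to Find "(.+)(#)" / Replace "$1" in the editor.
once <- sub("(.+)(#)", "\\1", x)

# A second pass is needed to catch the remaining '#' on the third line.
twice <- sub("(.+)(#)", "\\1", once)

print(once)
print(twice)
```

So if a line can contain several stray # characters, Replace All may need to be repeated until nothing changes. Note also that a # inside the first element but not at the very start of the line would be removed by this pattern.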

Alternatively, you could also write a script to do this.
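A minimal sketch of such a script in R follows. It assumes the fields are separated by whitespace and that "first element" means the leading run of non-whitespace characters on each line. The function name strip_hash_after_first and the sample lines are hypothetical, not part of the original question. The script keeps the first field untouched, removes every # from the rest of the line, and then calls read.table() with comment.char = "" so that a # kept in the first field is read as data instead of starting a comment.

```r
# Hypothetical helper: remove '#' from everything after the first
# whitespace-delimited field of each line.
strip_hash_after_first <- function(lines) {
  # First field = leading run of non-whitespace characters.
  first <- sub("^(\\S*).*$", "\\1", lines, perl = TRUE)
  # Remainder of the line, starting right after the first field.
  rest <- substring(lines, nchar(first) + 1)
  # Drop every literal '#' from the remainder only.
  paste0(first, gsub("#", "", rest, fixed = TRUE))
}

# Illustrative input; the '#' at the start of the first line is an assumption.
raw <- c("#2018-08-14 00:00:42 102.18.18.2",
         "2018-08-15 00:00:49 122.45.#67.9")

cleaned <- strip_hash_after_first(raw)

# comment.char = "" turns off comment handling, so the remaining '#'
# in the first field is kept as part of the value.
df <- read.table(text = cleaned, comment.char = "",
                 colClasses = "character")
print(df)
```

For a real file, raw would come from readLines() on the file's path instead of the inline vector. Lines that begin with whitespace have an empty first field under this definition, so every # on such a line would be removed.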
