Harry Tong

Reputation: 259

How to clean up a string excluding certain characters

I want to clean up the string below by removing only the \n, \r and extra spaces, while keeping the apostrophe (') and other characters such as the dash (-) and colon (:).

Right now I am using this code but it gets rid of all special characters.

import re

string = "\n\n\r\n            Scott Hibb's Amazing Whisky Grilled Baby Back Ribs\r\n                \n\n\n\n"
# \W+ matches every run of non-word characters, including the apostrophe
rx = re.compile(r'\W+')
string = rx.sub(' ', string).strip()
print(string)

How can I do this?

Upvotes: 0

Views: 443

Answers (3)

Joe Linoff

Reputation: 771

The accepted answer is great, but if you would like a slightly more general solution that lets you specify exactly which characters to remove, pass a lambda function to filter(), something like this.

>>> y = "\n\n\r\n       Scott Hibb's       Amazing    Whisky Grilled Baby Back Ribs\r\n                \n\n\n\n"
>>> ' '.join(''.join(filter(lambda x: x not in ['\n', '\r'], y)).strip().split())
"Scott Hibb's Amazing Whisky Grilled Baby Back Ribs"

Note that for your example, explicitly listing \n and \r in the lambda is overkill, because strip() and split() already treat \n and \r as whitespace. If you want to remove other characters, though, this is a reasonable approach. For example, this is how you would strip extra whitespace, remove the \n and \r, and remove all standard vowels (a, e, i, o, u).

>>> y = "\n\n\r\n       Scott Hibb's       Amazing    Whisky Grilled Baby Back Ribs\r\n                \n\n\n\n"
>>> ' '.join(''.join(filter(lambda x: x.lower() not in ['a', 'e', 'i', 'o', 'u', '\n', '\r'], y)).strip().split())
"Sctt Hbb's mzng Whsky Grlld Bby Bck Rbs"
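If you find yourself doing this often, the same idea can be wrapped in a small helper. This is only a sketch: the name remove_chars and its parameters are made up for illustration, and it assumes the comparison should be case-insensitive, as in the vowel example.

```python
def remove_chars(text, unwanted):
    # Hypothetical helper: drop every character listed in `unwanted`
    # (compared case-insensitively), then collapse runs of whitespace.
    unwanted = {c.lower() for c in unwanted}
    kept = ''.join(ch for ch in text if ch.lower() not in unwanted)
    # split() with no arguments splits on any whitespace, including \n and \r,
    # and never yields empty strings, so no separate strip() is needed.
    return ' '.join(kept.split())


y = "\n\n\r\n       Scott Hibb's       Amazing    Whisky Grilled Baby Back Ribs\r\n                \n\n\n\n"
print(remove_chars(y, 'aeiou'))
```

Passing an empty string for unwanted should reduce this to plain whitespace clean-up.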

Upvotes: 1

Satish Prakash Garg

Reputation: 2233

You can use filter() and strip() to remove \n, \t, \r and extra whitespace while preserving the rest of the characters, something like this:

string = "\n\n\r\n       Scott Hibb's       Amazing    Whisky Grilled Baby Back Ribs\r\n                \n\n\n\n"
print(' '.join(filter(None, string.strip().split()))) 

This will result in:

Scott Hibb's Amazing Whisky Grilled Baby Back Ribs

Upvotes: 2

Steve Harris

Reputation: 306

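Use a character class in your regular expression. For example, [abc] matches a, b, or c, so a class that lists only the characters you want to remove leaves everything else untouched.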
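Applied to the string in the question, that might look roughly like the sketch below. The class [ \t\r\n] lists only the whitespace characters to collapse, so the apostrophe, dash and colon are never matched. The function name collapse_whitespace is just an illustrative choice.

```python
import re


def collapse_whitespace(text):
    # [ \t\r\n]+ matches one or more spaces, tabs, carriage returns or newlines.
    # Each run is replaced by a single space; strip() trims the two ends.
    return re.sub(r'[ \t\r\n]+', ' ', text).strip()


s = "\n\n\r\n            Scott Hibb's Amazing Whisky Grilled Baby Back Ribs\r\n                \n\n\n\n"
print(collapse_whitespace(s))
```

If other whitespace characters (form feeds, non-breaking spaces and so on) should go too, the shorthand class \s could be used in place of the explicit list.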

Upvotes: 0
