bluetree8173

Reputation: 59

Python Regex - Replacing Non-Alphanumeric Characters AND Spaces with Dash

I am trying to replace every non-alphanumeric character, including spaces, in the following Python string with a dash -. I tried the code below, but it seemed to replace only the non-alphanumeric characters with a dash - and not the spaces.

s = re.sub('[^0-9a-zA-Z]+', '-', s)

Original String: s = 'ABCDE : CE ; CUSTOMER : Account Number; New Sales'

How can Python regex be used to replace both the non-alphanumeric characters AND spaces with a dash - to get the following target outcome?

Target Outcome: s = 'ABCDE---CE---CUSTOMER---Account-Number--New-Sales'

Upvotes: 1

Views: 3381

Answers (5)

Ajax1234

Reputation: 71451

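You can use [\W_], which matches any single non-word character or an underscore. Write the pattern as a raw string so that \W is not treated as an invalid string escape:

import re
d = re.sub(r'[\W_]', '-', s)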

Output:

'ABCDE---CE---CUSTOMER---Account-Number--New-Sales'
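One caveat worth sketching: for str patterns in Python 3, \W is Unicode-aware, so [\W_] and the explicit class [^0-9a-zA-Z] can differ on non-ASCII letters. The sample string below is a hypothetical input, not taken from the question.

```python
import re

# Hypothetical sample input (not from the question): it contains an
# accented letter (\u00e9) and an underscore.
sample = 'caf\u00e9_menu 2'

# For str patterns, \W treats Unicode letters and digits as word characters,
# so the accented letter is kept.
unicode_aware = re.sub(r'[\W_]', '-', sample)

# The explicit class replaces everything outside 0-9, a-z, A-Z.
ascii_only = re.sub(r'[^0-9a-zA-Z]', '-', sample)

# With re.ASCII, \W only treats [a-zA-Z0-9_] as word characters,
# so the result matches the explicit class.
ascii_flag = re.sub(r'[\W_]', '-', sample, flags=re.ASCII)

print(unicode_aware)  # café-menu-2
print(ascii_only)     # caf--menu-2
print(ascii_flag)     # caf--menu-2
```

For the all-ASCII string in the question, the three variants give the same result.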

Upvotes: 0

Trenton McKinney

Reputation: 62393

Without re:

s = 'ABCDE : CE ; CUSTOMER : Account Number; New Sales'

''.join(x if x.isalnum() else '-' for x in s)

Output:

'ABCDE---CE---CUSTOMER---Account-Number--New-Sales'
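Note that str.isalnum() is also Unicode-aware, so non-ASCII letters are kept rather than replaced. If only 0-9, a-z and A-Z should survive, one possible variant (a sketch, assuming Python 3.7+ for str.isascii(), with a hypothetical sample string) is:

```python
# Hypothetical sample input, not from the question.
sample = 'na\u00efve 1'

# str.isalnum() accepts Unicode letters, so the accented letter is kept.
unicode_aware = ''.join(x if x.isalnum() else '-' for x in sample)

# Also requiring str.isascii() (Python 3.7+) limits matches to 0-9, a-z, A-Z.
ascii_only = ''.join(x if x.isascii() and x.isalnum() else '-' for x in sample)

print(unicode_aware)  # naïve-1
print(ascii_only)     # na-ve-1
```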

Upvotes: 1

android.weasel

Reputation: 3391

I see the spaces translated properly, but your regexp should omit the + if you want one dash per replaced character.

import re
s = 'ABCDE : CE ; CUSTOMER : Account Number; New Sales'
re.sub('[^0-9a-zA-Z]+', '-', s)

I'm on my phone, but pasting that into https://repl.it/languages/python3 gives me

ABCDE-CE-CUSTOMER-Account-Number-New-Sales

as expected - spaces translated.

If you want the multiple - characters, lose the + in your regexp:

import re
s = 'ABCDE : CE ; CUSTOMER : Account Number; New Sales'
re.sub('[^0-9a-zA-Z]', '-', s)

Gives

ABCDE---CE---CUSTOMER---Account-Number--New-Sales

Upvotes: 1

Austin

Reputation: 26039

You were very close. You just don't need the +, because it makes each run of consecutive non-alphanumeric characters match as a single unit, so the whole run is replaced with just one dash.

You need:

re.sub('[^0-9a-zA-Z]', '-', s)

Example:

import re

s = 'ABCDE : CE ; CUSTOMER : Account Number; New Sales'

print(re.sub('[^0-9a-zA-Z]', '-', s))
# ABCDE---CE---CUSTOMER---Account-Number--New-Sales

Upvotes: 2

TennisTechBoy

Reputation: 103

import re
s='ABCDE : CE ; CUSTOMER : Account Number; New Sales'
s = re.sub(r'\W', '-', s)

Hope this helps.

Regards Aditya Shukla
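
One difference to be aware of with \W on its own: the underscore counts as a word character, so it is left in place. A minimal sketch, using a hypothetical input that is not from the question:

```python
import re

# Hypothetical sample input with an underscore, not from the question.
sample = 'New_Sales : 2020'

# \W alone leaves the underscore untouched.
keeps_underscore = re.sub(r'\W', '-', sample)

# Adding _ to the character class replaces it as well.
replaces_underscore = re.sub(r'[\W_]', '-', sample)

print(keeps_underscore)     # New_Sales---2020
print(replaces_underscore)  # New-Sales---2020
```

The string in the question contains no underscores, so both patterns give the target outcome there.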

Upvotes: 0
