Adele

Reputation: 51

Python REGEX, reformat a string

I'm trying to create a regex that will take a string and replace certain characters:

  1. Two or more consecutive spaces are reduced to one space
  2. The following chars will be replaced by a word: "#" -> "number", "@" -> "at"
  3. Spaces will be replaced with "-", unless they are at the end of the string
  4. Contains only a-z, A-Z, 0-9 and: !@#$%&/,
  5. Double or more "-" will reduce to one
"Hello, Wor--ld! 1$2@3-   " -> "hello-wor-ld-1-dollars-2-at-3"

My code:

import re

name = "Hello, World! 1$2@3-   "

name = re.sub("[^a-zA-Z0-9]+","-",name.lower())

print(name)

But it results in "hello-world-1-2-3-"

Upvotes: 0

Views: 108

Answers (1)

Wiktor Stribiżew

Reputation: 626748

Here is the code that you may use as a basis to solve your issue:

import re
name = "Hello, World! 1$2@3-   "
name = re.sub("[^a-zA-Z0-9@#$&]+", "-", " ".join(name.lower().split()))
dct = {'#': 'number', '@': 'at', '$': 'dollars', '&': 'and'}
name = re.sub(r'[$@#&]', lambda x: f"-{dct[x.group()]}-", name)
print(name.strip('-'))
# => hello-world-1-dollars-2-at-3

See the Python demo.

Notes:

  • " ".join(name.lower().split()) - lowercases the string, splits it on whitespace (which drops leading/trailing whitespace) and rejoins the pieces with single spaces
  • re.sub("[^a-zA-Z0-9@#$&]+", "-", ...) - replaces each run of one or more chars other than alphanumerics, #, @, $ and & with a single hyphen
  • re.sub(r'[$@#&]', lambda x: f"-{dct[x.group()]}-", name) - replaces each special char with its word from dct, wrapped in hyphens (& is included so that it is not left in the result as a literal character)
  • name.strip('-') removes leading/trailing hyphens.
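
If the input can contain repeated hyphens (as in the "Wor--ld" example from the question) or two special chars in a row, the steps above can still leave "--" in the result. A minimal sketch of one way around that, assuming it is acceptable to substitute the words first and collapse everything else afterwards; the function name slugify and the constant WORDS are hypothetical, not part of the original code:

```python
import re

# Mapping taken from the answer above; extend it as needed.
WORDS = {'#': 'number', '@': 'at', '$': 'dollars', '&': 'and'}

def slugify(text):
    """Hypothetical helper: lowercase, spell out symbols, hyphenate the rest."""
    text = text.lower()
    # Spell out each mapped symbol, wrapped in hyphens so it stays a separate word.
    text = re.sub(r'[#@$&]', lambda m: f"-{WORDS[m.group()]}-", text)
    # Collapse every run of non-alphanumerics (spaces, punctuation, hyphens)
    # into a single hyphen; this also removes any doubled hyphens.
    text = re.sub(r'[^a-z0-9]+', '-', text)
    # Drop hyphens left at either end.
    return text.strip('-')

print(slugify("Hello, Wor--ld! 1$2@3-   "))
# => hello-wor-ld-1-dollars-2-at-3
```

Because the collapsing step runs last, it does not matter how many hyphens the word substitution introduces.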

Upvotes: 1
