user3501284

Reputation: 3

Regex - Parse string by removing characters

I am trying to parse the path part of a URL.

The input is a string such as site/whatever% ^&*/page/to-days_date// which I would like to convert into site/whatever/page/to-days_date

Anything that is not one of the following should be removed (the slashes between path segments stay, but trailing slashes go):

  1. lower or upper case letter
  2. digit / number
  3. dash
  4. underscore

Upvotes: 0

Views: 107

Answers (1)

Sabuj Hassan

Reputation: 39365

Add /+$ to your existing regex as an alternative, separated by a pipe (|). It matches one or more / characters at the end of the input, so it handles /, // or ///// at the end of the string.

import re

myString = '''blog/whatever%  ^&*/page/to-days_date//'''
print(re.sub(r'/+$|[^a-zA-Z0-9_\-\/]+', '', myString))
#              ^^^ here
# prints: blog/whatever/page/to-days_date
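If you need this in more than one place, here is a minimal sketch that wraps the same pattern in a function. The names clean_path and clean_path_segments are made up for illustration. One thing to be aware of: when a whole segment consists of unwanted characters, the pattern leaves a doubled slash in the middle of the path. The second function is one possible way around that, assuming you want empty segments dropped (it also drops a leading slash, which may or may not be what you want).

```python
import re

# Same pattern as in the answer: trailing slashes, or any run of characters
# other than letters, digits, underscore, dash and slash.
_UNWANTED = re.compile(r'/+$|[^a-zA-Z0-9_\-\/]+')


def clean_path(path):
    """Remove disallowed characters and trailing slashes in one pass."""
    return _UNWANTED.sub('', path)


def clean_path_segments(path):
    """Variant: clean each segment on its own, then drop empty segments."""
    segments = (re.sub(r'[^a-zA-Z0-9_-]+', '', s) for s in path.split('/'))
    return '/'.join(s for s in segments if s)


print(clean_path('site/whatever% ^&*/page/to-days_date//'))
# prints: site/whatever/page/to-days_date
print(clean_path('a/%%/b'))
# prints: a//b
print(clean_path_segments('a/%%/b'))
# prints: a/b
```

For the input in the question both functions give the same result, so the one-line re.sub is enough unless segments can end up empty.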

Upvotes: 1
