Boosted_d16

Reputation: 14112

Remove a run of characters that starts with x and ends with x, and everything in between, from a string

The problem: I want to remove a specific code within a question. The code changes position from question to question so I cannot rely on the position of the code to remove it.

Here is what it looks like:

Now thinking specifically about the home improvement brand _Everest<.br/>On a scale of 0 to 10, where 0 is "Not at all familiar/ knowledgeable" and 10 is "Very familiar/ knowledgeable", how familiar / knowledgeable do you consider yourself to be with..."

The code, <.br/>, is always attached directly to the words before and after it.

Desired solution: I would like to know whether there is a function that removes a set of characters that starts with x and ends with x, along with everything in between.

I hope this makes sense.

Upvotes: 0

Views: 345

Answers (1)

eumiro

Reputation: 213025

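import re

def remove_between_anchors(text, anchor):
    # re.escape makes the anchor match literally, even if it contains
    # regex metacharacters; .+? is non-greedy, so it stops at the nearest
    # closing anchor.
    return re.sub(r'{0}.+?{0}'.format(re.escape(anchor)), '', text)

remove_between_anchors('123aa456aa789', 'aa') # returns '123789'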

EDIT: if the start/end anchors are different:

def remove_between_anchors(text, start, end):
    # Escape both anchors so they are matched literally.
    return re.sub(r'{0}.+?{1}'.format(re.escape(start), re.escape(end)), '', text)

remove_between_anchors('123<abc>456', '<', '>') # returns '123456'
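Applied to the string in the question, something along these lines might work. This is only a sketch, assuming every code starts with < and ends with >. The name strip_code and its replacement parameter are made up for illustration. Because the code is attached to the words on both sides, removing it outright would glue them together, so this version substitutes a single space by default.

```python
import re

def strip_code(text, start='<', end='>', replacement=' '):
    """Replace every start...end span in text with `replacement`."""
    # Escape the anchors so they are matched literally.
    pattern = re.escape(start) + r'.*?' + re.escape(end)
    # re.DOTALL lets '.' match newlines, in case a code spans several lines.
    return re.sub(pattern, replacement, text, flags=re.DOTALL)

question = 'about the home improvement brand _Everest<.br/>On a scale of 0 to 10'
print(strip_code(question))
# about the home improvement brand _Everest On a scale of 0 to 10
```

Pass replacement='' to delete the code without leaving a space behind.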

Upvotes: 2
