Gravity Mass

Reputation: 605

How to remove a substring after a substring?

I have the following string:

somestring = "Zero function argument 1 since code 345 and from code 8476 then it goes on"

I want to remove "since code 345" and "from code 8476" (the numbers can vary), but I want to keep the 1 in "argument 1".

I am doing the following:

import re

somestring = "Zero function argument 1 since code 345 and from code 8476 then it goes on"
somestring = somestring.replace("since code", "").replace("from code", "")
stringlist = somestring.split(" ")
pattern = '[0-9]'
print([re.sub(pattern, "", i) for i in stringlist])

But the output removes the 1 from argument 1 in the string. The output looks like this: ['Zero', 'function', 'argument', '', '', '', 'and', '', '', 'then', 'it', 'goes', 'on']

The output I want is "Zero function argument 1 and then it goes on", i.e. remove "since code" and "from code" together with the number that follows each, and leave no '' entries in the string or the list.

How can this be done?

Upvotes: 0

Views: 59

Answers (2)

Grismar

Reputation: 31329

I think this is what you're after:

import re

somestring = "Zero function argument 1 since code 345 and from code 8476 then it goes on"
result = re.sub(r'(?:since|from) code \d+ ', '', somestring)
print(result)

Note that there's a space after \d+, since in your example the phrase you're removing has a space before it and one after it, and only one of the two should remain. If the phrase "since code 123" could also appear in quotes, or before a comma or period for example, then you might want something like this:

import re

somestring = "Zero function argument 1 since code 345, and from code 8476 as well. Then it goes on"
result = re.sub(r'\s*(?:since|from) code \d+\s*', ' ', somestring)
print(result)

(which is very similar to what @TimBiegeleisen posted)
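
One thing to be aware of: because the second pattern replaces each phrase with a single space, a phrase that sits directly before a comma leaves "1 , and" behind. Here is a minimal sketch of one way to tidy that up. The helper name strip_code_phrases is made up for illustration, and the set of punctuation marks is an assumption about what your text contains:

```python
import re

def strip_code_phrases(text):
    # Hypothetical helper: remove "since code N" / "from code N"
    # together with the whitespace around it, leaving one space.
    cleaned = re.sub(r'\s*\b(?:since|from) code \d+\s*', ' ', text)
    # Drop whitespace that now sits in front of punctuation
    # (assumed set of marks: , . ; : ! ?).
    cleaned = re.sub(r'\s+([,.;:!?])', r'\1', cleaned)
    return cleaned.strip()

somestring = "Zero function argument 1 since code 345, and from code 8476 as well. Then it goes on"
print(strip_code_phrases(somestring))
# Zero function argument 1, and as well. Then it goes on
```

If your text never has punctuation right after the number, the second re.sub is unnecessary and the single substitution above is enough.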

Upvotes: 1

Tim Biegeleisen

Reputation: 521249

I would use a regex replacement here:

import re

somestring = "Zero function argument 1 since code 345 and from code 8476 then it goes on"
# \b keeps "since"/"from" from matching inside a longer word; each match is replaced by one space
output = re.sub(r'\s*\b(?:since|from) code \d+\s*', ' ', somestring).strip()
print(output)  # Zero function argument 1 and then it goes on
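
Since the question also asks for a list without '' entries, here is a small sketch of getting the word list from the cleaned string. The variable name words is just illustrative. It relies on str.split() with no argument, which splits on runs of whitespace and does not produce empty strings:

```python
import re

somestring = "Zero function argument 1 since code 345 and from code 8476 then it goes on"
output = re.sub(r'\s*\b(?:since|from) code \d+\s*', ' ', somestring).strip()

# split() with no argument splits on any run of whitespace,
# so no '' entries appear, unlike split(" ").
words = output.split()
print(words)
# ['Zero', 'function', 'argument', '1', 'and', 'then', 'it', 'goes', 'on']
```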

Upvotes: 1
