Reputation: 31
I need to replace parts of the names stored in a JSON file, for example replacing this:
"name":"S. tuberosum subsp. andigenum (ADG) 2-1-2-2"
with this:
"name":"S. tuberosum subsp. andigenum (ADG)"
That is, I need to remove the numbers and hyphens that follow the name.
I am using re.sub, but I can't figure out the right expressions, especially how to replace the matched string with only a part of it.
I have tried this:
new_text = re.sub(r"(name.[:]..*)\s\d+-+", "name.[:]..*" , initial_text)
Upvotes: 0
Views: 852
Reputation: 71451
You can try this:
import re
s = '"name":"S. tuberosum subsp. andigenum (ADG) 2-1-2-2"'
# Raw string avoids invalid-escape warnings; the lookbehind requires "X)" just before the suffix.
new_s = re.sub(r'(?<=[A-Z]\))\s[\d-]+', '', s)
Output:
'"name":"S. tuberosum subsp. andigenum (ADG)"'
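The lookbehind above only fires when the name ends in an uppercase letter and a closing parenthesis. If some names lack that abbreviation, a variant anchored to the end of the value might be more robust. This is only a sketch, assuming the unwanted suffix is always whitespace followed by hyphen-separated numbers at the very end of the name; `strip_suffix` is a hypothetical helper name, not something from the question:

```python
import re

# Assumption: the suffix is whitespace + hyphen-separated numbers at the end of the name.
SUFFIX = re.compile(r'\s+\d+(?:-\d+)*$')

def strip_suffix(name):
    """Return the name without a trailing ' 2-1-2-2' style suffix."""
    return SUFFIX.sub('', name)

print(strip_suffix('S. tuberosum subsp. andigenum (ADG) 2-1-2-2'))
print(strip_suffix('S. chacoense'))  # no suffix, so it is returned unchanged
```

Note that this works on the bare name value, not on the raw `"name":"..."` text, because `$` has to sit right after the digits.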
Upvotes: 0
Reputation: 1628
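Try this:
re.sub(r"\s*\d+-\d+-*", "", initial_text)
This removes every 'number-number-(optional hyphen)' group together with any whitespace in front of it, so no trailing space is left before the closing quote. Hope it works.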
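Since the names live in a JSON file, it may be safer to parse the JSON and clean only the "name" values rather than run the regex over the raw text. A rough sketch, assuming the file holds a list of objects that each have a "name" key (the real layout is not shown in the question, so the sample data here is hypothetical):

```python
import json
import re

# Hypothetical sample data standing in for the contents of the JSON file.
raw = '[{"name": "S. tuberosum subsp. andigenum (ADG) 2-1-2-2"}, {"name": "S. chacoense"}]'

records = json.loads(raw)
for record in records:
    # Drop a trailing run of hyphen-separated numbers, plus the whitespace before it.
    record["name"] = re.sub(r"\s+\d+(?:-\d+)*$", "", record["name"])

print(json.dumps(records))
```

With a real file you would use json.load and json.dump on the opened file instead of the string.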
Upvotes: 0
Reputation: 10403
With re.sub you only need to match the part you want to remove
and replace it with an empty string:
import re
string = '"name":"S. tuberosum subsp. andigenum (ADG) 2-1-2-2"'
# Matches a space followed by single digits separated by hyphens, e.g. " 2-1-2-2".
print(re.sub(r'(\s(\d-)*\d)', '', string))
Output:
"name":"S. tuberosum subsp. andigenum (ADG)"
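If you do want to replace the match with a part of itself, as the question asks, a capture group plus a backreference in the replacement also works. A minimal sketch, assuming the value contains no escaped quotes and the suffix sits right before the closing quote:

```python
import re

initial_text = '{"name":"S. tuberosum subsp. andigenum (ADG) 2-1-2-2"}'

# Group 1 captures the part to keep; \1 in the replacement puts it back.
# The lookahead (?=") checks for the closing quote without consuming it.
new_text = re.sub(r'("name":"[^"]*?)\s+\d+(?:-\d+)*(?=")', r'\1', initial_text)
print(new_text)
```

The replacement argument is a template, not a pattern, which is why putting "name.[:]..*" there inserts those characters literally.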
Upvotes: 1