Reputation: 625
I have text like
input_string = " - 01 APRIL 2018 - ING000038985695286069"
I want to replace the date in the string with a placeholder such as DD or DATE:
output_string = "- DD/DATE - ING000038985695286069"
So far, I am able to extract the date from the string using
import datefinder

# find_dates yields a datetime object for each date found in the text
matches = list(datefinder.find_dates(input_string))
if len(matches) > 0:
    date = matches[0]
    print(date)
But how do I get the output shown above? That is my question.
Upvotes: 1
Views: 4527
Reputation: 848
datefinder is handy for parsing dates out of text, but you can skip the library and just use a regular expression (if the dates are always in the format shown).
import re
result = re.sub(r'\s(\d*\s\w*\s\d*)\s', ' DATE ', input_string)
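As a quick check, the substitution can be run against the sample string from the question. This is only a sketch: it writes the pattern as a raw string so the backslashes reach the regex engine unchanged, and it assumes the input looks exactly like the sample.

```python
import re

input_string = " - 01 APRIL 2018 - ING000038985695286069"

# Raw string (r'...') keeps the backslashes intact for the regex engine.
result = re.sub(r'\s(\d*\s\w*\s\d*)\s', ' DATE ', input_string)
print(repr(result))  # ' - DATE - ING000038985695286069'
```

The long digit run after ING is left alone because it is not surrounded by the whitespace the pattern requires.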
Regular expression breakdown:
\s    match one whitespace character (here, a space)
(     start capturing the text
\d*   match any digit as many times as possible
\s    match exactly one whitespace character
\w*   match as many word characters as possible (this also matches digits)
\s    again one whitespace character
\d*   again as many digits as possible
)     end capturing
\s    match one whitespace character

UPDATE
The datefinder package can be used as follows to find all dates:
dates_regex = datefinder.DateFinder().DATE_REGEX
result = dates_regex.sub('DATE ', input_string)
Note that this solution still depends on the package and doesn't quite do what you expect: it also matches plain number sequences and replaces them too.
I would strongly suggest you build your own regex to cover exactly your needs.
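One possible way to build such a regex is sketched below, using only the standard library. The names DATE_PATTERN and mask_dates are made up for this illustration. It assumes dates always look like "DD MONTH YYYY" with an English month name, and it relies on datetime.strptime (which depends on the current locale) to reject candidates that are not real dates.

```python
import re
from datetime import datetime

# Hypothetical pattern: 1-2 digit day, alphabetic month name, 4-digit year.
DATE_PATTERN = re.compile(r"\b(\d{1,2})\s+([A-Za-z]+)\s+(\d{4})\b")


def mask_dates(text, placeholder="DATE"):
    """Replace 'DD MONTH YYYY' dates in text with a placeholder."""

    def _replace(match):
        candidate = " ".join(match.groups())
        # Try the full month name first, then the abbreviated one.
        for fmt in ("%d %B %Y", "%d %b %Y"):
            try:
                datetime.strptime(candidate, fmt)
                return placeholder
            except ValueError:
                continue
        # Not a valid date: leave the original text untouched.
        return match.group(0)

    return DATE_PATTERN.sub(_replace, text)


input_string = " - 01 APRIL 2018 - ING000038985695286069"
print(mask_dates(input_string))
```

Because each candidate is validated, something like "99 FOO 2018" stays as it is, and the digit run after ING is never touched since the pattern requires whitespace between the three parts.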
Upvotes: 1