Reputation: 85
The following clean_sheet_title function uses INVALID_TITLE_CHARS and INVALID_TITLE_CHAR_MAP to strip out invalid characters and limit the title to 31 characters:
# This strips characters that are invalid to Excel
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_CHAR_MAP = {ord(x): "" for x in INVALID_TITLE_CHARS}
# How would I remove strings, as well as the characters from INVALID_TITLE_CHARS?
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", " Family"]
def clean_sheet_title(title):
    title = title or ""
    title = title.strip()
    title = title.translate(INVALID_TITLE_CHAR_MAP)
    return title[:31]
My question is: how would I expand this to also remove the strings listed in INVALID_TITLE_NAMES?
What I've tried: I made the following update to clean_sheet_title, but it makes no difference to title:
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_CHAR_MAP = {ord(x): "" for x in INVALID_TITLE_CHARS}
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", "Family"]
def clean_sheet_title(title):
    title = title or ""
    title = title.strip()
    title = title.translate(INVALID_TITLE_CHAR_MAP, "")
    for name in INVALID_TITLE_NAMES:
        title = title.replace(name, "")
    return title[:31]
Examples:
Current behaviour: if title == Courtenay:Family, clean_sheet_title currently returns CourtenayFamily (the colon is removed).
Desired behaviour: sometimes title is prefixed or suffixed with either zz_ FeeRelationship or Family. In both cases these strings should be dropped. E.g. zz_ FeeRelationship Courtenay:Family would become Courtenay.
Upvotes: 0
Views: 71
Reputation: 25490
You could use regular expressions to match any of your keywords or characters and replace them with an empty string:
import re
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", " Family"]
inv_char_grp = re.escape("".join(INVALID_TITLE_CHARS))
inv_name_grp = "|".join(re.escape(name) for name in INVALID_TITLE_NAMES)
regex = f"[{inv_char_grp}]|{inv_name_grp}"
title = "zz_ FeeRelationship Courtenay: Family"
result = re.sub(regex, "", title)
print(result)
which prints " Courtenay" (the leading space left behind by the removed prefix can be dropped with .strip()).
An explanation of the regular expression:
INVALID_TITLE_CHARS: these need to be escaped so that the regex engine treats them as literal characters instead of using their special meaning. So we join all the characters in INVALID_TITLE_CHARS, then use re.escape to escape the resulting string. This gives us inv_char_grp = r"\]\[\*:\?/\\'". We then enclose it in [ and ] to denote that we want to match any one of those characters, using f"[{inv_char_grp}]".
INVALID_TITLE_NAMES: since these are whole strings, a character group won't work for them. Instead, we use the | operator to indicate that we want to match any of its operands. Remember to escape the names as well, in case they contain any special characters.
The final regex we get is:
[\]\[\*:\?/\\']|zz_\ FeeRelationship|\ Family
[\]\[\*:\?/\\'] : Any of these chars ][*:?/\'
| : Or
zz_\ FeeRelationship : Exactly zz_, then a space, then FeeRelationship
| : Or
\ Family : Exactly one space, then Family
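To tie this back to the function in the question, here is a minimal sketch of how the regex could be folded into clean_sheet_title. Two things here are my assumptions rather than anything from the question: the compiled pattern name INVALID_TITLE_RE is made up for illustration, and I list "Family" without the leading space so that Courtenay:Family (no space after the colon) is also covered.

```python
import re

INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
# Assumption: "Family" without a leading space, so "Courtenay:Family" matches too.
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", "Family"]

# Build the pattern once at import time rather than on every call.
_char_class = "[" + re.escape("".join(INVALID_TITLE_CHARS)) + "]"
_names = "|".join(re.escape(name) for name in INVALID_TITLE_NAMES)
INVALID_TITLE_RE = re.compile(f"{_char_class}|{_names}")  # hypothetical name


def clean_sheet_title(title):
    title = title or ""
    # Remove invalid characters and unwanted names in a single pass.
    title = INVALID_TITLE_RE.sub("", title)
    # Strip last: removing a prefix or suffix can leave spaces at the edges.
    title = title.strip()
    return title[:31]


print(clean_sheet_title("zz_ FeeRelationship Courtenay:Family"))
```

Note that this removes "Family" wherever it appears in the title, not only at the start or end. If that is too aggressive for your data, the pattern would need anchoring.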
Upvotes: 0
Reputation: 286
Try this:
for name in INVALID_TITLE_NAMES:
    title = title.replace(name, "")
Is that the result you are trying to achieve? It should replace each invalid name in title with an empty string.
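For context, here is a rough sketch of how that loop might sit inside the whole function. It assumes Python 3, where str.translate accepts only the mapping (a second argument raises a TypeError), and it moves the strip() to the end so that spaces left behind by a removed prefix or suffix are trimmed. Using "Family" without a leading space is also my assumption.

```python
INVALID_TITLE_CHARS = ["]", "[", "*", ":", "?", "/", "\\", "'"]
INVALID_TITLE_CHAR_MAP = {ord(x): "" for x in INVALID_TITLE_CHARS}
# Assumption: "Family" without a leading space, so "Courtenay:Family" is handled.
INVALID_TITLE_NAMES = ["zz_ FeeRelationship", "Family"]


def clean_sheet_title(title):
    title = title or ""
    # In Python 3, str.translate takes exactly one argument: the mapping.
    title = title.translate(INVALID_TITLE_CHAR_MAP)
    for name in INVALID_TITLE_NAMES:
        title = title.replace(name, "")
    # Strip after the replacements, then apply the 31-character limit.
    return title.strip()[:31]


print(clean_sheet_title("zz_ FeeRelationship Courtenay:Family"))
```

Because the characters are removed before the names, a title such as Fam:ily would first become Family and then be removed entirely, so the order of the two steps matters.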
Upvotes: 0