Reputation: 1225
My function finds six-digit hexadecimal CSS colors in a string and replaces them with the short notation where possible.
For example, #000000 can be represented as #000.
import re

def to_short_hex(string):
    match = re.findall(r'#[\w\d]{6}\b', string)
    for i in match:
        if not re.findall(r'#' + i[1] + '{6}', i):
            match.pop(match.index(i))
    for i in match:
        string = string.replace(i, i[:-3])
    return string
to_short_hex('text #FFFFFF text #000000 #08088')
Out:
text #FFF text #000 #08088
Is there any way to optimize my code, for example with a list comprehension?
Upvotes: 1
Views: 1073
Reputation: 17920
This is what re.sub is for! It's not a great idea to use a regex to find something and then do a further sequence of search-and-replace operations to change it. For one thing, it's easy to accidentally replace things you didn't mean to, and for another it does a lot of redundant work.
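To make the "accidentally replace things" point concrete, here is a minimal sketch. The helper names to_short_hex_replace and to_short_hex_sub are made up for this illustration, and the hex-only character class is an assumption rather than the pattern from the question.

```python
import re

def to_short_hex_replace(s):
    # Find-then-replace: str.replace rewrites EVERY occurrence of the
    # matched text, including ones the regex itself never matched.
    matches = re.findall(r'#[\da-fA-F]{6}\b', s)
    for m in matches:
        if m[1:] == m[1] * 6:
            s = s.replace(m, m[:4])
    return s

def to_short_hex_sub(s):
    # Single pass: each match is rewritten only at its own position.
    def shorten(match):
        h = match.group(0)
        return h[:4] if h[1:] == h[1] * 6 else h
    return re.sub(r'#[\da-fA-F]{6}\b', shorten, s)

text = '#000000 #0000001'
print(to_short_hex_replace(text))  # '#000 #0001': the 7-digit token got damaged
print(to_short_hex_sub(text))      # '#000 #0000001'
```

The seven-character token fails the \b check, so the regex skips it, yet str.replace still finds '#000000' inside it and rewrites it.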
Also, you might want to shorten '#aaccee' to '#ace'. The function below uses re.sub and handles that case too:
import re

def to_short_hex(s):
    def shorten_match(match):
        hex_string = match.group(0)
        # Shorten only when each pair of digits is a repeated character.
        if hex_string[1::2] == hex_string[2::2]:
            return '#' + hex_string[1::2]
        return hex_string
    return re.sub(r"#[\da-fA-F]{6}\b", shorten_match, s)
re.sub can take a function to apply to each match. It receives the match object and returns the string to substitute at that point.
Slice notation allows you to apply a stride. hex_string[1::2] takes every second character from the string, starting at index 1 and running to the end of the string. hex_string[2::2] takes every second character from the string, starting at index 2 and running to the end. So for the string "#aaccee", we get "ace" and "ace", which match. For the string "#123456", we get "135" and "246", which don't match.
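The stride comparison can be tried on its own. This is a small sketch; split_pairs is a hypothetical helper introduced only for the demonstration.

```python
def split_pairs(hex_string):
    # Index 0 is '#'; indices 1, 3, 5 are the first digit of each pair,
    # indices 2, 4, 6 are the second digit of each pair.
    return hex_string[1::2], hex_string[2::2]

for sample in ("#aaccee", "#123456", "#FFFFFF"):
    first, second = split_pairs(sample)
    print(sample, first, second, first == second)
# #aaccee ace ace True
# #123456 135 246 False
# #FFFFFF FFF FFF True
```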
Upvotes: 2
Reputation: 9172
How about this? You could speed it up by embedding is6hexdigit into to_short_hex, but I wanted it to be more readable.
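hexdigits = "0123456789abcdef"

def is6hexdigit(sub):
    l = sub.lower()
    # bool(l) guards against empty chunks, e.g. when the input starts with '#'.
    return bool(l) and (l[0] in hexdigits) and (l.count(l[0]) == 6)

def to_short_hex(may_have_hexes):
    # The text before the first '#' is never a color, so it is kept as-is.
    head, *rest = may_have_hexes.split('#')
    replaced = ((sub[3:] if is6hexdigit(sub[:6]) else sub)
                for sub in rest)
    return '#'.join([head, *replaced])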
Upvotes: 3
Reputation: 151007
Using pop on a list while iterating over it is always a bad idea, so this isn't an optimization but a bug fix. I also edited the regular expression so that strings like '#34j342' are no longer accepted:
>>> import re
>>> def to_short_hex(s):
...     matches = re.findall(r'#[\dabcdefABCDEF]{6}\b', s)
...     filtered = [m for m in matches if re.findall(r'#' + m[1] + '{6}', m)]
...     for m in filtered:
...         s = s.replace(m, m[:-3])
...     return s
...
>>> to_short_hex('text #FFFFFF text #000000 #08088')
'text #FFF text #000 #08088'
Also, I think re.search is sufficient for the second regular expression.
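To show why the pop is a real bug and not just a style problem, here is a small sketch. The function names filter_with_pop and filter_with_comprehension are invented for this illustration, and the sample list is an assumed input.

```python
import re

def filter_with_pop(matches):
    # Mirrors the question's loop: items are removed from the list
    # while the same list is being iterated over.
    matches = list(matches)  # copy so the caller's list is untouched
    for m in matches:
        if not re.search('#' + m[1] + '{6}', m):
            matches.pop(matches.index(m))
    return matches

def filter_with_comprehension(matches):
    # Builds a new list, so no element is skipped.
    return [m for m in matches if re.search('#' + m[1] + '{6}', m)]

found = ['#123456', '#abcdef', '#000000']
print(filter_with_pop(found))            # ['#abcdef', '#000000']
print(filter_with_comprehension(found))  # ['#000000']
```

After '#123456' is popped, the remaining items shift left and the loop moves on to the next index, so '#abcdef' is never checked. It stays in the list and would later be wrongly shortened to '#abc'.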
Upvotes: 1