Reputation: 2303
I'm having an issue with base64 images that are not converting correctly sometimes. I need a way to test if the image is in correct base64 format before converting it so I can try to look further into the problem. I have found some regex formulas online, but I think they only expect the string without the headers. I have the string with the headers. I tried to add the headers, but it keeps breaking.
The original:
^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
Then I added the headers, but it doesn't work:
^([data:image/png;base64,][A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
Thank you
Upvotes: 1
Views: 2731
Reputation: 5500
You may notice the use of [square brackets] in the original regex. These create character sets that match any single character listed within, so [data:image/png;base64,] matches one of d, a, t, a, ..., 6, 4 or a comma. Instead, since I think you're trying to make the header optional, you may want a non-capturing group, like this: (?:data:image/png;base64,)?
Place it once, ahead of the repeated group, so the header can only appear at the very start of the string:
^(?:data:image/png;base64,)?([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
^ # Anchors to the beginning of the string.
(?:data:image/png;base64, # Opens NCG1
# Literal data:image/png;base64,
)? # Closes NCG1
# ? repeats zero or one times
( # Opens CG1
[A-Za-z0-9+/] # Character class (any of the characters within)
# Anything between A and Z
# Anything between a and z
# Anything between 0 and 9
# Any of: +/
{4} # Repeats 4 times.
)* # Closes CG1
# * repeats zero or more times
( # Opens CG2
[A-Za-z0-9+/] # Character class (any of the characters within)
# Anything between A and Z
# Anything between a and z
# Anything between 0 and 9
# Any of: +/
{4} # Repeats 4 times.
| # Alt (CG2)
[A-Za-z0-9+/] # Character class (any of the characters within)
# Anything between A and Z
# Anything between a and z
# Anything between 0 and 9
# Any of: +/
{3} # Repeats 3 times.
= # Literal =
| # Alt (CG2)
[A-Za-z0-9+/] # Character class (any of the characters within)
# Anything between A and Z
# Anything between a and z
# Anything between 0 and 9
# Any of: +/
{2} # Repeats 2 times.
== # Literal ==
) # Closes CG2
$ # Anchors to the end of the string.
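To see the difference between the bracketed set and the non-capturing group described above, here is a minimal sketch. It assumes Python's standard re module (the question does not say which regex engine is in use), and the variable names are only illustrative.

```python
import re

HEADER = "data:image/png;base64,"

# Square brackets build a character class: it matches ONE character from the set.
char_class = re.compile(r"[data:image/png;base64,]")

# A non-capturing group matches the whole literal; "?" makes it optional.
optional_header = re.compile(r"(?:data:image/png;base64,)?")

class_match = char_class.match(HEADER)
group_match = optional_header.match(HEADER)
no_header_match = optional_header.match("iVBORw0KGgo=")

print(repr(class_match.group()))      # only the first character of the header
print(repr(group_match.group()))      # the full header
print(repr(no_header_match.group()))  # empty string: the header is optional
```

The character class consumes a single character, which is why the bracketed version breaks, while the group either consumes the complete header or nothing.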
If, however, you want to require the header, drop the non-capturing group and the ? quantifier and keep the literal header once, ahead of the repeated group:
^data:image/png;base64,([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
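A quick way to check a required-header pattern is to run it against a few known strings. The sketch below assumes Python's re module and matches the header once at the start, followed by the Base64 body. The helper name has_png_header_and_base64 is hypothetical, not something from the question.

```python
import re

# Literal header once at the start, then the Base64 body.
REQUIRED_HEADER = re.compile(
    r"^data:image/png;base64,"
    r"(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)


def has_png_header_and_base64(value: str) -> bool:
    """Hypothetical helper: True if value is the PNG header plus well-formed Base64."""
    # fullmatch also rejects a trailing newline, which "$" alone would tolerate in Python.
    return REQUIRED_HEADER.fullmatch(value) is not None


for candidate in (
    "data:image/png;base64,YWJj",          # header + one block
    "data:image/png;base64,YWJjZGVmZ2hp",  # header + three blocks
    "data:image/png;base64,YQ==",          # header + padded block
    "YWJj",                                # no header
    "data:image/png;base64,YWJ",           # truncated block
):
    print(candidate, has_png_header_and_base64(candidate))
```

Testing with a payload longer than two blocks matters here, because a pattern that repeats the header inside the group can still pass very short inputs.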
Upvotes: 3
Reputation: 49086
The regular expression
^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
What do all those characters mean:
^ ... find a string which starts at the beginning of a line or string buffer.
( ... ) ... define a marking group for back referencing the string found by the expression inside the parentheses, or for applying a multiplier as used here. Grouping an expression just for applying a multiplier is usually better done with a non marking group, i.e. with (?: ... ), where the question mark and the colon immediately after the opening parenthesis make the group a non marking group.
[ ... ] ... define a positive character class, which means that any one of the characters within the square brackets should be found once for a positive match. [^ ... ] would be a negative character class definition, which means that any character except one of the characters in the square brackets should be found.
[A-Za-z0-9+/] ... a character being either an upper case or a lower case letter from the ASCII table, a digit, the plus sign or a slash.
{4} ... is a multiplier and means the previous expression or character exactly four times.
* ... is also a multiplier and means the previous expression or character 0 or more times.
| ... means OR.
$ ... means end of line without matching the line terminator, or end of string buffer.
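So this expression means: zero or more blocks of four Base64 characters, followed by one final block that is either four Base64 characters, three Base64 characters and one equals sign, or two Base64 characters and two equals signs.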
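The three alternatives in the last group correspond to the three ways a Base64 string can end. As a rough illustration, assuming Python's standard re and base64 modules, encoding one, two and three bytes produces exactly those three endings, and a few malformed strings are rejected:

```python
import base64
import re

# The original expression from the question, unchanged.
BASE64_RE = re.compile(
    r"^([A-Za-z0-9+/]{4})*"
    r"([A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)

# 1 byte -> two padding signs, 2 bytes -> one padding sign, 3 bytes -> none.
samples = {n: base64.b64encode(b"abc"[:n]).decode("ascii") for n in (1, 2, 3)}
for n, encoded in samples.items():
    print(n, encoded, BASE64_RE.fullmatch(encoded) is not None)
# 1 YQ== True
# 2 YWI= True
# 3 YWJj True

# Wrong length, padding in the middle, or a character outside the alphabet.
malformed = ["YQ=", "YWJ", "YQ==YWJj", "YW J"]
rejected = [BASE64_RE.fullmatch(s) is None for s in malformed]
print(rejected)
```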
To additionally allow an optional header string at the beginning of the line or string buffer, the expression should be modified to:
^(?:data:image/png;base64,)?(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$
The question mark after the non marking group (?:data:image/png;base64,) means here that the previous expression (just a fixed string) may occur zero or one times.
As you can see, I also changed the 2 marking groups into 2 non marking groups by inserting ?: after the opening parentheses.
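Since the goal in the question is to test the string before converting it, the modified expression can be wrapped in a small check. This is only a sketch under assumptions: it uses Python's re, base64 and binascii modules, and decode_if_valid is a hypothetical helper name. The decode step is an extra cross-check that I added, not part of the regex itself.

```python
import base64
import binascii
import re
from typing import Optional

HEADER = "data:image/png;base64,"

# The modified expression with the optional header and non marking groups.
DATA_URI_RE = re.compile(
    r"^(?:data:image/png;base64,)?"
    r"(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)$"
)


def decode_if_valid(value: str) -> Optional[bytes]:
    """Hypothetical helper: return the decoded bytes, or None if the shape is wrong."""
    if DATA_URI_RE.fullmatch(value) is None:
        return None
    # Strip the header, if present, before handing the payload to the decoder.
    payload = value[len(HEADER):] if value.startswith(HEADER) else value
    try:
        return base64.b64decode(payload, validate=True)
    except binascii.Error:
        return None


print(decode_if_valid("data:image/png;base64,YWJj"))   # with header
print(decode_if_valid("YWI="))                         # without header
print(decode_if_valid("data:image/png;base64,YWJ"))    # truncated payload
print(decode_if_valid("data:image/jpeg;base64,YWJj"))  # different header
```

A string that passes this check is well-formed Base64, but that says nothing about whether the decoded bytes are a valid PNG, so a failing conversion may still need a look at the image data itself.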
Upvotes: 2