goalie35

Reputation: 786

Regex: Need to validate barcode

I have the following barcode that I need to validate via regex:

TE1310 2000183B 804F58000020183B 20120509 0013.0002.0000 20161201

We're having an issue with our barcode scanners occasionally cutting off some characters from barcodes, so I need to validate it via the following regex rules:

  1. Starts with "TE1310"
  2. Space
  3. 2nd set of characters is exactly 8 length. Can contain numbers or letters
  4. Space
  5. 3rd set contains exactly 16 characters. Can be numbers or letters
  6. Space
  7. 4th set must be exactly "0013.0002.0000"
  8. Space
  9. 5th and final set contains 8 characters. Numeric only

I have the following regex & I'm pretty close but not sure how to do #7 above (0013.0002.0000). I placed "????" into my regex below where I'm unsure of how to do this part:

TE1310\s[A-Za-z0-9]{8}\s[A-Za-z0-9]{16}\s????\s\d{8}

Any idea how to do this? Thanks

Upvotes: 1

Views: 2846

Answers (2)

zzzzBov

Reputation: 179046

I'm assuming a regular expression syntax similar to JavaScript's; the basic ideas carry over to any other regex flavor that I know of.

1: Starts with TE1310

^TE1310

^ matches only at the beginning of a string; the characters that follow are matched literally.

2: Space

/^TE1310 /

I'm adding the / regex delimiters to show that there is in fact a space character contained within the regex. If your regex syntax supports alternative delimiters, you might see something along the lines of ~^TE1310 ~ instead.

3: 2nd set of characters is exactly 8 length. Can contain numbers or letters

/^TE1310 [a-zA-Z0-9]{8}/

[abc] matches a single character from the provided set; a-zA-Z0-9 is used here to match any letter (upper or lower case) or digit.
{n} repeats the previous selector exactly n times.
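
As a quick sketch of how the character class and the {n} quantifier behave together: the sample strings other than 2000183B are made up for illustration, and the ^ and $ anchors are added here only so that each string is tested as a whole.

```javascript
// Exactly eight letters or digits; ^ and $ make the test apply to the whole string.
const eightAlnum = /^[a-zA-Z0-9]{8}$/;

console.log(eightAlnum.test("2000183B"));  // true: eight alphanumerics
console.log(eightAlnum.test("2000183"));   // false: only seven characters
console.log(eightAlnum.test("2000183B9")); // false: nine characters
console.log(eightAlnum.test("2000-83B"));  // false: "-" is not in the set
```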

4: Space

/^TE1310 [a-zA-Z0-9]{8} /

5: 3rd set contains exactly 16 characters. Can be numbers or letters

/^TE1310 [a-zA-Z0-9]{8} [a-zA-Z0-9]{16}/

6: Space

/^TE1310 [a-zA-Z0-9]{8} [a-zA-Z0-9]{16} /

7: 4th set must be exactly 0013.0002.0000

/^TE1310 [a-zA-Z0-9]{8} [a-zA-Z0-9]{16} 0013\.0002\.0000/

\. escapes the ., which would otherwise match any non-newline character. If you're building the regex in a string, you may need to double-escape the \ character, so it may be \\. instead of \.
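
To make the escaping point concrete, here is a small sketch comparing the literal form, the string form, and an unescaped version. The variable names and the "bad" input are made up for illustration.

```javascript
// Literal form: one backslash escapes each dot.
const literal = /0013\.0002\.0000/;

// String form: the backslash itself must be escaped inside the string literal.
const fromString = new RegExp("0013\\.0002\\.0000");

// Unescaped dots match any non-newline character, so this is too permissive.
const unescaped = /0013.0002.0000/;

const good = "0013.0002.0000";
const bad = "0013X0002X0000"; // made-up input with the dots replaced

console.log(literal.test(good), literal.test(bad));       // true false
console.log(fromString.test(good), fromString.test(bad)); // true false
console.log(unescaped.test(good), unescaped.test(bad));   // true true
```

The first two forms should behave identically; only the way the pattern is written down differs.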

8: Space

/^TE1310 [a-zA-Z0-9]{8} [a-zA-Z0-9]{16} 0013\.0002\.0000 /

9: 5th and final set contains 8 characters. Numeric only

/^TE1310 [a-zA-Z0-9]{8} [a-zA-Z0-9]{16} 0013\.0002\.0000 \d{8}/

\d matches a single digit; in JavaScript it's equivalent to [0-9]. As with \., you may need to double-escape the \ character when building the regex in a string, which would be \\d instead.

10: End of string

You didn't mention it explicitly, but I assume the match should only match lines that exactly match this pattern, and aren't followed by trailing numbers/letters:

/^TE1310 [a-zA-Z0-9]{8} [a-zA-Z0-9]{16} 0013\.0002\.0000 \d{8}$/

$ is used to match the very end of the string.
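
Putting it together, a minimal sketch of a validator might look like the following. The names BARCODE_PATTERN and isValidBarcode are hypothetical, and the sample barcode is made up to follow the five-field layout in your rules. Note that the sample in your question has an extra 20120509 field that the rules don't mention, so it would not match this pattern as written; if that field is really part of the barcode, the pattern would need another \d{8} group.

```javascript
// Hypothetical names; the pattern is the one built up step by step above.
const BARCODE_PATTERN =
  /^TE1310 [a-zA-Z0-9]{8} [a-zA-Z0-9]{16} 0013\.0002\.0000 \d{8}$/;

function isValidBarcode(scanned) {
  return BARCODE_PATTERN.test(scanned);
}

// Made-up sample that follows the five-field layout from the rules.
const complete = "TE1310 2000183B 804F58000020183B 0013.0002.0000 20161201";

// The sample from the question, which carries an extra 20120509 field.
const fromQuestion =
  "TE1310 2000183B 804F58000020183B 20120509 0013.0002.0000 20161201";

console.log(isValidBarcode(complete));              // true
console.log(isValidBarcode(complete.slice(0, -1))); // false: last digit cut off
console.log(isValidBarcode(complete.slice(1)));     // false: first character cut off
console.log(isValidBarcode(complete + "9"));        // false: trailing extra character
console.log(isValidBarcode(fromQuestion));          // false: extra field not in the rules
```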

Upvotes: 4

#7 is trivial: it should simply be 0013\.0002\.0000. You have to make sure to escape your periods, and to escape your escape characters if that's what the language requires.

So, try

TE1310\s[A-Za-z0-9]{8}\s[A-Za-z0-9]{16}\s0013\.0002\.0000\s\d{8}

assuming the rest of the points are correct, of course.

Also, as Sednus said, you might want to match the beginning and end of the string. The conventional symbols are ^ for the beginning and $ for the end, but I'd check a reference for your particular language just in case.

If you don't do that, the regex will find any TE1310 2000183B 804F58000020183B 20120509 0013.0002.0000 20161201 in a larger string, such as

asgsdaTE1310 2000183B 804F58000020183B 20120509 0013.0002.0000 20161201qeasdfa
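
A rough JavaScript sketch of the difference, assuming JavaScript-style regex literals: the barcode here is a made-up sample that follows the five-field layout from the rules, and the variable names are only for illustration.

```javascript
// Same pattern with and without the ^ and $ anchors.
const unanchored =
  /TE1310\s[A-Za-z0-9]{8}\s[A-Za-z0-9]{16}\s0013\.0002\.0000\s\d{8}/;
const anchored =
  /^TE1310\s[A-Za-z0-9]{8}\s[A-Za-z0-9]{16}\s0013\.0002\.0000\s\d{8}$/;

// Made-up five-field barcode, then the same barcode buried in junk.
const barcode = "TE1310 2000183B 804F58000020183B 0013.0002.0000 20161201";
const noisy = "asgsda" + barcode + "qeasdfa";

console.log(unanchored.test(noisy)); // true: found inside the larger string
console.log(anchored.test(noisy));   // false: extra characters on both sides
console.log(anchored.test(barcode)); // true: the whole string is the barcode
```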

Upvotes: 2
