Reputation: 86697
I'd like to use a regex to validate and extract values from a postal code that combines a two-letter ISO country code and a ZIP code, in the following format:
DE-12345
So far I came up with: [a-zA-Z]{2}-\d+
Could I improve this?
Further question: which regex can I use to extract only:
the two letters
the digits?
Upvotes: 0
Views: 1126
Reputation: 718768
Strictly speaking, ZIP codes are a post code / postal code system used within the United States of America.
Validating international post codes / postal codes is going to be tricky. Different countries use wildly different systems with different allowed characters, different numbers of characters and different "punctuation". Even the US postal system uses two forms of ZIP code; i.e. 5 digit, and 5 + 4 digit.
The Wikipedia page for postal codes lists the formats for a number of countries, but you may need to research further.
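To illustrate how per-country formats could be handled, here is a minimal sketch that keeps one pattern per country code. The class name `PostalCodeFormats`, the method `isValid`, and the choice of only two countries are assumptions made for illustration; the US pattern covers the 5 digit and 5 + 4 digit forms mentioned above, and the German one assumes five digits. It needs Java 9+ for `Map.of`.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

class PostalCodeFormats {
    // Illustrative subset only; real coverage needs one entry per supported country.
    private static final Map<String, Pattern> FORMATS = Map.of(
        "US", Pattern.compile("\\d{5}(-\\d{4})?"), // ZIP or ZIP+4
        "DE", Pattern.compile("\\d{5}")            // five digits
    );

    /** Returns true only if the country is known and the code matches its format. */
    static boolean isValid(String country, String code) {
        Pattern p = FORMATS.get(country.toUpperCase(Locale.ROOT));
        return p != null && p.matcher(code).matches();
    }
}
```

Countries missing from the map are rejected here; whether to reject or accept unknown countries is a design choice that depends on the application.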
Upvotes: 0
Reputation: 1887
Ignoring the fact that every country has a completely different format: to get the matching parts in Java, wrap them in parentheses (capture groups) and read each group from the matcher.
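import java.util.regex.Matcher;
import java.util.regex.Pattern;

// In a Java string literal the backslash must be escaped, hence "\\d".
Pattern p = Pattern.compile("([a-zA-Z]{2})-(\\d+)");
Matcher m = p.matcher("DE-123");
if (m.matches()) {
    String letters = m.group(1); // "DE"
    String numbers = m.group(2); // "123"
}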
Upvotes: 3
Reputation: 9591
How to improve it depends on what surrounds the postal codes. For example, if they are embedded in a full page of text, the choice of regex matters more.
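As a rough sketch of the embedded-in-text case, the pattern can be wrapped in word boundaries (`\b`) and applied with `find()` instead of `matches()`, so that a fragment such as the `DE-1` inside `ABCDE-1` is not picked up. The class name `PostalCodeFinder` and the method `findAll` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PostalCodeFinder {
    // \b keeps a match from starting or ending inside a longer word or number.
    private static final Pattern CODE = Pattern.compile("\\b[a-zA-Z]{2}-\\d+\\b");

    /** Collects every code-like token found anywhere in the text. */
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CODE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

When the input is a single field that should contain nothing but the code, `matches()` on the original pattern is enough and the boundaries are unnecessary.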
For the moment, your regex works perfectly fine.
The only improvement I can think of is to take a list of all valid country codes and build one large alternation, so that only valid codes can match.
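One possible way to build such an alternation in Java without typing the list by hand is to join the two-letter codes returned by `Locale.getISOCountries()`. This is only a sketch: the class name `CountryCodeCheck` and the method `isValid` are invented here, and the set of codes depends on the JDK in use. `CASE_INSENSITIVE` is an assumption that lowercase input should still be accepted, as with `[a-zA-Z]`.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class CountryCodeCheck {
    // Builds "(AD|AE|...|ZW)-(\d+)" from the JDK's list of ISO 3166 two-letter codes.
    private static final Pattern CODE = Pattern.compile(
        "(" + String.join("|", Locale.getISOCountries()) + ")-(\\d+)",
        Pattern.CASE_INSENSITIVE);

    /** Returns true if the whole input is a known country code, a hyphen, and digits. */
    static boolean isValid(String input) {
        return CODE.matcher(input).matches();
    }
}
```

An alternative with the same effect would be to keep the short `[a-zA-Z]{2}` pattern and look up group 1 in a set of valid codes afterwards.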
To extract the letters and numbers, you would wrap them in capture groups:
([a-zA-Z]{2})-(\d+)
The first pair of parentheses is Group 1, and the second pair is Group 2.
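If numbered groups feel fragile, the same pattern could be written with named groups, which Java supports since version 7. A small sketch, with the group names `country` and `digits` and the class `NamedGroupExtractor` chosen purely for illustration:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupExtractor {
    private static final Pattern CODE =
        Pattern.compile("(?<country>[a-zA-Z]{2})-(?<digits>\\d+)");

    /** Returns {country, digits}, or empty if the input does not match as a whole. */
    static Optional<String[]> extract(String input) {
        Matcher m = CODE.matcher(input);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group("country"), m.group("digits") });
    }
}
```

Named groups keep working if another group is later inserted into the pattern, whereas the numeric indices would shift.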
Upvotes: 0