Swift-Tuttle

Reputation: 485

Matching contents of a String with a pattern

I am struggling to write a regex pattern for use with the matches() method of String. My String value looks something like this:

3012145A_20348409-146139460.ABCDxyzPQr.1.1.xml

I am using the String.matches("regex") method, but to be honest I am struggling to create a pattern that matches String values like these. I have tried a few different combinations, in vain so far, and searched the internet for examples. The values will always be in a similar format, though the length might vary.

Any help is much appreciated.


There is more to it than just matching .xml.
Apart from the example given, there will be other values in the List too, so I need to match values like

3012145A_20348409-146139460.ABCDxyzPQr.1.1.xml  

The list of values could be like -

3012145A_20348409-146139460.ABCDxyzPQr.1.1.xml
3012145_Error.xml
3012145_UK.pdf
3012145A_20348409.ABC.10.10.10.xml

I need only the first of these values, which has the format

(alphanum)(underscore)(num)(hyphen)(num)(dot)(aLpHa)(dot)(num)(dot)(num)(dot)(.xml)  

I tried this -

s.matches("[a-zA-Z0-9]_[0-9]-[0-9].[a-zA-Z].[0-9].[0-9].xml");

Upvotes: 0

Views: 785

Answers (2)

Swift-Tuttle

Reputation: 485

Brilliant! Thanks a lot, Favonius.
That worked perfectly.
So, as I understand it, even though I was giving a range such as [0-9a-zA-Z], the pattern was only matching a single character against it, in my example the 3.
In other words, rather than checking 3012145A as a whole, it was only checking whether 3 is part of the given range ([0-9a-zA-Z]), and likewise for each range in the rest of the pattern.
With your solution, \w* checks that the whole section is made up of word characters (letters, digits, underscore), and \d* checks that the whole section (bounded by delimiters such as . or _) consists of digits.

So a much murkier way of matching 3012145A_ could be

[0-9][0-9][0-9][0-9][0-9][0-9][0-9][a-zA-Z]_

I am not proposing this as a solution, just trying to understand the behavior and the difference between [0-9] and \d*.
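To make that difference concrete, here is a minimal sketch. The class name QuantifierDemo and the helper whole() are made up purely for illustration; the only real API used is String.matches(), which succeeds only when the entire string matches the regex.

```java
class QuantifierDemo {
    // Hypothetical helper: String.matches() must match the WHOLE input.
    static boolean whole(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // A character class without a quantifier consumes exactly one character.
        System.out.println(whole("3", "[0-9]"));        // true
        System.out.println(whole("3012145", "[0-9]"));  // false

        // With a quantifier the class may repeat.
        System.out.println(whole("3012145", "[0-9]*")); // true
        System.out.println(whole("3012145", "\\d+"));   // true

        // * also accepts zero occurrences, + requires at least one.
        System.out.println(whole("", "\\d*"));          // true
        System.out.println(whole("", "\\d+"));          // false

        // A compact form of the "murky" pattern: {7} repeats the class seven times.
        System.out.println(whole("3012145A_", "[0-9]{7}[a-zA-Z]_")); // true
    }
}
```

The {7} form only works if the prefix always has exactly seven digits, which is why a quantifier such as * or + suits values whose length varies.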

I still have a question, though: what is the purpose of (\\.)?\\. in the pattern?

Thanks a lot again

Upvotes: 0

Favonius

Reputation: 13974

Requirement:

(alphanum)(underscore)(num)(hyphen)(num)(dot)(aLpHa)(dot)(num)(dot)(num)(dot)(.xml)

Proposed regex:

\w*_\d*-\d*\.([a-zA-Z])*\.\d*\.\d*(\.)?\.xml

In Java this translates to:

Pattern p = Pattern.compile("\\w*_\\d*-\\d*\\.([a-zA-Z])*\\.\\d*\\.\\d*(\\.)?\\.xml",Pattern.CASE_INSENSITIVE);
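As a rough sketch of how the compiled pattern might be applied to the list from the question (the class name FileNameFilter and the method filter() are invented for this example, not part of any library):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FileNameFilter {
    // Same pattern as above, compiled once so it can be reused for every value.
    static final Pattern NAME = Pattern.compile(
            "\\w*_\\d*-\\d*\\.([a-zA-Z])*\\.\\d*\\.\\d*(\\.)?\\.xml",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: keeps only the names that match the pattern in full.
    static List<String> filter(List<String> names) {
        List<String> result = new ArrayList<String>();
        for (String name : names) {
            // Matcher.matches() requires the entire input to match, like String.matches().
            if (NAME.matcher(name).matches()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "3012145A_20348409-146139460.ABCDxyzPQr.1.1.xml",
                "3012145_Error.xml",
                "3012145_UK.pdf",
                "3012145A_20348409.ABC.10.10.10.xml");
        // Only the first value contains the underscore, hyphen and dot-separated parts.
        System.out.println(filter(names));

        // (\\.)? makes one extra dot before ".xml" optional, so this variant matches too.
        System.out.println(
                NAME.matcher("3012145A_20348409-146139460.ABCDxyzPQr.1.1..xml").matches());
    }
}
```

The optional group (\\.)? is there to cover the "(dot)(.xml)" at the end of the requirement, which can be read as two dots in a row. If the values never contain a double dot, a single \\. before xml should be enough.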

Note

Since I am using [a-zA-Z], you might not need Pattern.CASE_INSENSITIVE.

Problem with your regex: s.matches("[a-zA-Z0-9]_[0-9]-[0-9].[a-zA-Z].[0-9].[0-9].xml");

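Each character class, such as [a-zA-Z0-9] or [0-9], matches exactly one character, so your pattern looks for a single alphanumeric character, then a single digit, and so on. Use the * or + quantifiers to match a run of characters. Also, an unescaped . matches any character; to match a literal dot, write \\. in a Java string.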
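A small sketch of both points, using String.matches() as in the question. The class name StrictPatternDemo and the STRICT variant built with + are my own illustration, not something the question asked for; whether empty sections should be rejected is an assumption.

```java
class StrictPatternDemo {
    // Pattern from this answer: * lets every section be empty.
    static final String LENIENT =
            "\\w*_\\d*-\\d*\\.([a-zA-Z])*\\.\\d*\\.\\d*(\\.)?\\.xml";

    // Assumed stricter variant: + requires at least one character per section.
    static final String STRICT =
            "[a-zA-Z0-9]+_[0-9]+-[0-9]+\\.[a-zA-Z]+\\.[0-9]+\\.[0-9]+\\.xml";

    public static void main(String[] args) {
        String good = "3012145A_20348409-146139460.ABCDxyzPQr.1.1.xml";

        // The original attempt fails: each class consumes only one character.
        System.out.println(good.matches(
                "[a-zA-Z0-9]_[0-9]-[0-9].[a-zA-Z].[0-9].[0-9].xml")); // false

        System.out.println(good.matches(LENIENT)); // true
        System.out.println(good.matches(STRICT));  // true

        // Only separators, no letters or digits: * accepts it, + does not.
        System.out.println("_-....xml".matches(LENIENT)); // true
        System.out.println("_-....xml".matches(STRICT));  // false

        // An unescaped dot matches any character, an escaped one only a dot.
        System.out.println("aXxml".matches("a.xml"));   // true
        System.out.println("aXxml".matches("a\\.xml")); // false
    }
}
```

Which quantifier fits depends on whether a section may legitimately be empty in your data.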

Hope this helps.

Upvotes: 3
