Java Regex search two file names

I need some help with Java regular expressions.
I have two files:

file_201111.txt.gz
file_2_201111.txt.gz

I need a single regular expression that matches both file names.

If I use file_[0-9]+.txt.gz, I get the first file. If I use file_[0-9]_[0-9]+.txt.gz, I get the second file.

How can I combine both search patterns to search for the two files?

Thanks

Upvotes: 0

Views: 293

Answers (2)

mypetlion

Reputation: 2524

You have to mark the optional unit as optional with ?. Since it is a multi-character unit, you should group it with () first. Try this:

file(_[0-9])?_[0-9]+\\.txt\\.gz
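Here is a minimal sketch of how that pattern might be used from Java. The class name OptionalGroupDemo and the helper isWanted are made up for illustration. The doubled backslashes are there because the pattern is written as a Java string literal, and Matcher.matches() only succeeds when the whole name matches.

```java
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Same pattern as above. In a Java string literal, "\\." is the regex \. (a literal dot).
    static final Pattern FILE_NAME = Pattern.compile("file(_[0-9])?_[0-9]+\\.txt\\.gz");

    // Hypothetical helper: true only if the entire name matches the pattern.
    static boolean isWanted(String name) {
        return FILE_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {"file_201111.txt.gz", "file_2_201111.txt.gz", "file_ab.txt.gz"};
        for (String name : names) {
            System.out.println(name + " -> " + isWanted(name));
        }
    }
}
```

Note that the optional group only allows a single digit, so a name like file_12_201111.txt.gz would not match. If the middle number can have more digits, (_[0-9]+)? is probably what you want.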

Upvotes: 2

ctwheels

Reputation: 22817

Brief

Since you haven't specified the actual format for all the files, I'll present you with a couple of regular expressions and you can use whichever best matches your needs.


Code

Method 1

This matches file followed by one or more underscores and digits, in any order.


file[_\d]+\.txt\.gz

For all the haters: yes, it will also match file_.txt.gz. To prevent that, you can use file(?:_\d+)+\.txt\.gz instead.
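To make the difference between the two Method 1 patterns concrete, here is a small Java sketch. The class name MethodOneDemo and the constants LOOSE and STRICT are only illustrative names, and the sample file names beyond the two from the question are invented test inputs.

```java
import java.util.regex.Pattern;

class MethodOneDemo {
    // Character-class version: any mix of underscores and digits after "file".
    static final Pattern LOOSE = Pattern.compile("file[_\\d]+\\.txt\\.gz");
    // Grouped version: every underscore must be followed by at least one digit.
    static final Pattern STRICT = Pattern.compile("file(?:_\\d+)+\\.txt\\.gz");

    static boolean loose(String name) {
        return LOOSE.matcher(name).matches();
    }

    static boolean strict(String name) {
        return STRICT.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "file_201111.txt.gz",   // from the question
            "file_2_201111.txt.gz", // from the question
            "file_.txt.gz",         // underscore with no digits
            "file201111.txt.gz"     // digits with no underscore
        };
        for (String name : names) {
            System.out.println(name + " loose=" + loose(name) + " strict=" + strict(name));
        }
    }
}
```

Both patterns accept the two names from the question. Only the character-class version accepts the last two names.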

Method 2

This matches one or two occurrences of the _number pattern, where number is one or more digits.

Both patterns below accomplish the same thing.

file(?:_\d+){1,2}\.txt\.gz
file_\d+(?:_\d+)?\.txt\.gz

Explanation

Method 1

  • file Match this literally
  • [_\d]+ Match one or more of any character in the set (_ or digit)
  • \.txt\.gz Match this literally (note that \. matches a literal dot character .)

Method 2

  • file Match this literally
  • (?:_\d+){1,2} Match _\d+ (underscore followed by one or more digits) once or twice
    • Note that the second option _\d+(?:_\d+)? is essentially the same.
  • \.txt\.gz Match this literally (note that \. matches a literal dot character .)
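As a rough sketch of Method 2 in Java, the two patterns can be run over a list of candidate names to check that they keep the same ones. The class name MethodTwoDemo, the filter helper, and the extra sample names are assumptions for illustration only. In practice the names would probably come from something like a directory listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class MethodTwoDemo {
    // One or two "_digits" groups after "file".
    static final Pattern ONE_OR_TWO = Pattern.compile("file(?:_\\d+){1,2}\\.txt\\.gz");
    // One "_digits" group, optionally followed by a second one.
    static final Pattern OPTIONAL_SECOND = Pattern.compile("file_\\d+(?:_\\d+)?\\.txt\\.gz");

    // Hypothetical helper: keeps the names that match the pattern in full.
    static List<String> filter(Pattern pattern, List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (pattern.matcher(name).matches()) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "file_201111.txt.gz",
            "file_2_201111.txt.gz",
            "file_1_2_3.txt.gz", // three groups: rejected by both patterns
            "file.txt.gz"        // no groups: rejected by both patterns
        );
        System.out.println(filter(ONE_OR_TWO, names));
        System.out.println(filter(OPTIONAL_SECOND, names));
        // Both lines print: [file_201111.txt.gz, file_2_201111.txt.gz]
    }
}
```

Because matches() requires the whole string to match, a name with three number groups is rejected. With find() instead, you would need ^ and $ anchors to get the same behaviour.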

Upvotes: 3
