Reputation: 34424
I have the content below in a text file:
some texting content <img src="cid:part123" alt=""> <b> Test</b>
I read it from the file and store it in a String, i.e. inputString. With this call:
expectedString = inputString.replaceAll("\\<img.*?cid:part123.*?>",
"NewContent");
I get the expected output, i.e.
some texting content NewContent <b> Test</b>
However, if there is an end-of-line character between img and src, it does not work. For example:
<img
src="cid:part123" alt="">
Is there a way to make the regex ignore end-of-line characters in between while matching?
Upvotes: 6
Views: 9955
Reputation: 116538
By default, the . character will not match newline characters. You can enable this behavior by specifying the Pattern.DOTALL flag. In String.replaceAll(), you do this by attaching a (?s) to the front of your pattern:
expectedString = inputString.replaceAll("(?s)\\<img.*?cid:part123.*?>",
"NewContent");
See also Pattern.DOTALL with String.replaceAll
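As a minimal sketch of the two equivalent ways to turn on dotall mode, the class below applies the pattern from the question to the multi-line input. The class name DotallDemo and its two helper methods are made up for illustration; Pattern.compile, Pattern.DOTALL and String.replaceAll are the standard java.util.regex and java.lang calls.

```java
import java.util.regex.Pattern;

public class DotallDemo {
    // Compiled once; Pattern.DOTALL makes '.' match line terminators as well.
    private static final Pattern IMG =
            Pattern.compile("<img.*?cid:part123.*?>", Pattern.DOTALL);

    // Variant 1: pass the flag explicitly to Pattern.compile.
    static String withFlag(String input) {
        return IMG.matcher(input).replaceAll("NewContent");
    }

    // Variant 2: embed the flag in the pattern, as String.replaceAll has no flags argument.
    static String withEmbeddedFlag(String input) {
        return input.replaceAll("(?s)<img.*?cid:part123.*?>", "NewContent");
    }

    public static void main(String[] args) {
        // The tag is split across two lines, as in the question.
        String input = "some texting content <img\nsrc=\"cid:part123\" alt=\"\"> <b> Test</b>";
        System.out.println(withFlag(input));
        System.out.println(withEmbeddedFlag(input));
    }
}
```

Both calls should print some texting content NewContent <b> Test</b>. The precompiled variant is mainly worth it if the same replacement runs many times.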
Upvotes: 3
Reputation: 213401
If you want your dot (.) to match newlines as well, you can use the Pattern.DOTALL flag. Alternatively, in the case of String.replaceAll(), you can add a (?s) at the start of the pattern, which is equivalent to this flag.
From the Pattern.DOTALL Javadoc:
Dotall mode can also be enabled via the embedded flag expression (?s). (The s is a mnemonic for "single-line" mode, which is what this is called in Perl.)
So, you can modify your pattern like this:
expectedStr = inputString.replaceAll("(?s)<img.*?cid:part123.*?>", "Content");
NOTE: You don't need to escape the angle bracket (<).
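One thing to watch with (?s): once the dot matches newlines, the lazy .*? can also run past the end of one tag and into the next while it looks for cid:part123. A different technique, not taken from the answers here, is a negated character class: [^>] matches every character except >, line terminators included, so it needs no flag and stays inside a single tag. The sketch below contrasts the two; the class name NegatedClassDemo, the method names and the a.png sample input are invented for illustration, and the approach assumes no attribute value contains a > character.

```java
public class NegatedClassDemo {
    // [^>]* cannot cross a '>' so the match is confined to one tag,
    // and it matches newlines without needing DOTALL.
    static String replaceWithinTag(String input) {
        return input.replaceAll("<img[^>]*cid:part123[^>]*>", "NewContent");
    }

    // The dotall version from the answers, kept here for comparison.
    static String replaceWithDotall(String input) {
        return input.replaceAll("(?s)<img.*?cid:part123.*?>", "NewContent");
    }

    public static void main(String[] args) {
        // Two img tags: only the second one (split across lines) has the cid.
        String input = "<img src=\"a.png\"> text <img\nsrc=\"cid:part123\" alt=\"\"> <b> Test</b>";
        // Replaces only the second tag.
        System.out.println(replaceWithinTag(input));
        // Starts matching at the first <img and swallows everything up to the second tag's '>'.
        System.out.println(replaceWithDotall(input));
    }
}
```

If the file only ever contains one img tag, both patterns behave the same and the (?s) form is fine.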
Upvotes: 10
Reputation: 242786
You need to use Pattern.DOTALL mode. replaceAll() doesn't take mode flags as a separate argument, but you can enable them in the expression as follows:
expectedString = inputString.replaceAll("(?s)\\<img.*?cid:part123.*?>", ...);
Note, however, that it's not a good idea to parse HTML with regular expressions. It would be better to use an HTML parser instead.
Upvotes: 1