Reputation: 9098
I am using the dk.brics.automaton library. I have a file containing regular expressions, and I want to compute the number of DFA states for each of them. For example, the file contains this RE: "/^\x3c(REQIMG|RVWCFG)\x3e/ism"
which I read into the string array element retval[0]. The code works fine when the RE comes from the file. The problem is that when I skip the file and pass the same RE directly to the RegExp constructor as a string literal, the compiler reports an invalid escape. When I write the RE as
"/^\\x3c(REQIMG|RVWCFG)\\x3e/ism"
the error goes away.
I don't understand why I don't get this invalid escape error when I read the RE from a file.
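import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import dk.brics.automaton.Automaton;
import dk.brics.automaton.RegExp;

String line = null;
String[] retval;
int j = 0;
// try-with-resources closes the reader even if reading fails
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(fileName))) {
    while ((line = bufferedReader.readLine()) != null) {
        // each line holds tab-separated regular expressions
        retval = line.split("\t");
        for (int i = 0; i < retval.length; i++) {
            try {
                j = j + 1;
                RegExp r = new RegExp(retval[i], RegExp.ALL);
                Automaton a = r.toAutomaton();
                System.out.println("RE : " + retval[i]);
                System.out.println("States" + a.getNumberOfStates());
            } catch (Exception ex) {
                // expressions that fail to parse are skipped silently
            }
        }
    }
} catch (IOException ex) {
    ex.printStackTrace();
}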
Upvotes: 0
Views: 239
Reputation: 9019
In Java, a backslash \ inside a string literal starts an escape sequence, so it has special meaning to the compiler. To tell the compiler that you want an actual backslash rather than the start of an escape sequence, you have to escape it with another backslash.
Why the error in the literal string?
In your example, when the compiler encounters \x it treats the backslash as the start of an escape sequence, but \x is not a valid escape sequence in Java. Hence the error. That's why you have to escape the \ in string literals by writing \\x: "/^\\x3c(REQIMG|RVWCFG)\\x3e/ism"
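To make the point concrete, here is a minimal sketch of what the escaped literal actually contains at run time. The class name EscapeDemo and the constant LITERAL are made up for illustration; only the regex text comes from the question.

```java
public class EscapeDemo {
    // Hypothetical constant: the source text contains \\x, but the compiler
    // collapses each \\ into a single backslash character in the String.
    static final String LITERAL = "/^\\x3c(REQIMG|RVWCFG)\\x3e/ism";

    public static void main(String[] args) {
        // The run-time value has single backslashes
        System.out.println(LITERAL);           // prints /^\x3c(REQIMG|RVWCFG)\x3e/ism
        // The third character is one backslash, not two
        System.out.println(LITERAL.charAt(2)); // prints \
    }
}
```

So the doubled backslash exists only in the source file; the string handed to RegExp is the same text you would have typed into the regex file.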
Why no error when reading from file?
When reading from a file, however, you are not dealing with a literal. Escape sequences are processed only by the compiler, and only in source code; text read at run time is stored in the variable character for character, exactly as it appears in the file. Hence you do not have to escape backslashes there, and the regex stays as it is meant to be: /^\x3c(REQIMG|RVWCFG)\x3e/ism
Side Note:
Unfortunately, Java does not have verbatim string literals (yet) like .NET has. In C#, for example, prefixing the literal with @ makes the string verbatim, so backslashes need no escaping:
RegExp(@"/^\x3c(REQIMG|RVWCFG)\x3e/ism",...)
Upvotes: 3
Reputation: 37813
"\x"
is an invalid escape sequence in a Java string literal. You have to escape the backslash: "\\x".
The string literal "\\x" represents the string containing \x. You only have to escape it in source code; when you read \x from a file, there is no problem.
Assume your file contains only the following line (no leading or trailing whitespace):
\x
and you read the contents of the file into a string:
String fileContent = readFileContent();
Now
boolean equal = "\\x".equals(fileContent);
sets equal to true.
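The comparison above can be tried out with a small self-contained sketch. The class name FileEscapeDemo, the method literalMatchesFile, and this particular readFileContent implementation are assumptions for illustration; the file is a temporary one created just for the check.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileEscapeDemo {
    // Hypothetical stand-in for readFileContent(): reads the whole file as UTF-8 text.
    static String readFileContent(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    static boolean literalMatchesFile() throws IOException {
        Path file = Files.createTempFile("regex", ".txt");
        try {
            // Write two raw bytes, a backslash (0x5C) and 'x' (0x78),
            // so that no string literal is involved on the file side.
            Files.write(file, new byte[] {0x5C, 0x78});
            // The literal "\\x" is also exactly these two characters.
            return "\\x".equals(readFileContent(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(literalMatchesFile()); // prints true
    }
}
```

Writing the bytes directly keeps the file side free of any Java escaping, which is the situation the asker has when the regex is typed into a text file.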
Upvotes: 2