Tiny

Reputation: 27899

Replacing all dots in a string with backslashes in Java

The following code replaces all dots with backslashes in a fully qualified class name (it can be any string).

String str=Test.class.getName().replaceAll("\\.", "\\\\") + ".class";
System.out.println(str);

It requires four backslashes for the replacement string.


Assuming the replacement string is a file path separator, I want to make it independent of the operating system by using java.io.File.separator.

String separator=File.separator+File.separator;
String str=Test.class.getName().replaceAll("\\.", separator) + ".class";
System.out.println(str);

In this case it uses only two backslashes. Why doesn't it need four backslashes, as in the previous case?

Upvotes: 1

Views: 6548

Answers (5)

A4L

Reputation: 17595

From the Javadoc for Matcher#appendReplacement:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

You need to escape the backslash in the replacement string, hence the four backslashes.
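To see that rule in action, here is a minimal sketch. The class name ReplacementEscapes, its helper methods, and the sample input "a.b.C" are made up for illustration. It shows two backslash characters in the replacement producing one literal backslash, and a lone backslash being rejected at runtime (the exact exception type has varied between Java versions, so the sketch catches RuntimeException).

```java
public class ReplacementEscapes {
    // The literal "\\\\" holds two backslash characters; the replacement
    // syntax reads them as one escaped (literal) backslash.
    static String dotsToBackslashes(String s) {
        return s.replaceAll("\\.", "\\\\");
    }

    // The literal "\\" holds a single backslash: an escape with nothing
    // after it, which replaceAll refuses once a match is found.
    static boolean loneBackslashFails(String s) {
        try {
            s.replaceAll("\\.", "\\");
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(dotsToBackslashes("a.b.C"));  // a\b\C
        System.out.println(loneBackslashFails("a.b.C")); // true
    }
}
```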

To make your method portable, quote the replacement string with Matcher#quoteReplacement, as pointed out by @rolfl. (Pattern#quote is not the right tool here: it quotes the pattern, not the replacement.)

System.out.println(Test.class.getName().replaceAll("\\.",
    Matcher.quoteReplacement(File.separator)) + ".class");

Upvotes: 1

rolfl

Reputation: 17707

Java Strings are made of characters. To let Java programmers enter strings as constants in the source code, the language allows you to type them in as characters surrounded by double quotes ("):

 String str = "this is a string";

Some characters are hard to type into the program, like a newline or tab character. Java provides an escape mechanism to let the programmer enter these characters into a String. The escape character is the backslash '\'.

 String str = "this contains a tab\t and newline\n";

The problem is that now there is no easy way to enter a backslash, so the backslash has to escape itself:

 String str = "this contains a backslash \\";

The next problem is that Regular Expressions are complicated things, and they also use the backslash \ as an escape character.

Now in Perl, for example, the regular expression \. matches the literal character '.', because in regular expressions '.' is special and needs to be escaped with a '\'. To write that sequence \. as a string constant in a Java program, we need to escape the '\' as \\, so the Java equivalent is "\\.". Likewise, the Perl regular expression that matches an actual backslash character is \\. In Java code we need to escape both of those backslashes, which gives "\\\\".
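The two layers of escaping can be checked directly by looking at how many characters each literal actually contains. This is only a sketch; the class and constant names are made up for illustration.

```java
import java.util.regex.Pattern;

public class EscapeLayers {
    // Source text "\\." compiles to the two characters \ and .
    static final String DOT_REGEX = "\\.";
    // Source text "\\\\" compiles to the two characters \ and \
    static final String BACKSLASH_REGEX = "\\\\";

    public static void main(String[] args) {
        System.out.println(DOT_REGEX.length());        // 2
        System.out.println(BACKSLASH_REGEX.length());  // 2

        // The regex engine then applies its own escaping rules.
        System.out.println(Pattern.matches(DOT_REGEX, "."));         // true
        System.out.println(Pattern.matches(DOT_REGEX, "x"));         // false
        System.out.println(Pattern.matches(BACKSLASH_REGEX, "\\"));  // true
    }
}
```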

So, the significance here is that the file-separator character on Windows is the backslash \. This single character is stored in the field File.separator. If we wanted to type the same character into a Java program, we would have to escape it as \\, but the '\' is already stored in the field, so we do not need to re-escape it for the Java compiler. We DO still need to escape it for the regular expression replacement.

There are two ways to escape it for the regular expression. You can elect to add a backslash before it with:

"\\" + File.separator 

But this is a bad way to do it, because it hard-codes an escape that is only needed on Windows (on Unix the separator does not need to be escaped). It is even worse to do what you have done, which is to double the file separator:

File.separator+File.separator

The right way to do it is to correctly escape the replacement side of the regular expression with Matcher.quoteReplacement(...):

System.out.println(Test.class.getName().replaceAll("\\.",
      Matcher.quoteReplacement(File.separator)) + ".class");
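As a rough sketch of why this is portable, the same call can be tried with both kinds of separator. The class ClassFilePath, the method toPath, and the class name "com.example.Test" are hypothetical; the separator is passed as a parameter only so that the Windows and Unix cases can both be run on one machine, whereas real code would pass File.separator.

```java
import java.io.File;
import java.util.regex.Matcher;

public class ClassFilePath {
    // quoteReplacement escapes any backslashes and dollar signs in the
    // separator, so it is taken literally by replaceAll.
    static String toPath(String className, String separator) {
        return className.replaceAll("\\.", Matcher.quoteReplacement(separator)) + ".class";
    }

    public static void main(String[] args) {
        System.out.println(toPath("com.example.Test", "\\")); // com\example\Test.class
        System.out.println(toPath("com.example.Test", "/"));  // com/example/Test.class
        // Output of this line depends on the platform it runs on.
        System.out.println(toPath("com.example.Test", File.separator));
    }
}
```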

Upvotes: 5

Sotirios Delimanolis

Reputation: 279970

Because File.separator is defined as

public static final char separatorChar = fs.getSeparator(); // obtained from system properties
public static final String separator = "" + separatorChar;

where separatorChar is the system specific char for file separation. You don't need to escape anything in this case.

Why doesn't it need four backslashes as in the previous case?

Escaping is for String literals.

What fs.getSeparator() does is (in simple terms)

System.getProperty("file.separator");

which on Windows returns the String \. getSeparator() then takes the charAt(0) of that String, which is the char '\'. That is converted to a String when concatenated with "" in

public static final String separator = "" + separatorChar;

This is done at runtime and therefore does not evaluate to a String literal and therefore needs no escaping.
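A small sketch of the same point: the separator held at runtime is a single character, and concatenating two such one-character strings produces the same two-character string that the literal "\\\\" denotes. The class name SeparatorLength and the helper doubled are illustrative only, and "\\" stands in for the Windows value of File.separator.

```java
import java.io.File;

public class SeparatorLength {
    // Mirrors File.separator + File.separator from the question.
    static String doubled(String sep) {
        return sep + sep;
    }

    public static void main(String[] args) {
        // File.separator is built as "" + separatorChar, so it is one char long.
        System.out.println(File.separator.length());       // 1
        // The literal "\\" is also one character once compiled.
        System.out.println("\\".length());                 // 1
        // Doubling it at runtime gives the same string as the literal "\\\\".
        System.out.println(doubled("\\").equals("\\\\"));  // true
    }
}
```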

Upvotes: 2

Eng.Fouad

Reputation: 117597

You should use replace(), as it takes plain text, while replaceAll() takes a regular expression:

.replace(".", "\\");

Regarding the file separator character, you can use / as it can work on all operating systems in Java.
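For illustration, here is a minimal sketch of the plain-text approach. It uses the char overload String.replace(char, char) rather than the String form shown above, and the names PlainReplace, toPath, and "com.example.Test" are made up. No regular expression is involved, so nothing has to be escaped beyond the Java literal itself.

```java
import java.io.File;

public class PlainReplace {
    // Replaces every '.' with the given separator character, literally.
    static String toPath(String className, char separator) {
        return className.replace('.', separator) + ".class";
    }

    public static void main(String[] args) {
        System.out.println(toPath("com.example.Test", '\\')); // com\example\Test.class
        System.out.println(toPath("com.example.Test", '/'));  // com/example/Test.class
        // Output of this line depends on the platform it runs on.
        System.out.println(toPath("com.example.Test", File.separatorChar));
    }
}
```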

Upvotes: 1

PurkkaKoodari

Reputation: 6809

The four backslashes are used to encode the two backslashes used by the method. "\\\\" is interpreted as:

"\\" (an escaped backslash)
"\\" (another escaped backslash)

The 1st and 3rd backslashes are for escaping the 2nd and the 4th backslash in the string. If the backslashes are stored in a variable such as File.separator, they are not necessary.

To see this more clearly, try this code:

System.out.println("\\\\");

It prints \\.

Upvotes: 3
