user1979760

Reputation: 43

Reading & writing Files - Java empty spaces

I want to read from a file and write to a file. The input file is as follows:

<ORLANDO>   <0%>
    As I remember, Adam, it was upon this fashion bequeathed me by will but poor a thousand crowns, and, as thou sayest,
<ORLANDO>

The output file looks like this:

"A s   I   r e m e m b e r    A d a m    i t   w a s   u p o n   t h i s   f a s h i o n   b e q u e a t h e d   m e   b y   w i l l   b u t   p o o r   a   t h o u s a n d   c r o w n s    a n d    a s   t h o u   s a y e s t    c h a r g e d   m y   b r o t h e r   o n  ..."

I have written a Java program to remove lines that contain tags and to replace any punctuation with spaces. However, every letter in the output has a space after it, and there are many blank lines between the lines. How can I remove them? Please help.

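String line = null;
try {
    BufferedReader br = new BufferedReader(new FileReader("filename"));
    PrintWriter writer = new PrintWriter(new FileWriter("filename"));
    try {
        while ((line = br.readLine()) != null) {
            // Skip any line that contains a tag
            if (!line.contains("<")) {
                // Strip punctuation from the remaining lines
                line = line.replaceAll("\\p{Punct}", "");
                writer.println(line);
                writer.flush();
            }
        }
    } finally {
        br.close();
        writer.close();
    }
} catch (IOException e) {
    e.printStackTrace();
}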

Upvotes: 0

Views: 2330

Answers (2)

Peter Lawrey

Reputation: 533530

When you open a file for writing with FileWriter or PrintWriter, it is truncated by default. You can open it in append mode instead, but either way you cannot rewrite a file this way while you are still reading from it.

Instead, you should create a new file and write to that. When you have finished, you can delete the original and rename the copy (or delete the copy if it is exactly the same).
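
As a rough sketch of that write-then-rename approach (not the asker's exact program): the class name RewriteInPlace and the method stripTagLines below are made up for illustration, and the filter simply mirrors the one in the question. It assumes the input is UTF-8 and that the temporary file can be created next to the original.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RewriteInPlace {

    // Hypothetical helper: filters `source` into a temporary file, then swaps it in.
    static void stripTagLines(Path source) throws IOException {
        // Create the temporary file in the same directory as the original,
        // so the final move does not have to cross file systems.
        Path dir = source.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "rewrite", ".tmp");

        try (BufferedReader br = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             PrintWriter pw = new PrintWriter(Files.newBufferedWriter(tmp, StandardCharsets.UTF_8))) {
            for (String line; (line = br.readLine()) != null; ) {
                // Same filter as in the question: drop tag lines, strip punctuation.
                if (!line.contains("<")) {
                    pw.println(line.replaceAll("\\p{Punct}", ""));
                }
            }
        }

        // Both streams are closed at this point; only now is the original replaced.
        Files.move(tmp, source, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Calling RewriteInPlace.stripTagLines(Paths.get("filename")) would then leave "filename" holding the filtered text. If an exception is thrown part-way through, this sketch leaves the temporary file behind; real code may want to delete it.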

"But each letter that is written out has a space in between and also in between lines lots of blank lines are present."

This would happen if the text was written as UTF-16 but read as ASCII or UTF-8. The way to avoid this is to not use UTF-16, which is not the default.

try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("filename"), StandardCharsets.UTF_8));
     PrintWriter pw = new PrintWriter(new OutputStreamWriter(new FileOutputStream("filename.tmp"), StandardCharsets.UTF_8))) {
    for(String line; (line = br.readLine())!=null;) {
        pw.println(line.replaceAll("<[^>]+>", ""));
    }
}
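
To see why a UTF-16/UTF-8 mix-up would produce the "spaced out" letters, here is a small illustrative sketch. The class EncodingMismatchDemo and the method misread are hypothetical names, and the choice of little-endian UTF-16 without a byte-order mark is an assumption made to keep the demo deterministic; it is not known which variant the asker's file actually used.

```java
import java.nio.charset.StandardCharsets;

final class EncodingMismatchDemo {

    // Encodes the text as UTF-16 (little-endian, no BOM: an assumption for this demo)
    // and then decodes those bytes as if they were UTF-8.
    static String misread(String text) {
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16LE);
        return new String(utf16, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = misread("As I");
        // Every ASCII character is now followed by a NUL character (U+0000),
        // so the string is twice as long as the original.
        System.out.println(garbled.length()); // 8
        // Many editors show NUL as a blank; replacing it with a space imitates that.
        System.out.println(garbled.replace('\u0000', ' '));
    }
}
```

With the NULs shown as spaces, "As I" comes out as "A s   I " (one blank after each letter, three between words), which matches the pattern in the question's output. The line terminators are widened in the same way, which may be why some viewers show extra blank lines.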

Upvotes: 5

Uwe Plonus

Reputation: 9954

Are you opening the written file with the correct encoding? It looks like you write UTF-8 and open it with ASCII or an ISO-8859 encoding.

Upvotes: 0
