Maverick283

Reputation: 1402

BufferedReader messed up by different line separators

I have a BufferedReader reading a file. There are two cases right now:

- It is reading a file generated on one PC; let's call it File1.
- It is reading a file generated on another computer; let's call it File2.

I'm assuming my problem is caused by the EOLs.

BufferedReader reads both files, but for File2 it returns an extra empty line after every line.

Also, when I compare a line using line.equalsIgnoreCase("abc"), it does not return true even though the line appears to be "abc".

Use this code together with the two files provided in the two links to replicate the problem:

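import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class JavaApplication {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException {
        File file = new File("C:/Users/User/Downloads/html (2).htm");
        // Decode the file as UTF-8 and print it line by line.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}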

File1, File2

Note how the second file prints an empty line after each line...

I've been searching and trying and searching and trying, and couldn't come up with a solution.

Any ideas how to fix that? (Especially the compare thing?)

Upvotes: 0

Views: 99

Answers (2)

Maverick283

Reputation: 1402

Joop was right. After some more research, it turns out that even though both files declare a UTF-16 encoding in their header, only one was actually encoded in UTF-16; the other (File1) was encoded in UTF-8. This led to the "double line effect". Thanks for the effort that was put into answering this question.
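
For anyone who wants to see the effect in isolation, here is a minimal sketch. It assumes the problematic file is UTF-16 (little-endian, no BOM) text with \r\n line endings that gets decoded as UTF-8, as in the question's code. The class name, the helper readLines and the sample text are made up for illustration. Every second byte decodes to a NUL character, so \r and \n are no longer adjacent and readLine() treats them as two separate terminators.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class EncodingMismatchDemo {

    // Decodes the bytes with the given charset and collects what readLine() returns.
    static List<String> readLines(byte[] bytes, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Sample content (an assumption, not the real file): two lines, Windows line endings.
        byte[] utf16 = "abc\r\nxyz\r\n".getBytes(StandardCharsets.UTF_16LE);

        List<String> wrong = readLines(utf16, StandardCharsets.UTF_8);
        List<String> right = readLines(utf16, StandardCharsets.UTF_16LE);

        System.out.println(wrong.size());                          // 5: extra lines holding only a NUL
        System.out.println(right.size());                          // 2
        System.out.println(wrong.get(0).equalsIgnoreCase("abc"));  // false: line is a, NUL, b, NUL, c, NUL
        System.out.println(right.get(0).equalsIgnoreCase("abc"));  // true
    }
}
```

The "empty" lines are not really empty; they contain a single NUL character, which most consoles print as nothing. The same stray NULs are why the comparison with "abc" fails. Passing the charset the file is actually encoded in to InputStreamReader makes both symptoms go away.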

Upvotes: 0

markspace

Reputation: 11030

Works for me.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class CRTest
{
   // Three lines terminated by a bare carriage return.
   static StringReader test = new StringReader( "Line 1\rLine 2\rLine 3\r" );
   public static void main(String[] args) throws IOException {
      BufferedReader buf = new BufferedReader( test );
      for( String line = null; (line = buf.readLine()) != null; )
         System.out.println( line );
   }
}

Prints:

run:
Line 1
Line 2
Line 3
BUILD SUCCESSFUL (total time: 1 second)

As Joop said, I think you've mixed up which file isn't working. Please use the above skeleton to create an MCVE and show us exactly what file input isn't working for you.
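
To make the comparison concrete, here is a small sketch of how readLine() treats the different terminators. The class name, the helper lines and the sample strings are made up for illustration. A lone \r, a lone \n and the pair \r\n each end exactly one line, whereas the reversed pair \n\r counts as two terminators and yields an extra empty line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class LineEndingDemo {

    // Collects every line that readLine() returns for the given text.
    static List<String> lines(String text) throws IOException {
        List<String> out = new ArrayList<>();
        try (BufferedReader buf = new BufferedReader(new StringReader(text))) {
            for (String line; (line = buf.readLine()) != null; ) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(lines("a\rb\r"));      // [a, b]
        System.out.println(lines("a\nb\n"));      // [a, b]
        System.out.println(lines("a\r\nb\r\n"));  // [a, b]
        System.out.println(lines("a\n\rb\n\r"));  // [a, , b, ]  (empty line after each line)
    }
}
```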


Since you appear to have a file with reversed line endings (\n\r instead of \r\n), here's my first attempt at a fix. Please test it; I haven't tried it yet. Wrap your InputStreamReader in this class, then wrap the BufferedReader around it as usual.

import java.io.IOException;
import java.io.Reader;

// Drops a '\r' that directly follows a '\n', so "\n\r" reads as a plain "\n".
class CRFix extends Reader
{

   private final Reader reader;
   private boolean readNL = false;   // true if the previous char was '\n'

   public CRFix( Reader reader ) {
      this.reader = reader;
   }

   @Override
   public int read( char[] cbuf, int off, int len )
           throws IOException
   {
      for( int i = off; i < off+len; i++ ) {
         int c = reader.read();
         if( c == '\r' && readNL )   // skip the '\r' of a "\n\r" pair
            c = reader.read();
         if( c == -1 )               // end of stream: report the chars read so far
            return ( i == off ) ? -1 : i-off;
         readNL = ( c == '\n' );
         cbuf[i] = (char)c;
      }
      return len;
   }

   @Override
   public void close()
           throws IOException
   {
      reader.close();
   }

}
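
A sketch of the wrapping order described above, in case it helps. To keep it runnable on its own it uses a hypothetical stand-in filter, DropCrAfterLf, built on FilterReader rather than CRFix itself, and a StringReader in place of the InputStreamReader; both are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class SwappedEolDemo {

    // Hypothetical stand-in for CRFix: drops a '\r' that directly follows a '\n'.
    static class DropCrAfterLf extends FilterReader {
        private boolean lastWasLf = false;

        DropCrAfterLf(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = in.read();
            if (c == '\r' && lastWasLf) {
                c = in.read();               // skip the '\r' of a "\n\r" pair
            }
            lastWasLf = (c == '\n');
            return c;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            int n = 0;
            while (n < len) {
                int c = read();
                if (c == -1) break;          // end of stream
                cbuf[off + n++] = (char) c;
            }
            return (n == 0 && len > 0) ? -1 : n;
        }
    }

    // Innermost: the source reader. Middle: the filter. Outermost: BufferedReader.
    static List<String> lines(Reader source) throws IOException {
        List<String> out = new ArrayList<>();
        try (BufferedReader buf = new BufferedReader(new DropCrAfterLf(source))) {
            for (String line; (line = buf.readLine()) != null; ) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for the InputStreamReader over the real file.
        System.out.println(lines(new StringReader("Line 1\n\rLine 2\n\r")));  // [Line 1, Line 2]
    }
}
```

The filter has to sit below the BufferedReader because readLine() is where the line splitting happens; once the stray \r is gone, each \n ends exactly one line.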

Upvotes: 1
