Reputation: 1226
Java's Files.lines method reads all lines from a file as a Stream, breaking the file into lines at the following delimiters:
\u000D followed by \u000A, CARRIAGE RETURN followed by LINE FEED
\u000A, LINE FEED
\u000D, CARRIAGE RETURN
I have files that contain the odd occurrence of \u000D, CARRIAGE RETURN, which I do not want to treat as a new line, to be consistent with the way that grep (Windows) doesn't treat a single \u000D as a newline marker. I want to process the lines in the file as a stream, but is there a way I can get a stream that doesn't use a single \u000D as a newline marker, using just CR/LF or LF? I have to use Java 8.
My problem is that I am getting grep to return the line number with its matches, but because of the difference in EOL delimiters, Files.lines.skip(numLines) doesn't then align with the same line if I try to skip to the line number returned by grep.
Upvotes: 0
Views: 466
Reputation: 718768
Let's assume that you are doing byte-wise input ...
A scalable / efficient solution avoids holding the entire file in memory, and / or creating a string object for each line of input that you skip. This is one way to do it.
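File f = ...
InputStream is = new BufferedInputStream(new FileInputStream(f));
int lineCounter = 1;   // 1-based number of the line the stream is positioned in
int wantedLine = 42;
int b = 0;
while (lineCounter < wantedLine && b != -1) {
    // Consume bytes up to and including the next '\n'.
    // A lone '\r' is just another byte here, so it never ends a line.
    do {
        b = is.read();
        if (b == '\n') {
            lineCounter++;
        }
    } while (b != -1 && b != '\n');
}
if (lineCounter == wantedLine) {
    // do stuff
}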
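For what it's worth, the loop above can be wrapped into a small self-contained helper. This is only a sketch under some assumptions, not a definitive implementation: the class and method names (LineSkipper, skipToLine, readLine) are made up for illustration, the file is assumed to be in UTF-8 or another ASCII-compatible encoding (so the bytes 0x0A and 0x0D only ever mean LF and CR), and line numbers are 1-based as grep reports them. It uses only Java 8 APIs.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
final class LineSkipper {

    /**
     * Advances the stream to the start of the given 1-based line,
     * counting only '\n' bytes as line terminators.
     * Returns false if the input ends before that line is reached.
     */
    static boolean skipToLine(InputStream in, int wantedLine) throws IOException {
        int lineCounter = 1;
        int b;
        while (lineCounter < wantedLine && (b = in.read()) != -1) {
            if (b == '\n') {
                lineCounter++;
            }
        }
        return lineCounter == wantedLine;
    }

    /**
     * Returns the text of the given 1-based line without its LF or CRLF
     * terminator, or null if the file has fewer lines.
     * Assumes UTF-8 content (an assumption, adjust as needed).
     */
    static String readLine(Path file, int wantedLine) throws IOException {
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            if (!skipToLine(in, wantedLine)) {
                return null;
            }
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1 && b != '\n') {
                line.write(b);
            }
            if (b == -1 && line.size() == 0) {
                return null; // nothing after the last terminator
            }
            String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
            // Drop the CR of a CRLF pair; a lone CR inside the line is kept.
            return text.endsWith("\r") ? text.substring(0, text.length() - 1) : text;
        }
    }
}
```

Only the wanted line is turned into a String; the skipped lines are never materialized, which is the point of doing the skip byte-wise.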
Notes:
... ByteBuffer, but it makes the code more complicated. (If you are unfamiliar with the Buffer APIs.)
... BufferedReader.
... InputStream resource.
Upvotes: 1
Reputation:
Try this.
Stream.of(Files.readString(path).split("\r?\n"))
    .filter(...
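Note that Files.readString was added in Java 11, so the snippet above will not compile on Java 8 as the question requires. A possible Java 8 variant of the same idea is sketched below. The class and method names (CrLfLines, lines) are hypothetical, and UTF-8 is an assumed encoding. Like the original, it reads the whole file into memory before splitting, so it suits moderately sized files only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper, not part of any library.
final class CrLfLines {

    // Matches LF or CRLF only; a lone CR is not a line break.
    private static final Pattern EOL = Pattern.compile("\r?\n");

    /**
     * Returns the lines of the file, split on LF or CRLF only.
     * The whole file is loaded into memory first (Java 8 has no Files.readString).
     * Assumes UTF-8 content (an assumption, adjust as needed).
     */
    static Stream<String> lines(Path path) throws IOException {
        String content = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        return EOL.splitAsStream(content);
    }
}
```

With this, something like CrLfLines.lines(path).skip(n - 1).findFirst() should line up with line n as counted by grep, since a stray \u000D no longer starts a new line.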
Upvotes: 0