Reputation: 33
I have a bunch of .txt files I am trying to read, but many of them will not read. The ones that fail appear to start with a blank line before the text. For example, the following throws a NoSuchElementException:
public static void main(String[] args) throws FileNotFoundException {
    Scanner input = new Scanner(new File("documentSets/med_doc_set/bmu409.shtml.txt"));
    System.out.println(input.next());
}
where the text file being read begins with a blank line and then some text. I've also tried using input.skip("[\\s]*") to skip any leading whitespace, but it throws the same error. Is there some way to fix this?
EDIT: The file is hosted on Google Docs. If you download it and open it in a text editor, you can see the empty line it starts with.
Upvotes: 1
Views: 2460
Reputation: 108879
The Scanner type is weirdly inconsistent when it comes to handling input. It swallows I/O exceptions, so it is lax about reporting errors; consumers should test for these explicitly. But the type is strict when decoding character data: incorrectly encoded text or use of the wrong encoding will cause an IOException to be raised, which the type promptly swallows.
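A minimal sketch of that behaviour, assuming (the question does not confirm this) that the failing files are stored in an encoding other than the one the Scanner decodes with. The class name, the helper methods and the sample text are all hypothetical; the sample is written as UTF-16 and then read back once as UTF-8 and once as UTF-16.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Scanner;

public class SwallowedDecodingErrorDemo {

    // Writes a blank line followed by text to a temporary file, encoded as UTF-16.
    // (Hypothetical sample data standing in for one of the problem files.)
    static File writeUtf16Sample() {
        try {
            File file = File.createTempFile("scanner-demo", ".txt");
            file.deleteOnExit();
            Files.write(file.toPath(), "\nhello world".getBytes(StandardCharsets.UTF_16));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reports what a Scanner sees when decoding the file with the given charset:
    // the first token, or the exception the Scanner swallowed while reading.
    static String probe(File file, String charsetName) {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            if (scanner.hasNext()) {
                return "token: " + scanner.next();
            }
            IOException swallowed = scanner.ioException();
            return swallowed == null
                    ? "empty input"
                    : "swallowed: " + swallowed.getClass().getSimpleName();
        } catch (IOException e) {
            // Only the constructor can throw here (FileNotFoundException).
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = writeUtf16Sample();
        System.out.println(probe(file, "UTF-8"));   // wrong charset: no token, swallowed exception
        System.out.println(probe(file, "UTF-16"));  // matching charset: first token after the blank line
    }
}
```

With the wrong charset, hasNext() simply returns false, which is why an unguarded next() fails with NoSuchElementException instead of reporting the decoding problem. The IOException is rethrown as UncheckedIOException here only to keep the sketch short.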
This code reads all lines in a text file with error checking:
public static List<String> readAllLines(File file, Charset encoding)
        throws IOException {
    List<String> lines = new ArrayList<>();
    try (Scanner scanner = new Scanner(file, encoding.name())) {
        while (scanner.hasNextLine()) {
            lines.add(scanner.nextLine());
        }
        if (scanner.ioException() != null) {
            throw scanner.ioException();
        }
    }
    return lines;
}
This code reads the lines and converts byte sequences the decoder cannot decode to the replacement character U+FFFD (often displayed as a question mark):
public static List<String> readAllLinesSloppy(File file, Charset encoding)
        throws IOException {
    List<String> lines = new ArrayList<>();
    try (InputStream in = new FileInputStream(file);
         Reader reader = new InputStreamReader(in, encoding);
         Scanner scanner = new Scanner(reader)) {
        while (scanner.hasNextLine()) {
            lines.add(scanner.nextLine());
        }
        if (scanner.ioException() != null) {
            throw scanner.ioException();
        }
    }
    return lines;
}
Both of these methods require you to provide the encoding explicitly rather than relying on the default encoding, which is frequently not Unicode (see also the standard constants in java.nio.charset.StandardCharsets).
Code is Java 7 syntax and is untested.
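To illustrate the lenient path, here is a small sketch that feeds ISO-8859-1 bytes through an InputStreamReader declared as UTF-8. The class name, method name and sample text are made up for the example, and it reads from an in-memory byte array rather than a file to stay self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class ReplacementDemo {

    // Decodes the bytes through an InputStreamReader, which substitutes undecodable
    // input instead of failing, and returns the first line (or null if there is none).
    static String firstLineSloppy(byte[] bytes, Charset encoding) {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), encoding);
        try (Scanner scanner = new Scanner(reader)) {
            return scanner.hasNextLine() ? scanner.nextLine() : null;
        }
    }

    public static void main(String[] args) {
        // 0xE9 is 'é' in ISO-8859-1 but is not a valid sequence on its own in UTF-8.
        byte[] latin1 = "caf\u00e9 au lait".getBytes(StandardCharsets.ISO_8859_1);
        String line = firstLineSloppy(latin1, StandardCharsets.UTF_8);
        // The undecodable byte comes back as U+FFFD; print its code point to make that visible.
        System.out.println(line + " -> U+" + Integer.toHexString(line.charAt(3)).toUpperCase());
    }
}
```

The text stays readable, but the original character is lost, so this approach suits cases where a damaged character matters less than getting the rest of the file.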
Upvotes: 3
Reputation: 3780
It starts with a blank line, and your code only reads once. Change it to loop over every line:
public static void main(String[] args) throws FileNotFoundException {
    Scanner input = new Scanner(new File("documentSets/med_doc_set/bmu409.shtml.txt"));
    while (input.hasNextLine()) {
        System.out.println(input.nextLine());
    }
}
Upvotes: 1
Reputation: 533500
Scanner reads all the words or numbers up to the end of the line. At that point you need to call nextLine(). To avoid an exception, call one of the hasNextXxx() methods first to determine whether a value of that type can be read.
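A short sketch of guarding each read with the matching hasNextXxx() call. The class and method names are invented for the example, and it scans a String rather than a file. It also shows that, with the default whitespace delimiter, next() skips a leading blank line on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class GuardedScannerDemo {

    // Reads every token, checking with hasNext()/hasNextInt() before each read
    // so that no NoSuchElementException or InputMismatchException is thrown.
    static List<String> describeTokens(String text) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    tokens.add("int:" + scanner.nextInt());
                } else {
                    tokens.add("word:" + scanner.next());
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Leading blank line and spaces are treated as delimiters and skipped.
        System.out.println(describeTokens("\n  hello 42\n")); // prints [word:hello, int:42]
        System.out.println(describeTokens(""));               // prints []
    }
}
```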
Upvotes: 0