sp00m

Reputation: 48817

Reading a file's content before it has finished being copied/uploaded

Every 5 seconds (for example), a server checks whether files have been added to a specific directory. If so, it reads and processes them. The files in question can be quite big (100+ MB, for example), so copying/uploading them to that directory can take a while.

What if the server tries to access a file that hasn't finished being copied/uploaded? How does Java manage these concurrent accesses? Does it depend on the OS of the server?


I ran a test, copying a ~1300000-line TXT file (i.e. about 200 MB) from a remote server to my local computer; the copy takes about 5 seconds. While the copy is in progress, I run the following Java class:

public static void main(String[] args) throws Exception {

    String local = "C:\\large.txt";

    BufferedReader reader = new BufferedReader(new FileReader(local));
    int lines = 0;
    while (reader.readLine() != null)
        lines++;
    reader.close();

    System.out.println(lines + " lines");

}

I get the following exception:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
    at java.lang.StringBuffer.append(StringBuffer.java:306)
    at java.io.BufferedReader.readLine(BufferedReader.java:345)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)
    at main.Main.main(Main.java:15)

When I run the class after the file has finished being copied, I get the expected output (i.e. 1229761 lines), so the exception isn't caused by the size of the file (as one might first think). What is Java doing in the background that throws this OutOfMemoryError?

Upvotes: 3

Views: 682

Answers (2)

Andrey Taptunov

Reputation: 9495

How does Java manage these concurrent accesses? Does it depend on the OS of the server?

It depends on the specific OS. If the copy and the server run in a single JVM, the AsynchronousFileChannel class (new in 1.7) could be a great help. However, if the client and the server run in different JVMs (or, even more so, on different machines), the behavior is platform-specific.

From JavaDoc for AsynchronousFileChannel:

As with FileChannel, the view of a file provided by an instance of this class is guaranteed to be consistent with other views of the same file provided by other instances in the same program. The view provided by an instance of this class may or may not, however, be consistent with the views seen by other concurrently-running programs due to caching performed by the underlying operating system and delays induced by network-filesystem protocols. This is true regardless of the language in which these other programs are written, and whether they are running on the same machine or on some other machine. The exact nature of any such inconsistencies are system-dependent and are therefore unspecified.
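Since the behavior across programs is unspecified, one common workaround is to skip a file until it appears to have stopped changing. The following is only a minimal sketch of that idea, not something taken from the JavaDoc: the class name StableFileCheck, the method looksStable, and the wait interval are hypothetical, and the check is a heuristic (a stalled copy would look "stable" too).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StableFileCheck {

    /**
     * Returns true if the file's size and last-modified time did not change
     * while we waited. Heuristic only: it cannot prove the copy is complete.
     */
    static boolean looksStable(Path file, long waitMillis)
            throws IOException, InterruptedException {
        long sizeBefore = Files.size(file);
        long modifiedBefore = Files.getLastModifiedTime(file).toMillis();

        Thread.sleep(waitMillis);

        return Files.size(file) == sizeBefore
                && Files.getLastModifiedTime(file).toMillis() == modifiedBefore;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: pass the path of the file to check.
        Path file = Paths.get(args[0]);
        if (looksStable(file, 2000)) {
            System.out.println("File looks complete, safe to process");
        } else {
            System.out.println("File is still changing, retry on the next poll");
        }
    }
}
```

With a 5-second polling loop, the server could simply skip any file for which looksStable returns false and pick it up on a later pass. If the uploader is under your control, having it copy to a temporary name and rename the file once the copy is done is usually a more reliable signal.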

Upvotes: 1

JoeG

Reputation: 7642

Why are you using a buffered reader just to count the lines?

From the javadoc: Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.

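Note that the "buffer" itself has a fixed size; it does not hold the entire file in memory. Your stack trace shows the memory going into readLine(), which keeps appending characters until it reaches a line terminator, so a long stretch of data with no line terminator can exhaust the heap. Try a plain FileReader, reading fixed-size chunks and counting the newline characters.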
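As a rough sketch of the FileReader suggestion (the class name ChunkedLineCounter and the 8192-character chunk size are assumptions chosen for illustration), lines can be counted by reading fixed-size chunks and counting newline characters, so memory use stays bounded however long a single line is:

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

class ChunkedLineCounter {

    /** Counts '\n' characters using one fixed-size buffer. */
    static long countNewlines(String path) throws IOException {
        char[] buffer = new char[8192];
        long newlines = 0;

        // try-with-resources (Java 7+) closes the reader even on failure.
        try (Reader reader = new FileReader(path)) {
            int read;
            while ((read = reader.read(buffer)) != -1) {
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == '\n') {
                        newlines++;
                    }
                }
            }
        }
        return newlines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countNewlines(args[0]) + " newline characters");
    }
}
```

This counts newline characters rather than lines, so a final line without a trailing newline is not counted, unlike with readLine(). It also does nothing about the file being incomplete; it only avoids building one huge line in memory.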

Upvotes: 1
