CaiNiaoCoder

Reputation: 3319

Can multiple threads write data into a file at the same time?

If you have ever used P2P download software, you know that it can download a file using multiple threads while creating only one file. So I wonder how the threads write data into that file: sequentially or in parallel?

Or imagine that you want to dump a big database table to a file: how would you make this job faster?

Upvotes: 40

Views: 65729

Answers (5)

Peter Lawrey

Reputation: 533510

You can have multiple threads writing to a file, e.g. a log file, but you have to coordinate your threads, as @Thilo points out. Either you synchronize file access and write only whole records/lines, or you need a strategy for allocating regions of the file to different threads, e.g. rebuilding a file with known offsets and sizes.

This is rarely done for performance reasons, because most disk subsystems perform best when written to sequentially and disk IO is usually the bottleneck. If the CPU time needed to create each record or line of text (or network IO) is the bottleneck, multiple threads can help.

"Imagine that you want to dump a big database table to a file, and how to make this job faster?"

Writing it sequentially is likely to be the fastest.
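
For the case where building each line is the expensive part, one possible arrangement is to format records on a thread pool and still write them from a single thread in their original order. The sketch below is only an illustration under assumptions: `ParallelFormatDump`, `format`, and the `int[]` row type are hypothetical stand-ins for a real table row and its serializer, and all pending lines are held in memory, which a real dump would avoid by working in batches.

```java
import java.io.BufferedWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelFormatDump {

    // Hypothetical CPU-bound step: turn one row into a comma-separated line.
    static String format(int[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(row[i]);
        }
        return sb.toString();
    }

    // Formats rows in parallel, but writes the file sequentially from the calling thread.
    static void dump(List<int[]> rows, Path target, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (BufferedWriter out = Files.newBufferedWriter(target)) {
            List<Future<String>> lines = new ArrayList<>();
            for (int[] row : rows) {
                lines.add(pool.submit(() -> format(row)));
            }
            // Consuming the futures in submission order keeps the output in row order.
            for (Future<String> line : lines) {
                out.write(line.get());
                out.newLine();
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Only the formatting runs concurrently here; the disk still sees one sequential stream of writes.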

Upvotes: 32

Padmakumar

Reputation: 39

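Declaring the method synchronized enables this. Try the code below, which I use in a similar context. Note that only appendContents is synchronized, so it serializes only the appends made through this class within a single JVM.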

package hrblib;

import java.io.*;

public class FileOp {

    static int nStatsCount = 0;

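    // Reads the whole file into a String, joining the lines with "\r\n".
    static public String getContents(String sFileName) {
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader oReader = new BufferedReader(new FileReader(sFileName))) {
            StringBuilder sContent = new StringBuilder();
            String sLine;
            boolean bFirst = true;
            while ((sLine = oReader.readLine()) != null) {
                // a flag replaces the original sContent=="" test, which compared references
                if (!bFirst) {
                    sContent.append("\r\n");
                }
                sContent.append(sLine);
                bFirst = false;
            }
            return sContent.toString();
        }
        catch (IOException oException) {
            throw new IllegalArgumentException("Invalid file path/File cannot be read: \n" + sFileName, oException);
        }
    }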
    // Overwrites the file. Not synchronized: concurrent callers can clobber each other.
    static public void setContents(String sFileName, String sContent) {
        try {
            File oFile = new File(sFileName);
            if (!oFile.exists()) {
                oFile.createNewFile();
            }
            if (oFile.canWrite()) {
                // try-with-resources closes the writer even if write() throws
                try (BufferedWriter oWriter = new BufferedWriter(new FileWriter(sFileName))) {
                    oWriter.write(sContent);
                }
            }
        }
        catch (IOException oException) {
            throw new IllegalArgumentException("Invalid folder path/File cannot be written: \n" + sFileName, oException);
        }
    }
    // Appends to the file. The static synchronized modifier locks on FileOp.class,
    // so only one thread at a time can be inside this method.
    public static synchronized void appendContents(String sFileName, String sContent) {
        try {
            File oFile = new File(sFileName);
            if (!oFile.exists()) {
                oFile.createNewFile();
            }
            if (oFile.canWrite()) {
                // second FileWriter argument 'true' opens the file in append mode
                try (BufferedWriter oWriter = new BufferedWriter(new FileWriter(sFileName, true))) {
                    oWriter.write(sContent);
                }
            }
        }
        catch (IOException oException) {
            throw new IllegalArgumentException("Error appending/File cannot be written: \n" + sFileName, oException);
        }
    }
}

Upvotes: 3

ern0

Reputation: 3172

What kind of file is this, and why do you need several threads to feed it? The answer depends on how the file is used.

Transferring a file from several places over the network (in short: Torrent-like)

If you are transferring an existing file, the program should do the following:

  • as soon as it learns the size of the file, create the file at that size with empty content: this prevents a later out-of-disk error (if there is not enough space, that shows up at creation, before anything is downloaded), and it also helps performance;
  • if you organize the transfer well (and why wouldn't you), each thread is responsible for a distinct portion of the file, so the file writes do not overlap;
  • even if two threads somehow pick the same portion of the file, it causes no error, because they write the same data to the same file positions.
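
The steps above can be sketched in Java roughly as follows. This is an illustration under assumptions, not how any particular client works: `RegionWriter` and `fetchChunk` are hypothetical names, and `fetchChunk` fabricates bytes locally instead of downloading them. Each task writes its own region with the positional `FileChannel.write(ByteBuffer, long)`, which does not move the channel's shared position. Be aware that on file systems with sparse files, `setLength` may not actually reserve the disk blocks.

```java
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RegionWriter {

    // Hypothetical stand-in for "download bytes [offset, offset + length) from a peer".
    static byte[] fetchChunk(long offset, int length) {
        byte[] data = new byte[length];
        for (int i = 0; i < length; i++) {
            data[i] = (byte) ((offset + i) % 251);
        }
        return data;
    }

    static void download(Path target, long fileSize, int chunkSize, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (RandomAccessFile raf = new RandomAccessFile(target.toFile(), "rw")) {
            raf.setLength(fileSize);               // create the file at its final size up front
            FileChannel channel = raf.getChannel();
            List<Future<?>> pending = new ArrayList<>();
            for (long offset = 0; offset < fileSize; offset += chunkSize) {
                final long start = offset;
                final int length = (int) Math.min(chunkSize, fileSize - start);
                pending.add(pool.submit(() -> {
                    ByteBuffer buf = ByteBuffer.wrap(fetchChunk(start, length));
                    long pos = start;
                    while (buf.hasRemaining()) {   // a single write may be partial
                        pos += channel.write(buf, pos);
                    }
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get();                           // wait for all regions, surface any failure
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Because every task owns a disjoint byte range, no locking is needed in this sketch.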

Appending data blocks to a file (in short: logging)

If the threads just append fixed- or variable-length records to a file, you should use a single shared writer thread. It should use a relatively large write buffer, so that it can serve client threads quickly (just taking their strings) and flush the buffer with good scheduling and block size. Under heavy load it can write to a dedicated disk or even run on a dedicated computer.

There can also be several performance issues, which is why logging servers exist, even expensive commercial ones.
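
A minimal sketch of the single-writer idea in Java, assuming an unbounded in-memory queue is acceptable: client threads only enqueue strings, and one thread owns the file. `QueueLogger` and the 64K-character buffer size are illustrative assumptions, not a recommendation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueLogger implements AutoCloseable {
    // Unique sentinel object, compared by identity so a logged "STOP" string is not mistaken for it.
    private static final String STOP = new String("STOP");
    private static final int BUFFER_CHARS = 1 << 16;   // assumed "relatively large" buffer

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writer;
    private volatile IOException failure;

    QueueLogger(Path file) {
        writer = new Thread(() -> drain(file), "log-writer");
        writer.start();
    }

    // Called by client threads: just hands the string over, no disk IO here.
    void log(String line) {
        queue.add(line);
    }

    // Runs on the single writer thread, which is the only one touching the file.
    private void drain(Path file) {
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                Files.newOutputStream(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND),
                StandardCharsets.UTF_8), BUFFER_CHARS)) {
            while (true) {
                String line = queue.take();
                if (line == STOP) {
                    break;
                }
                out.write(line);
                out.newLine();
            }
        } catch (IOException e) {
            failure = e;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Drains everything queued so far, then closes the file.
    @Override
    public void close() throws IOException {
        queue.add(STOP);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while flushing log", e);
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

Lines from one client thread keep their relative order; lines from different threads are interleaved in whatever order they reach the queue.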

Reading and writing at random times and random positions (in short: database)

This requires a complex design, with mutexes etc. I have never done this kind of thing, but I can imagine it. Ask Oracle for some tricks :)

Upvotes: 1

WeMakeSoftware

Reputation: 9162

The Java NIO package was designed to allow this. Take a look, for example, at FileChannel: http://docs.oracle.com/javase/1.5.0/docs/api/java/nio/channels/FileChannel.html .

You can map several regions of one file to different buffers, and each buffer can then be filled by a separate thread.
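
As a rough sketch of that approach, assuming equally sized regions: `FileChannel.map` returns one `MappedByteBuffer` per region, and each thread fills only its own buffer. `MappedRegions`, the region sizes, and the fill pattern (a different letter per region) are made up for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

class MappedRegions {

    // Maps 'regions' consecutive blocks of 'regionSize' bytes and fills each on its own thread.
    static void fill(Path file, int regions, int regionSize) throws IOException, InterruptedException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel channel = raf.getChannel()) {
            raf.setLength((long) regions * regionSize);   // size the file before mapping it
            Thread[] workers = new Thread[regions];
            MappedByteBuffer[] maps = new MappedByteBuffer[regions];
            for (int r = 0; r < regions; r++) {
                maps[r] = channel.map(FileChannel.MapMode.READ_WRITE, (long) r * regionSize, regionSize);
                final MappedByteBuffer buf = maps[r];
                final byte value = (byte) ('A' + r);      // illustrative payload
                workers[r] = new Thread(() -> {
                    while (buf.hasRemaining()) {
                        buf.put(value);
                    }
                });
                workers[r].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
            for (MappedByteBuffer map : maps) {
                map.force();                              // ask the OS to write the pages to disk
            }
        }
    }
}
```

Each buffer has its own position and limit, so the threads share no state in this sketch; whether mapping actually speeds things up depends on the workload and the platform.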

Upvotes: 19

gkamal

Reputation: 21000

You can have multiple threads write to the same file, but only one at a time. All threads need to enter a synchronized block before writing to the file.

In the P2P example, one way to implement it is to find the size of the file and create an empty file of that size. Each thread downloads a different section of the file; when a thread needs to write, it enters a synchronized block, moves the file pointer using seek, and writes the contents of its buffer.
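
A minimal sketch of this seek-then-write pattern, assuming all threads share one `RandomAccessFile`; the wrapper class `SeekingWriter` and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class SeekingWriter implements AutoCloseable {
    private final RandomAccessFile file;

    // Opens (or creates) the file and sets it to its final size.
    SeekingWriter(String path, long size) throws IOException {
        file = new RandomAccessFile(path, "rw");
        file.setLength(size);
    }

    // seek and write must happen under the same lock; otherwise another thread
    // could move the shared file pointer between the two calls.
    void writeAt(long offset, byte[] data) throws IOException {
        synchronized (file) {
            file.seek(offset);
            file.write(data);
        }
    }

    @Override
    public void close() throws IOException {
        synchronized (file) {
            file.close();
        }
    }
}
```

Download threads would call `writeAt(sectionOffset, buffer)` whenever a buffer is full; the writes are serialized, but fetching the data still overlaps.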

Upvotes: 1
