Binary File Read Performance C++

I'm trying to read a large binary LAS file like this:

struct format
{
    double X;
    double Y;
    double Z;
    short red;
    short green;
    short blue;
    short alpha;
    unsigned long intensity;
    // etc.
};

std::ifstream stream;
Point3 GetPoint()
{
    format f;
    stream.seekg(offset);
    offset += sizeof(format);
    stream.read((char *)&f, sizeof(format));
    return Point3(f.X, f.Y, f.Z);
}

In the main function:

Point3* points = new Point3[count];
for (int i = 0; i < count; i++)
    points[i] = GetPoint();

This operation takes about 116 seconds for 18 million point records, but a LAS tool takes about 15 seconds to read the same data and start visualizing it.

How can it be 7 times faster than mine? Is it multithreading or something else? If my reading function is poorly written, how can it be 7 times slower?

I know a little about memory-mapped files. Loading the whole file into memory is very fast, but LAS files can be larger than 15 GB, which exceeds my physical memory, so the data would spill into virtual memory. Even if I had enough memory, I would still have to read the memory-mapped file in a loop.

Can someone help me with this?

Upvotes: 2

Views: 1734

Answers (2)

agbinfo

Reputation: 803

Since the file is being read sequentially, why call seekg at all? Try removing it.
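A minimal sketch of what that could look like, assuming the records sit back to back in the file. `Record` and `readSequential` are made-up names for illustration, and the three-double layout is a stand-in, not the real LAS point format:

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical stand-in for the question's record; the real layout differs.
struct Record {
    double x, y, z;
};

// Reads up to `count` records one after another. Each read() advances the
// stream position by itself, so no seekg() call is needed between records.
std::vector<Record> readSequential(std::istream& in, std::size_t count) {
    std::vector<Record> out;
    out.reserve(count);
    Record r;
    for (std::size_t i = 0; i < count; ++i) {
        // Stop at end of file or on a short read.
        if (!in.read(reinterpret_cast<char*>(&r), sizeof r)) break;
        out.push_back(r);
    }
    return out;
}
```

Whether this alone closes the gap depends on how the stream implementation handles seeks, so it is worth timing before and after.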

Some other things you can try:

  • Read the file in blocks (32K) and pass these to another thread (look up the producer/consumer pattern). The second thread (the consumer) can parse the blocks and fill the points array while the first thread (the producer) is waiting for I/O.
  • If Point3 defines a constructor, use a vector<> instead; this way you won't have to construct 'count' Point3 objects when you create the array.
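Here is a rough single-threaded sketch of the two ideas above: read about 32K at a time, then parse the block in memory into a reserved vector. The hand-off to a second thread is left out to keep it short. `kRecordSize`, `kRecordsPerBlock`, `readInBlocks`, and this `Point3` are assumptions for illustration; a real reader would take the record size from the LAS header.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <vector>

// Hypothetical Point3 with a constructor, as the bullet above assumes.
struct Point3 {
    double x, y, z;
    Point3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
};

// Assumed layout: each record is 40 bytes and starts with X, Y, Z as doubles.
constexpr std::size_t kRecordSize = 40;
constexpr std::size_t kRecordsPerBlock = 819;  // 819 * 40 bytes, just under 32K

std::vector<Point3> readInBlocks(std::istream& in, std::size_t count) {
    std::vector<Point3> points;
    points.reserve(count);  // allocates storage without constructing any Point3
    std::vector<char> block(kRecordSize * kRecordsPerBlock);

    while (points.size() < count) {
        std::size_t want = std::min(count - points.size(), kRecordsPerBlock);
        in.read(block.data(), static_cast<std::streamsize>(want * kRecordSize));
        // gcount() reports how many bytes the last read() actually delivered.
        std::size_t got = static_cast<std::size_t>(in.gcount()) / kRecordSize;
        for (std::size_t i = 0; i < got; ++i) {
            double xyz[3];
            std::memcpy(xyz, block.data() + i * kRecordSize, sizeof xyz);
            points.emplace_back(xyz[0], xyz[1], xyz[2]);
        }
        if (got < want) break;  // end of file or read error
    }
    return points;
}
```

With this shape, the parsing loop is the part that could later move to a consumer thread while the producer issues the next read().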

Also, how do you know that the LAS tool waits for the entire file to be read before rendering? Is it possible that it starts the rendering before the file is completely read in?

Upvotes: 2

Twifty

Reputation: 3378

Depending on your implementation, ifstream can be notoriously slow. On MS compilers, for example, it relies on <cstdio> for buffering, which means it calls into C functions for every byte that is read.

Also, are you sure you can just copy memory into your structure? Have you taken padding into account?
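To illustrate the padding concern, here is a sketch that decodes each field from a byte buffer at an explicit offset instead of copying the whole struct at once. It assumes the fields are stored back to back with a 4-byte intensity (36 bytes per record) and that the host byte order matches the file; `Format`, `kPackedSize`, `load`, and `decode` are illustrative names, and the offsets come from that assumption rather than from the LAS specification.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// The question's fields with fixed-width types. `unsigned long` is 4 bytes
// on MSVC but 8 bytes on 64-bit Linux, so it is avoided here.
struct Format {
    double x, y, z;
    std::int16_t red, green, blue, alpha;
    std::uint32_t intensity;
};

// Assumed on-disk size: 3*8 + 4*2 + 4 bytes with no padding.
constexpr std::size_t kPackedSize = 36;

// The in-memory struct may be larger (typically 40 bytes on 64-bit targets,
// because of tail padding), so sizeof(Format) is not a safe record stride.
static_assert(sizeof(Format) >= kPackedSize, "struct cannot hold all fields");

// Copies one value out of the buffer; memcpy avoids unaligned access.
template <typename T>
T load(const unsigned char* p, std::size_t offset) {
    T v;
    std::memcpy(&v, p + offset, sizeof v);
    return v;
}

Format decode(const unsigned char* p) {
    Format f;
    f.x = load<double>(p, 0);
    f.y = load<double>(p, 8);
    f.z = load<double>(p, 16);
    f.red = load<std::int16_t>(p, 24);
    f.green = load<std::int16_t>(p, 26);
    f.blue = load<std::int16_t>(p, 28);
    f.alpha = load<std::int16_t>(p, 30);
    f.intensity = load<std::uint32_t>(p, 32);
    return f;
}
```

Comparing sizeof(format) against the record length stored in the file header is a quick way to check whether a direct copy is safe on a given compiler.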

As your question states, memory-mapped files are a lot faster. You don't need to map the whole file; you can map a small part of it. Usually, mapping a part the same size as the system's page size is adequate.

Look into mmap.
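A minimal POSIX-only sketch of mapping just a window of the file (on Windows the equivalent calls are CreateFileMapping and MapViewOfFile). `readWindow` is a hypothetical helper; it copies the bytes out to keep the example short, whereas a real reader would probably parse them in place before unmapping. Note that the offset passed to mmap has to be a multiple of the page size, so the code rounds it down.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <vector>

// Returns `length` bytes starting at `offset`, or an empty vector on failure.
std::vector<char> readWindow(const char* path, std::size_t offset,
                             std::size_t length) {
    std::vector<char> out;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return out;

    struct stat st;
    if (length > 0 && fstat(fd, &st) == 0) {
        std::size_t size = static_cast<std::size_t>(st.st_size);
        if (offset < size && length <= size - offset) {
            // Round the offset down to a page boundary, as mmap requires.
            std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
            std::size_t aligned = offset - offset % page;
            std::size_t lead = offset - aligned;  // bytes to skip in the map

            void* p = mmap(nullptr, lead + length, PROT_READ, MAP_PRIVATE,
                           fd, static_cast<off_t>(aligned));
            if (p != MAP_FAILED) {
                const char* base = static_cast<const char*>(p) + lead;
                out.assign(base, base + length);
                munmap(p, lead + length);
            }
        }
    }
    close(fd);
    return out;
}
```

Sliding such a window across the file keeps the mapped region small, which is relevant when the file is larger than physical memory.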

Upvotes: 1
