Kris Bob
Kris Bob

Reputation: 11

Random numbers external sort

I need to write a program that generates N random numbers and write them to binary file in descending order. It should be done without using any of sorting algorithms that use main memory. This is what I've done so far:

#include <iostream>
#include <fstream> 
#include <ctime>
#include <cstdlib>

using namespace std;
int main () {
  srand(time(0));
  rand();
  int N;
  do{
    cout << "Unesite N: ";
    cin >> N;
    } while(N<=0);

  ofstream br("broj.dat", ios::binary | ios::trunc);

  for(int i = 0; i<N; i++){
    int a = rand();
    br.write((char *)&a, sizeof(a));
  }
  br.close();

  return 0;
}

So, I've generated random numbers and wrote them to binary file but I don't know how to sort it.

Upvotes: 1

Views: 393

Answers (3)

Max Lybbert
Max Lybbert

Reputation: 20039

The standard library has a merge sort, but you need to use random access iterators. If you can use mmap (or its equivalent), you have random access iterators (yes, I know that you need to take COUNT from the command line):

#include <algorithm>
#include <cstdio>
#include <random>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

const size_t COUNT = 4096 * 4096;

int main()
{
    // create file (using mmap for simplicity)
    int fd = open("out.dat", O_RDWR | O_TRUNC | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd < 0) {
        std::perror("open failed");
        return 1;
    }
    if (ftruncate(fd, COUNT * sizeof(unsigned)) != 0) {
        std::perror("ftruncate failed");
        close(fd);
        return 1;
    }
    void* mm = mmap(nullptr, COUNT * sizeof(unsigned), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mm == MAP_FAILED) {
        std::perror("mmap failed");
        close(fd);
        return 1;
    }
    close(fd);

    // populate file
    unsigned* begin = static_cast<unsigned*>(mm);
    std::default_random_engine rng((std::random_device())());
    std::generate_n(begin, COUNT, rng);
    msync(mm, COUNT * sizeof(unsigned), MS_SYNC);
    std::puts("file written");

    // sort file
    std::stable_sort(begin, begin + COUNT);
    msync(mm, COUNT * sizeof(unsigned), MS_SYNC);
    std::puts("file sorted");

    if (std::is_sorted(begin, begin + COUNT)) {
        std::puts("it's properly sorted");
    }

    // close file
    munmap(mm, COUNT * sizeof(unsigned));
    return 0;
}

The msync calls aren't actually needed. I'm honestly surprised that this has decent performance.

Upvotes: 0

Dave
Dave

Reputation: 9085

You can generate your numbers in sorted order in linear time. The paper describing how to do this is: Generating Sorted Lists of Random Numbers by Bentley & Saxe

https://pdfs.semanticscholar.org/2dbc/4e3f10b88832fcd5fb88d34b8fb0b0102000.pdf

/**
 * Generate an sorted list of random numbers sorted from 1 to 0, given the size
 * of the list being requested.
 * 
 * This is an implementation of an algorithm developed by Bentley and Sax, and
 * published in in ACM Transactions on Mathematical Software (v6, iss3, 1980) on
 * 'Generating Sorted Lists of Random Numbers'.
 */
public class SortedRandomDoubleGenerator {
    private long       valsFound;
    private double     curMax;
    private final long numVals;

    /**
     * Instantiate a generator of sorted random doubles.
     * 
     * @param numVals the size of the list of sorted random doubles to be
     *        generated
     */
    public SortedRandomDoubleGenerator(long numVals) {
        curMax = 1.0;
        valsFound = 0;
        this.numVals = numVals;
    }

    /**
     * @return the next random number, in descending order.
     */
    public double getNext() {
        curMax = curMax
                * Math.pow(Math.E, Math.log(RandomNumbers.nextDouble())
                        / (numVals - valsFound));
        valsFound++;
        return curMax;
    }
}

Upvotes: 6

btilly
btilly

Reputation: 46455

Here is pseudo-code for how I'd do it.

for i in 1..N:
    write rand() to new file
    push onto file stack (new file, size=1)
    while 2 < len(file stack) and size of top two files the same:
        pop top two and merge them
        push onto file stack (merged file, size=new size)

while 2 < len(file stack):
    pop top two and merge them
    push onto file stack (merged file, size=new size)

The top of the file stack is your new sorted file.

Upvotes: 0

Related Questions