shinzou
shinzou

Reputation: 6192

Flipping a file with a constant amount of use of file positioning functions

Write a function that flips all the bits in a file such that the last bit will be the first, and the first bit will the last and so on.

Note: only a constant amount of use of file positioning functions is allowed (such as fseek), i.e, the use of file positioning functions isn't dependent on the size of the file.

We have to flip each byte and each byte in the file. The function that flips a byte is pretty trivial, but I can't think of a way of flipping the bytes in the file without use of fseek. I can place the whole file a an array but that doesn't seem right, if the file is huge then it won't work.

Any hints or ideas please?

Upvotes: 0

Views: 62

Answers (3)

axiac
axiac

Reputation: 72216

Update: this answer does not use a constant number of repositioning in the output file.


This is a solution inspired by @Renzo's answer. In fact, it is a simplified version of that answer.

Using the same notations, let's say the input file is F1 and its size is B * M where B is the size of two buffers (b1 and b2) we have in memory and F2 is the output file.

The idea is simple: read a block of B bytes into buffer b1, flip its bits into b2 (can also flip them in place, if you like it more this way), write data from buffer b2 into F2 as many times it needs until it gets written in its final position in F2. Rewind F2 and repeat until all the blocks from F1 are processed. On each iteration, the content of buffer b2 is written to F2 one time less than on the previous iteration.

The (pseudo-)code

// Open the files
fopen(F1);
fopen(F2);

// Process the input in blocks of size B
for (i = 0; i < M; i ++) {
    // Read the next block of B bytes from F1 into b1
    fread(b1, B, F1);
    // Flip the bits from b1 into b2
    flip_the_bits(b1, b2);
    // Rewind F2, write b2 in it several times
    rewind(F2);
    // Write b2 in F2 once for each block not written yet
    // This overwrites the blocks written on the previous iteration
    // except for the last one
    for (j = i; j < M; j ++) {
        fwrite(b2, B, 1, F2);
    }
}
// Close the files
fclose(F1);
fclose(F2);

Remarks about the code

If the file size is not a multiple of B then M is rounded up to the nearest integer and the iteration when i == 0 and j == M-1 must take care to write only the needed bytes into the output file (only the tail of the b2 buffer).

There is no positioning in the input file (it is open, read head-to-tail and closed) but there are a lot of rewinds (M) on the output file and also a lot of data is rewritten. There are M*(M+1)/2 writes for M useful blocks of data. Rewinding is a repositioning.

In the extreme case when B == 1 and M == filesize(F1) the code has maximum efficiency on the used memory (two bytes for the two buffers or a single byte if the flip is done in place) but the worst performance on time (positioning and writes). More memory (as much as possible) makes it run as fast as possible.

A little history

The problem is probably 40 or 50 years old. The storage devices at that time were magnetic tapes and repositioning on magnetic tapes takes time. This is why the problem asks to use as few repositioning as possible. And this is why the C function that positions the pointer at the beginning of the file is named rewind().

As @Joachim notes in a comment on the question, we need to use fseek() once to go to the end of the file to find its size and another one to go back to the beginning of the file and start processing it.

Maybe this was the way to go back then when the magnetic tapes were the cutting edge storage technology, I don't know. The modern operating systems provide functions to find the size of a file without needing to fseek() to its end and back.

Upvotes: 2

Renzo
Renzo

Reputation: 27424

Suppose you have two buffers of size B bytes, and your files is of M*B bytes length. Then, if you can use an auxiliary file, you could use the following algorithm, given in pseudo-code, to solve your problem with a number of fseek equal to 0. Let's call F1 the original file, and F2 the auxiliary file, at each iteration the role of the two files is swapped:

for (i = 0; i<M; i++) {
  read a block of F1 in buffer1 and flip its bits
  for (j = 0; j < (M-i-1); j++) {
     read a block from F1 in buffer2
     write buffer2 to F2
  }
  write buffer1 to F2
  for (j = (M-i-1); j < M-1; j++) {
    read a block from F1 in buffer2
    write buffer2 to F2
  }
  exchange the role of F1 and F2
}

The final result is in F1.

The algorithm in practice reverses the bits of each block, and writes it in the "correct" position in the auxiliary file. The number of passes is equal to the number of blocks, so one should have two buffers as large as the memory can afford.

Of course the algorithm is inefficient, since it rewrites the entire file M times, but respects the given assignment.

Upvotes: 2

datenwolf
datenwolf

Reputation: 162164

Unless there's some obscure nonstandard function to reverse the direction in which the file pointer advances, I see no way around of reading the whole thing into a large buffer and go from there.

Now in case you're not limited to use stdio and are running on a system with large address space (something with more than, say 48 bits, like any modern 64 bit architecture) you could use a memory mapping instead of reading into a buffer. Actually a better name would be address space mapping (but memory map is what stuck): Instead of reading the file into a large chunk of allocated memory, you can as well tell the operating system: "Okay, here I've got this file, and I want its contents to appear in the process' address space". The OS will do it, and accessing such an address space location will actually read from the file in-situ (no intermediary buffer involved). A file memory map can also be used for writing.

See mmap for POSIX systems and CreateFileMapping+MapViewOfFile for Windows.

I don't know if that would be considered cheating though. Also I'm eager to see the official solution.

Upvotes: 1

Related Questions