humblebeast

Reputation: 303

Read and Write within a file in C (double it)

I am trying to read a file, count how many bytes it contains, round that up to the nearest GB, and then double the file size. Is there a way to read the file and then write all of this back into the same file?

Here is what I have so far. It creates a new file with the new contents, but I'm not sure my logic is correct.

Also, do you create a constant like BYTE with #define?

So far, as a test case, I have just used an int for the byte constant and set it to 50.

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include <time.h>

// #define BYTE 50

int main()
{
    FILE *fp1, *fp2;
    int ch1;
    clock_t elapsed;
    char fname1[40], fname2[40];
    char a;

    printf("Enter name of the file:");
    fgets(fname1, 40, stdin);
    while ( fname1[strlen(fname1) - 1] == '\n')
    {
        fname1[strlen(fname1) -1] = '\0';
    }

    fp1 = fopen(fname1, "r");
    if ( fp1 == NULL )
    {
        printf("Cannot open %s for reading\n", fname1 );
        exit(1);
    }

    printf("This program will round up the current file into highest GB, and then double it");

    elapsed = clock(); // get starting time

    ch1  =  getc(fp1); // read a value from each file

    int num = 50;

    int bytes = 0;

    while(1) // keep reading while values are equal or not equal; only end if it reaches the end of one of the files
    {
        ch1 = getc(fp1);

        bytes++;

        if (ch1 == EOF) // if either file reaches the end, then its over!
        {
            break; // if either value is EOF
        }
    }

    // 1,000,000,000 bytes in a GB 
    int nextInt = bytes%num;

    // example: 2.0GB 2,000,000,000 - 1.3GB 1,300,000,000 = 700,000,000 OR same thing as 2,000,000,000%1,300,000,000 = 700,000,000

    int counter = 0;

    printf("Enter name of the file you would like to create:");
    fgets(fname2, 40, stdin);
    while ( fname2[strlen(fname2) - 1] == '\n')
    {
        fname2[strlen(fname2) -1] = '\0';
    }

    fp2 = fopen(fname2, "w");
    if ( fp1 == NULL )
    {
        printf("Cannot open %s for reading\n", fname2);
        exit(1);
    }

    if(fp2 == NULL)
    {
     puts("Not able to open this file");
     fclose(fp1);
     exit(1);
    }

    while(counter != nextInt)
    {
     a = fgetc(fp1);
     fputc(a, fp2);
     counter++;
    }

    fclose(fp1); // close files
    fclose(fp2);

    printf("Total number of bytes in the file %u: ", bytes);
    printf("Round up the next GB %d: ", nextInt);

    elapsed = clock() - elapsed; // elapsed time
    printf("That took %.4f seconds\n", (float)elapsed/CLOCKS_PER_SEC);
    return 0;
}

Upvotes: 0

Views: 358

Answers (2)

Gene

Reputation: 47020

You're working way too hard. I'll assume your OS is Windows or Linux.

On Windows, _stat will get the exact length of a file. In Linux it's stat. Both will do this from file system information, so it's almost instantaneous.

On Windows, _chsize will extend the file to any number of bytes. On Linux it's ftruncate. The OS will be writing zeros to the extension, so it will be a fast write indeed.

In all cases it's simple to find the documentation by searching.

The code will be straight-line (no loops), about 10 lines.

Rounding up to the next GB (here taken as 2^30 bytes) is simply done with:

#define GIGA ((size_t)1 << 30)
size_t new_size = (old_size + GIGA - 1) & ~(GIGA - 1);
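
Put together for Linux, the steps above might look like the sketch below. It assumes a POSIX system with a 64-bit off_t. The name extend_file and its unit parameter are hypothetical; unit must be a power of two (pass (off_t)1 << 30 for the GIGA value above). It uses truncate(), the path-based sibling of ftruncate().

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: round the file's size up to a multiple of `unit`
 * (assumed to be a power of two), then double it. Returns the new size,
 * or -1 on error. Assumes a 64-bit off_t. */
off_t extend_file(const char *path, off_t unit)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;

    off_t rounded = (st.st_size + unit - 1) & ~(unit - 1);
    off_t new_size = rounded * 2;

    /* Growing a file with truncate() fills the new region with zeros. */
    if (truncate(path, new_size) != 0)
        return -1;
    return new_size;
}
```

With a unit of 2^30, a file of 1,300,000,000 bytes would be rounded up to 2 GiB and then extended to 4 GiB.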

Upvotes: 1

Jonathan Leffler

Reputation: 754880

You increment bytes before you check for EOF, so you have an off-by-one error.
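
A minimal sketch of a counting loop that increments only after a successful read, so the EOF return is never counted. The function name count_bytes is my own, and long long is used on the assumption that sizes may exceed what an int can hold.

```c
#include <stdio.h>

/* Count the bytes remaining in the stream. The counter is bumped only
 * after getc() has returned a real byte, so EOF itself is not counted. */
long long count_bytes(FILE *fp)
{
    long long bytes = 0;
    int ch;                     /* getc() returns int, not char */

    while ((ch = getc(fp)) != EOF)
        bytes++;
    return bytes;
}
```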

However, reading a file byte by byte is a slow way of finding its size. Using standard C, you may be able to seek to the end with fseek() and call ftell(), if you're on a 64-bit Unix-like machine. Otherwise, gigabyte-sized files are too close to the largest value that fits in 32 bits. Using a plain int for bytes is going to run into trouble.

Alternatively, and better, you can use stat() or fstat() to get the exact size directly.

When it comes to doubling the size of the file, you could simply seek to the new end position and write a byte at that position. However, that does not allocate all the disk space (on a Unix machine); it will be a sparse file.

On rewrite, you need to know how your system will handle two open file streams on a single file. On Unix-like systems, you can open the original file once for reading and once for writing in append mode. You could then read large chunks (64 KiB, 256 KiB?) of data at a time from the read file descriptor and write that to the write descriptor. However, you need to keep track of how much data to write because the read won't encounter EOF.
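
The two-stream idea could be sketched like this, assuming a Unix-like system where the same file can be opened twice. The name append_copy is hypothetical; the caller passes the original size (from stat(), say), and the loop counts down from it because the reader would otherwise keep finding the newly appended data.

```c
#include <stdio.h>

/* Hypothetical helper: append a copy of the first `size` bytes of the
 * file to its own end. Returns 0 on success, -1 on error. */
int append_copy(const char *path, long long size)
{
    char buf[64 * 1024];                 /* 64 KiB chunk */
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(path, "ab");       /* append mode: writes go to the end */
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    long long left = size;               /* bytes still to copy */
    while (left > 0) {
        size_t want = left < (long long)sizeof buf ? (size_t)left : sizeof buf;
        size_t got = fread(buf, 1, want, in);
        if (got == 0 || fwrite(buf, 1, got, out) != got)
            break;                       /* short read or write error */
        left -= (long long)got;
    }

    int rc = (left == 0) ? 0 : -1;
    fclose(in);
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}
```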

Your code is going to write a lot of 0xFF bytes to the tail of the file on most systems (where EOF is recorded as -1).
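
One way to avoid that, sketched with a hypothetical copy_bytes function: keep the return value of getc() in an int and stop as soon as it is EOF, rather than storing it in a char and writing it out.

```c
#include <stdio.h>

/* Copy at most `limit` bytes from in to out, stopping early at end of
 * file. Returns the number of bytes actually copied. */
long long copy_bytes(FILE *in, FILE *out, long long limit)
{
    long long copied = 0;
    int ch;   /* int, not char: EOF must stay distinguishable from 0xFF */

    while (copied < limit && (ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF)
            break;              /* write error */
        copied++;
    }
    return copied;
}
```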

Note that there are gibibytes, GiB (2^30 = 1,073,741,824 bytes), and gigabytes, GB (officially 10^9 = 1,000,000,000 bytes, but not infrequently used to mean GiB). See Wikipedia on Binary prefix, etc.
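
The distinction matters for the rounding arithmetic: a bit-mask round-up only works when the unit is a power of two, so it cannot be used with decimal gigabytes. A division-based sketch that handles either unit is below; the macro and function names are my own.

```c
#include <stdint.h>

#define GB  UINT64_C(1000000000)      /* decimal gigabyte, 10^9 bytes */
#define GIB (UINT64_C(1) << 30)       /* binary gibibyte, 2^30 bytes  */

/* Round size up to the next multiple of unit; works for any unit > 0. */
uint64_t round_up(uint64_t size, uint64_t unit)
{
    return (size + unit - 1) / unit * unit;
}
```

For a file of 1,300,000,000 bytes, round_up(size, GB) gives 2,000,000,000 while round_up(size, GIB) gives 2,147,483,648.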

Upvotes: 1
