yosmo78

Reputation: 619

Figure out number of bytes in input buffer

I am trying to write a program that reads from stdin, where a file is being redirected to stdin.

For example, my program is called scan, so the call on the command line will be:

./scan < file.txt

I want to allocate one big memory block for it, pointed to by a char*. I can't just take the file name as input, since it is a requirement that I have to deal with. I was wondering if it is possible to figure out the number of bytes sitting in the input buffer, so that I can do a bulk read of stdin all in one go.

So something like

char* read_all_stdin()
{
    size_t amt = num_of_bytes_in_stdin(); //how do this?
    char* file = (char*) malloc(amt+1);
    fread(file,1,amt,stdin); //idk if this is allowed either
    file[amt] = '\0';
    return file;
}

Upvotes: 1

Views: 1315

Answers (3)

phuclv

Reputation: 41794

If the input is redirected from a file, then on Linux you can get the name of that file by reading the symbolic link /proc/self/fd/0 or /dev/fd/0:

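#include <unistd.h>   /* readlink */

char filename[BUFSIZE];
ssize_t sz = readlink("/proc/self/fd/0", filename, BUFSIZE - 1);
if (sz >= 0) {                /* readlink returns -1 on failure */
    filename[sz] = '\0';      /* readlink does not NUL-terminate */
    puts(filename);
}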

On AIX you can read /proc/<PID>/fd/0 instead.

On Windows you'll need to use GetFinalPathNameByHandle() like this:

GetFinalPathNameByHandle(GetStdHandle(STD_INPUT_HANDLE), chPath, MAX_PATH, 0)

On macOS, and on BSDs that provide it, you can use fcntl with F_GETPATH:

#include <fcntl.h>

char filename[BUFSIZE];   /* macOS expects at least MAXPATHLEN bytes */
if (fcntl(0, F_GETPATH, filename) != -1)
    puts(filename);

It may not be possible on other platforms.

If stdin is a pipe, then you can't know the size: the OS doesn't wait for the writing process to pump all of its data into the pipe before passing data to the consuming process.
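For the redirected-file case, the size can also be queried from the descriptor itself, without recovering the file name first. The following is a minimal sketch, assuming a POSIX system; `bytes_remaining` is a hypothetical helper name, not something from the answer above. It reports -1 for pipes and terminals, which matches the point that their size cannot be known in advance.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the answer): returns the number of bytes
 * between the current offset of fd and the end of the file, or -1 if fd
 * is not a regular file (pipe, terminal, socket) or a call fails. */
off_t bytes_remaining(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode))
        return -1;

    off_t pos = lseek(fd, 0, SEEK_CUR);   /* current read offset */
    if (pos < 0)
        return -1;

    /* Snapshot only: the file can still grow or shrink afterwards. */
    return st.st_size > pos ? st.st_size - pos : 0;
}
```

A caller would pass descriptor 0 (for example `bytes_remaining(0)`) before doing any stdio reads on stdin, since stdio may already have pulled data into its own buffer. The result is best treated as a hint for the initial allocation, with the actual read still checking how many bytes arrived.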

Upvotes: 1

Brendan

Reputation: 37232

I was wondering if it is possible to figure out the number of bytes sitting in the input buffer, so that I can do a bulk read of stdin all in one go.

If you could determine the number of bytes in the input buffer, then it'd create an unavoidable race condition - new bytes/characters can be added to the input buffer after you've determined how many bytes there are but before you've used that value for anything.

The consequence of the unavoidable race condition is "No, in practice it is not possible to ensure that you can do a bulk read of stdin all in one go".

One alternative would be to increase (double?) the size of the allocated memory whenever fread() says it filled the previously allocated memory, and retry (e.g. using a loop and realloc()) until fread() can no longer fill the allocated memory. However, fread() is blocking (if you ask for 1024 bytes and there are only 10 bytes, it will wait for the other 1014 bytes to arrive, or for end-of-file or an error), so you'd have to fix that by changing stdin to non-blocking. Sadly this is platform specific (e.g. something like flags = fcntl(0, F_GETFL, 0); flags |= O_NONBLOCK; fcntl(0, F_SETFL, flags); may work on Linux but not on Windows), so you end up with a big complicated mess.
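The grow-and-retry loop described above might look roughly like this. It is only a sketch under the assumption that blocking until end-of-file is acceptable, which holds for `./scan < file.txt`; it leaves stdin in blocking mode and does not attempt the non-blocking variant. The name `read_all_doubling` and the 4096-byte starting capacity are illustrative choices.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads `in` until EOF into a NUL-terminated heap
 * buffer, doubling the capacity whenever it fills up. Stores the byte
 * count in *out_len and returns NULL on read or allocation failure. */
char *read_all_doubling(FILE *in, size_t *out_len)
{
    size_t cap = 4096, len = 0;          /* starting capacity is arbitrary */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Keep one byte in reserve for the terminating NUL. */
        size_t want = cap - len - 1;
        size_t got = fread(buf + len, 1, want, in);
        len += got;
        if (got < want)                  /* short count: EOF or error */
            break;

        if (cap > (size_t)-1 / 2) {      /* doubling would overflow */
            free(buf);
            return NULL;
        }
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }

    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

Calling `read_all_doubling(stdin, &len)` returns the whole input as one block that the caller later releases with `free()`. Doubling keeps the number of realloc() calls logarithmic in the input size, at the cost of up to roughly twice the memory actually needed.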

Upvotes: 1

Schwern

Reputation: 164809

size_t amt = num_of_bytes_in_stdin(); //how do this?

You could maybe mess around with setvbuf, but AFAIK you can't. Stdin might not be buffered. The stream might contain more than one buffer full. Someone else might have changed how it's buffered. More might have been added between your checking, allocating, and reading.

The fundamental nature of I/O is you cannot know what or how much you're going to get.

Instead, allocate a single large buffer to read into, probably of size BUFSIZ. Reuse that buffer for every read from the stream, then copy what you read into more appropriately sized memory.
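One way to read that suggestion, sketched under the assumption that the whole input should end up in a single NUL-terminated block: a fixed BUFSIZ-byte chunk is reused for every fread(), and each chunk is copied into heap memory that is grown to fit exactly what has arrived. The function name `read_all_chunked` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads `in` through one reusable BUFSIZ-byte chunk
 * and appends each chunk to a heap buffer sized to fit the data so far.
 * Stores the byte count in *out_len; returns NULL on failure. */
char *read_all_chunked(FILE *in, size_t *out_len)
{
    char chunk[BUFSIZ];                  /* reused for every read */
    char *data = NULL;
    size_t len = 0, got;

    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
        /* Grow to exactly the bytes seen so far, plus the NUL. */
        char *grown = realloc(data, len + got + 1);
        if (grown == NULL) {
            free(data);
            return NULL;
        }
        data = grown;
        memcpy(data + len, chunk, got);
        len += got;
    }
    if (ferror(in)) {
        free(data);
        return NULL;
    }

    if (data == NULL) {                  /* empty input: no chunk was read */
        data = malloc(1);
        if (data == NULL)
            return NULL;
    }
    data[len] = '\0';
    *out_len = len;
    return data;
}
```

This version never needs to know the input size up front and wastes no capacity, but it calls realloc() once per chunk, so for very large inputs a geometric growth policy may be the better trade-off.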

Upvotes: 0
