Eyelash

Reputation: 1890

C++ using libarchive and archive_write_open_memory... how to clear the buffer?

I'm writing a server that will compress files and send them over an http socket.

Unfortunately, they're not really files, they're more like database entries from a remote source.

I want to compress each entry in memory and then send them out over my http server, and each entry is potentially large, like 1GB each.

I receive data from the source in chunks, for instance 16mb (but could be any chunk size that makes sense).

Conceptually, this is what is happening, although this is a little bit of pseudo-code:

archive *_archive = archive_write_new();

//set to zip format
int r = archive_write_set_format( _archive, ARCHIVE_FORMAT_ZIP );
if (r != ARCHIVE_OK) return ERROR;
r = archive_write_add_filter( _archive, ARCHIVE_FILTER_NONE );
if (r != ARCHIVE_OK) return ERROR;

char *_archiveBuffer = (char*) malloc(8192);
size_t _used = 0;
r = archive_write_open_memory( _archive, _archiveBuffer, 8192, &_used );

if (r != ARCHIVE_OK) return ERROR;

archive_entry *_archiveEntry = archive_entry_new();

//fetch metadata about the object by id
QString id = "123456789";
QJsonObject metadata = database.fetchMetadata(id);
int size = metadata["size"].toInt();

//write the http header
httpd.writeHeader(size);

archive_entry_set_pathname( _archiveEntry, qPrintable("entries/" + id) );
archive_entry_set_size( _archiveEntry, size );
archive_entry_set_filetype( _archiveEntry, AE_IFREG );
//archive_entry_set_perm( _archiveEntry, ... );

archive_write_header( _archive, _archiveEntry );

int chunksize = 16777216;
for (int w = 0; w < size; w+=chunksize)
{
    QByteArray chunk = database.fetchChunk(id,chunksize);
    archive_write_data( _archive, chunk.data(), (size_t) chunk.size() );

    //accumulate data, then fetch compressed data from _archiveBuffer and write to httpd
    if (_used > 0)
    {
        httpd.writeData(_archiveBuffer);
        //clear archive buffer?
    }
}

archive_entry_free(_archiveEntry);
archive_write_close(_archive);

httpd.writeData(_archiveBuffer);

archive_write_free(_archive);

The question is: how do I know when data has been compressed into _archiveBuffer, and once it has, how can I read the buffer and then clear it, resetting the _used counter? I assume that if _used > 0, a compress/flush has happened.

Also, does the _archiveBuffer need to be greater than my chunksize?

It seems like I may need to use a callback, but it's unclear how to use archive_write_open with a callback and a memory buffer.

I can't seem to find examples online.

Any help would be appreciated!

Upvotes: 2

Views: 1935

Answers (1)

Eyelash

Reputation: 1890

The solution was much easier than I thought... just took a minute to realize it.

I'm sure it's obvious to those familiar with the library and streams.

Using callbacks was the answer. There's no need to open in memory at all; that just adds an extra layer without any benefit, since the library already manages its own buffering.

Depending on how your multithreading is configured, the callback executes whenever something interesting happens on the archive stream. For instance, you can write single bytes to the archive over and over, but a callback only fires once the internal buffer is saturated; at that moment you can write to the network (or wherever) from inside the callback. The void *client_data is key, because it links back to your main classes and API.

In my case I didn't want to write an http header until (archive) data was available, because an error during fetching might require a different http header.

When data is done, the close and free functions will also do their work with callbacks, so destructors need to happen after those callbacks complete.

Now the task is to multithread these requests... which seems simple now that I get the library.

If anyone is interested, I can post new pseudo-code.

Upvotes: 1
