Reputation: 193
I am very new to C and trying to implement a copy_buffer
function to allocate a new buffer, copy the contents from an existing buffer into the new buffer, and return the new buffer.
I am trying to use malloc()
and memset()
, and I understand I need to malloc
twice: one for the struct and one for the data. Do I need to memset
twice too? Please point out where I wrote wrong for copy_buffer
function.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Structure to hold a buffer of binary data
// Note that we expect that it could hold null(0) characters,
// so we can't rely on strlen to tell us the size or strcpy to copy the data
typedef struct BufferStruct {
char *data;
int size;
} Buffer;
Buffer *copy_buffer(Buffer *buffer) {
Buffer *new_buffer = (Buffer*)malloc(sizeof(Buffer));
memset(new_buffer, 0, sizeof(Buffer));
new_buffer->data = malloc(sizeof(char));
memset(new_buffer->data, 0, sizeof(char));
for (int i = 0; i < buffer->size; ++i) {
new_buffer->data[i] = buffer->data[i];
}
new_buffer->size = buffer->size;
return new_buffer;
}
Buffer *example1() {
Buffer *buffer = (Buffer*)malloc(sizeof(Buffer));
buffer->data = "hello world\nthis is a string";
buffer->size = strlen(buffer->data);
return buffer;
}
// Example buffer storing 3 totally different strings in the same buffer (note the '\0')
Buffer *example2() {
Buffer *buffer = (Buffer*)malloc(sizeof(Buffer));
buffer->data = "this string has null\0characters\0 in the middle, beware!";
buffer->size = 55;
return buffer;
}
//
// Use fopen to create a file for writing
// Then fwrite to write the buffer to a file
//
void write_buffer(const char *filename, Buffer *buffer) {
FILE *file = fopen(filename, "w");
fwrite(buffer->data, 1, buffer->size, file);
fclose(file);
}
int main() {
Buffer *example = example2();
Buffer *copied = copy_buffer(example);
write_buffer("example1.bin", copied);
return 0;
}
Upvotes: 6
Views: 18754
Reputation: 744
Do I need to memset twice too?
Technically no, provided you properly initialize all elements of the Buffer
struct before using them.
I feel this is a risky habit, however. It's very difficult to be consistent, and in some contexts the program can crash if you make a mistake. Many developers seem to recommend initializing or zero-initializing variables, particularly data buffers, but opinions (and circumstances) vary.
You can use calloc
with the same effect as malloc
and memset
:
Buffer *buffer = calloc(1, sizeof(Buffer));
This allocates and zero-initializes memory.
As @Chris Rollins pointed out,
new_buffer->data = malloc(sizeof(char));
only allocates a single byte of storage because sizeof(char)
is 1. When the program copies data from from one buffer to the other, it starts overwriting unknown memory on the heap!
for (int i = 0; i < buffer->size; ++i) {
// new_buffer only has space for one byte,
// but we can copy several!
new_buffer->data[i] = buffer->data[i]; // BOOM
}
This is a buffer overflow. It is serious flaw and can/will crash the program. You aren't alone; similar mistakes have cause many security bugs.
Runtime memory analysis can catch bugs like this. If you are compiling with Clang or gcc, the address sanitizer (often called ASAN) will point out the specific error.
Compile with something like this:
gcc main.c -o main -fsanitize=address -fno-omit-frame-pointer -g
and you'll get an error message that points out the faulty lines of code:
=================================================================
==28677==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000efb1 at pc 0x000000400d29 bp 0x7fffa8a5db00 sp 0x7fffa8a5daf0
WRITE of size 1 at 0x60200000efb1 thread T0
#0 0x400d28 in copy_buffer main.c:20
#1 0x400f5d in main main.c:52
#2 0x7ff7256cf82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#3 0x400a88 in _start (main+0x400a88)
0x60200000efb1 is located 0 bytes to the right of 1-byte region [0x60200000efb0,0x60200000efb1)
allocated by thread T0 here:
#0 0x7ff725b11602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
#1 0x400c09 in copy_buffer main.c:17
#2 0x400f5d in main main.c:52
#3 0x7ff7256cf82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
Dense, but worth deciphering:
heap-buffer-overflow on address
in copy_buffer main.c:20
in main main.c:52
in copy_buffer main.c:17
in main main.c:52
The line numbers are specific to my .c
file and may vary on your machine.
memcpy
As @Steve Summit pointed out, you can replace the for
loop:
for (int i = 0; i < buffer->size; ++i) {
new_buffer->data[i] = buffer->data[i];
}
with a single call to memcpy
:
memcpy(new_buffer->data, buffer->data, buffer->size);
this has the same risks as the for
loop but is more concise/easier to read. It should be just as fast, too.
Incorporating these ideas into copy_buffer
could look something like this:
Buffer *copy_buffer(Buffer *buffer) {
Buffer *new_buffer = calloc(1, sizeof(Buffer));
new_buffer->data = calloc(buffer->size, sizeof(char));
memcpy(new_buffer->data, buffer->data, buffer->size);
new_buffer->size = buffer->size;
return new_buffer;
}
calloc
allocates and initializes memory on the heapnew_buffer->data
is large enough to hold all of buffer
's data
memcpy()
concisely copies data from one buffer to anotherOf course, don't forget to free
the memory that you allocate!
Upvotes: 8