Reputation: 41845

Should C++ programmers avoid memset?

I have heard that C++ programmers should avoid memset. For example:

class ArrInit {
    //! int a[1024] = { 0 };
    int a[1024];
public:
    ArrInit() {  memset(a, 0, 1024 * sizeof(int)); }
};

Considering the code above, if you do not use memset, how would you fill a[0..1023] with zeros? What's wrong with memset in C++?

Thanks.

Upvotes: 44

Answers (11)

Gabriel Morin

Reputation: 802

As of C++11, the simplest way to fill an array with zeros is to zero-initialize it:

class ArrInit {
    int a[1024] = {};
};

To be precise, I believe this is actually aggregate initialization with an empty initializer list, which happens to zero-initialize every item in the array - but the main point is to show that it can be done. See https://en.cppreference.com/w/cpp/language/zero_initialization and https://en.cppreference.com/w/cpp/language/aggregate_initialization for more details.

You could also zero-initialize the container object itself, which might be more efficient if you have several members to zero out:

class ArrInit {
    int a[1024];
    int b[1024];
};

int main()
{
    ArrInit ai{};
}

See here for a live demo.

That said, zero-initialization is a tricky beast - as the cppreference page indicates, it has no dedicated syntax, so you have to be careful that the compiler doesn't pick another type of initialization instead. And as the following question shows, getting compilers to zero-initialize your objects, padding included, can be tricky: Does C++ standard guarantee the initialization of padding bytes to zero for non-static aggregate objects?

With all this in mind, if you're writing high-performance code with a targeted set of platforms in mind (something common for instance in the video games industry), and you know the exact memory layout of the whole object you'll be overwriting with zeroes, memset can be a good tool if used carefully. It can enable things like reliable and fast comparison of POD objects with memcmp.

Upvotes: 1

user1920453

Reputation: 19

This is an OLD thread, but here's an interesting twist:

class myclass
{
  virtual void somefunc();
};

myclass onemyclass;

memset(&onemyclass,0,sizeof(myclass));

works PERFECTLY well!

However,

myclass *myptr;

myptr=&onemyclass;

memset(myptr,0,sizeof(myclass));

indeed sets the virtuals (i.e., somefunc() above) to NULL.

Given that memset is drastically faster than setting each and every member of a large class to 0, I've been doing the first memset above for ages and never had a problem.

So the really interesting question is: how come it works? I suppose that the compiler actually starts to set the zeros BEYOND the virtual table... any idea?

Upvotes: 1

Kit10

Reputation: 1364

The short answer would be to use a std::vector with an initial size of 1024.

std::vector< int > a( 1024 ); // Value-initializes each element, as if by "T()".

Every element of "a" starts at 0, because the std::vector(size) constructor (like vector::resize) value-initializes each element. For built-in types such as int, value-initialization is guaranteed to produce 0:

int x = int(); // x == 0

This would allow the type that "a" uses to change with minimal fuss, even to that of a class.
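As a minimal sketch of that suggestion, here is the question's ArrInit rewritten around a vector. The size() and all_zero() members are hypothetical additions, included only so the result can be checked; they are not part of the original class.

```cpp
#include <cstddef>
#include <vector>

// Sketch only: the question's ArrInit with the raw array swapped for a vector.
// The vector(size) constructor value-initializes all 1024 ints, so they start at 0
// and no memset call is needed.
class ArrInit {
    std::vector<int> a;
public:
    ArrInit() : a(1024) {}

    // Hypothetical accessors, added for illustration.
    std::size_t size() const { return a.size(); }
    bool all_zero() const {
        for (int v : a) {
            if (v != 0) return false;
        }
        return true;
    }
};
```

If the element type later changes from int to a class type, the constructor stays the same and each element is built by that class's default constructor.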

Most functions that take a void pointer (void*) as a parameter, such as memset, are not type safe. Ignoring an object's type in this way bypasses the C++ semantics that objects tend to rely on, such as construction, destruction and copying. memset makes assumptions about a class's internals, which violates abstraction (not knowing or caring what is inside a class). While this violation isn't always immediately obvious, especially with intrinsic types, it can lead to hard-to-locate bugs, especially as the code base grows and changes hands. If the object being memset belongs to a class with virtual functions, memset will also overwrite its vtable pointer.

Upvotes: 0

AnT stands with Russia

Reputation: 320777

What's wrong with memset in C++ is mostly the same thing that's wrong with memset in C. memset fills a memory region with the physical all-zero bit pattern, while in virtually 100% of cases what you actually need is to fill an array with the logical zero values of the corresponding type. In C, memset is only guaranteed to properly initialize memory for integer types (and its validity for all integer types, as opposed to just char types, is a relatively recent guarantee added to the C language specification). It is not guaranteed to properly set floating-point values to zero, and it is not guaranteed to produce proper null pointers.

Of course, the above might be seen as excessively pedantic, since the additional standards and conventions active on the given platform might (and most certainly will) extend the applicability of memset, but I would still suggest following Occam's razor here: don't rely on any other standards and conventions unless you really, really have to. The C++ language (as well as C) offers several language-level features that let you safely initialize your aggregate objects with proper zero values of the proper type. Other answers already mentioned these features.
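To illustrate one of those language-level features, here is a small sketch using value-initialization on an aggregate. The Record type and make_zeroed_record function are hypothetical names made up for this example.

```cpp
// Hypothetical aggregate mixing an integer, a floating-point value and a pointer.
struct Record {
    int count;
    double ratio;
    const char* name;
};

// Record{} initializes every member with the logical zero of its own type:
// 0 for the int, 0.0 for the double, and a null pointer for name,
// whatever bit patterns the platform uses to represent them.
inline Record make_zeroed_record() {
    return Record{};
}
```

Unlike memset, this makes no assumption about how 0.0 or a null pointer is represented in memory.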

Upvotes: 14

UncleBens

Reputation: 41351

Zero-initializing should look like this:

class ArrInit {
    int a[1024];
public:
    ArrInit(): a() { }
};

As to using memset, there are a couple of ways to make it more robust (as with all such functions). First, avoid hard-coding the array's size and type:

memset(a, 0, sizeof(a));

For extra compile-time checks it is also possible to make sure that a indeed is an array (so sizeof(a) would make sense):

template <class T, size_t N>
size_t array_bytes(const T (&)[N])  //accepts only real arrays
{
    return sizeof(T) * N;
}

ArrInit() { memset(a, 0, array_bytes(a)); }

But for non-character types, I'd imagine the only value you'd use it to fill with is 0, and zero-initialization should already be available in one way or another.

Upvotes: 24

Charles Eli Cheese

Reputation: 783

There's no real reason to not use it except for the few cases people pointed out that no one would use anyway, but there's no real benefit to using it either unless you are filling memguards or something.

Upvotes: 0

Adrian McCarthy

Reputation: 48038

In addition to badness when applied to classes, memset is also error prone. It's very easy to get the arguments out-of-order, or to forget the sizeof portion. The code will usually compile with these errors, and quietly do the wrong thing. The symptom of the bug might not manifest until much later, making it difficult to track down.

memset is also problematic with lots of plain types, like pointers and floating point. Some programmers set all bytes to 0, assuming the pointers will then be NULL and floats will be 0.0. That's not a portable assumption.
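A minimal sketch of the argument-order trap. The helper name nonzero_after and the 16-byte scratch buffer are assumptions made for this example; the helper simply forwards its arguments to memset so the effect of a swap can be observed.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: fills a 16-byte scratch buffer with 0xFF, runs
// memset(buf, value, count) on it, and returns how many bytes are still nonzero.
// count must not exceed 16.
inline std::size_t nonzero_after(int value, std::size_t count) {
    unsigned char buf[16];
    std::memset(buf, 0xFF, sizeof buf);
    std::memset(buf, value, count);

    std::size_t remaining = 0;
    for (unsigned char b : buf) {
        if (b != 0) ++remaining;
    }
    return remaining;
}
```

Calling nonzero_after(0, 16) clears the whole buffer, while the swapped call nonzero_after(16, 0) compiles just as happily and leaves all 16 bytes untouched, because it asks memset to write zero bytes of the value 16.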

Upvotes: 1

Charles Salvia

Reputation: 53339

In C++, std::fill or std::fill_n may be a better choice, because they are generic and can therefore operate on objects as well as PODs. memset, by contrast, operates on a raw sequence of bytes and should therefore never be used to initialize non-PODs. Regardless, optimized implementations of std::fill may internally use specialization to call memset if the type is a POD.
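For example, a sketch of the question's class using std::fill, plus std::fill_n applied to a non-POD element type. The at() accessor and the reset_names function are hypothetical additions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

class ArrInit {
    int a[1024];
public:
    // Assigns 0 to each element; the element type and count are deduced from a.
    ArrInit() { std::fill(std::begin(a), std::end(a), 0); }

    // Hypothetical accessor, added so the result can be inspected.
    int at(std::size_t i) const { return a[i]; }
};

// The same idea works for class types, where memset would be wrong:
// every std::string is assigned through its own assignment operator.
inline void reset_names(std::string (&names)[4]) {
    std::fill_n(names, 4, std::string("unset"));
}
```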

Upvotes: 52

rui

Reputation: 11284

Your code is fine. I thought the only time in C++ where memset is dangerous is when you do something along the lines of:
YourClass instance; memset(&instance, 0, sizeof(YourClass));

I believe it might zero out internal data in your instance that the compiler created.

Upvotes: 0

jcoder

Reputation: 30055

It is "bad" because you are not implementing your intent.

Your intent is to set each value in the array to zero and what you have programmed is setting an area of raw memory to zero. Yes, the two things have the same effect but it's clearer to simply write code to zero each element.

Also, it's likely no more efficient.

class ArrInit
{
public:
    ArrInit();
private:
    int a[1024];
};

ArrInit::ArrInit()
{
    for(int i = 0; i < 1024; ++i) {
        a[i] = 0;
    }
}


int main()
{
    ArrInit a;
}

Compiling this with Visual C++ 2008 (32-bit) with optimisations turned on compiles the loop to:

; Line 12
    xor eax, eax
    mov ecx, 1024               ; 00000400H
    mov edi, edx
    rep stosd

Which is pretty much exactly what the memset would likely compile to anyway. But if you use memset there is no scope for the compiler to perform further optimisations, whereas by writing your intent it's possible that the compiler could perform further optimisations, for example noticing that each element is later set to something else before it is used so the initialisation can be optimised out, which it likely couldn't do nearly as easily if you had used memset.

Upvotes: 8

anon

Reputation:

The issue is not so much using memset() on the built-in types; it is using it on class (aka non-POD) types. Doing so will almost always do the wrong thing and frequently do the fatal thing - it may, for example, trample over a virtual function table pointer.
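One way to catch this mistake at compile time is to wrap memset in a checked helper. The following is only a sketch: zero_bytes is a hypothetical function, not a standard one, and std::is_trivially_copyable is used here as an assumed stand-in for "safe to overwrite byte by byte".

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical helper: refuses to compile for types that are not trivially
// copyable, such as classes with virtual functions.
template <class T>
void zero_bytes(T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "zero_bytes requires a trivially copyable type");
    std::memset(&obj, 0, sizeof(T));
}

struct Plain { int id; unsigned flags; };         // accepted by zero_bytes
struct Shape { virtual ~Shape() {} int sides; };  // rejected: has a vtable pointer
```

With this in place, zero_bytes(plain_object) compiles, while zero_bytes(shape_object) stops the build with the static_assert message instead of silently destroying the vtable pointer. The caveats about pointers and floating-point members raised in other answers still apply to the accepted types.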

Upvotes: 51
