user2138149

Reputation: 16585

How is std::string(char* char_array) implemented?

I am interested to know how the string class implements copying from a character array for initialization of its contents.

My guess would be something like:

1: Find the length of the character array, N. (How is this done? A crude method would be to look at each character individually until the null character is found. Is a better method used?)

2: Allocate N bytes of storage.

3: Use strcpy to copy each element byte by byte.

Obviously this is not a very complicated question; I was just interested to know whether the following are (essentially or approximately) equivalent:

std::string program_name(argv[0]);

and

std::string program_name;
int length = 0;
while(*(argv[0] + length) != '\0')
    ++ length;
++ length; // Depends on whether string contains the null character - usually I don't think it does?
program_name.resize(length); // Maybe use reserve instead?
std::memcpy(&program_name[0], argv[0], length - 1); // Don't copy the null character at the end

Something like that anyway. I haven't attempted to compile the above pseudocode because I am interested in the concept of the method, not the fine detail of how this operation is done.

Upvotes: 1

Views: 775

Answers (1)

Mats Petersson

Reputation: 129334

In short, your implementation is pretty much how it works.

Ignoring the fact that std::string is a typedef for std::basic_string<char>, a template that copes with various character types stored in the string (notably "wide characters"), the std::string constructor from const char * could be written something like this:

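std::string(const char* init_value)
{
    // m_len and m_storage stand for the string's data members.
    m_len = strlen(init_value);          // length, not counting the terminating '\0'
    m_storage = new char[m_len+1];       // one extra element for the terminator
    // std::copy takes (first, last, destination); the +1 copies the '\0' too.
    std::copy(init_value, init_value + m_len + 1, m_storage);
}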

Of course, the actual implementation will be more indirect (it probably has a specific function to "grow/allocate", for example), due to the inheritance and templated nature of the real implementation.
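To make the simplified constructor above concrete, here is a minimal compilable sketch. SimpleString is a hypothetical toy class invented for illustration; it is not how any real standard library lays out std::string, and it deliberately leaves out copying, growth and allocator support.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy class for illustration only; not a real library type.
class SimpleString {
public:
    explicit SimpleString(const char* init_value)
    {
        assert(init_value != nullptr);        // a null pointer is not a valid C string
        m_len = std::strlen(init_value);      // step 1: scan for the terminating '\0'
        m_storage = new char[m_len + 1];      // step 2: characters plus terminator
        // step 3: copy the characters and the terminator in one go
        std::copy(init_value, init_value + m_len + 1, m_storage);
    }

    ~SimpleString() { delete[] m_storage; }

    // Copying is disabled to keep the sketch short.
    SimpleString(const SimpleString&) = delete;
    SimpleString& operator=(const SimpleString&) = delete;

    std::size_t size() const { return m_len; }
    const char* c_str() const { return m_storage; }

private:
    std::size_t m_len;
    char* m_storage;
};
```

Constructing SimpleString from argv[0] would then perform the same three steps listed in the question: measure, allocate, copy.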

Here's a REAL implementation out of libcxx:

template <class _CharT, class _Traits, class _Allocator>
inline _LIBCPP_INLINE_VISIBILITY
basic_string<_CharT, _Traits, _Allocator>::basic_string(const value_type* __s)
{
    _LIBCPP_ASSERT(__s != nullptr, "basic_string(const char*) detected nullptr");
    __init(__s, traits_type::length(__s));
#if _LIBCPP_DEBUG_LEVEL >= 2
    __get_db()->__insert_c(this);
#endif
}

where __init does this:

template <class _CharT, class _Traits, class _Allocator>
void
basic_string<_CharT, _Traits, _Allocator>::__init(const value_type* __s, size_type __sz)
{
    if (__sz > max_size())
        this->__throw_length_error();
    pointer __p;
    if (__sz < __min_cap)
    {
        __set_short_size(__sz);
        __p = __get_short_pointer();
    }
    else
    {
        size_type __cap = __recommend(__sz);
        __p = __alloc_traits::allocate(__alloc(), __cap+1);
        __set_long_pointer(__p);
        __set_long_cap(__cap+1);
        __set_long_size(__sz);
    }
    traits_type::copy(_VSTD::__to_raw_pointer(__p), __s, __sz);
    traits_type::assign(__p[__sz], value_type());
}

It does some tricks to store short strings inside the string object itself rather than in a separate allocation (the __sz < __min_cap branch), and it allocates longer strings with the relevant allocator, which may not be new. It also explicitly initializes the end marker [traits_type::assign(__p[__sz], value_type());], because __init may be called with something other than a C-style string, so an end marker in the source is not guaranteed.
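The short/long split in __init can be sketched roughly as follows. SsoString, its 16-byte inline buffer and its member layout are all assumptions made up for illustration; libc++'s real representation is more compact and differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy class; buffer size and layout are assumptions.
class SsoString {
public:
    SsoString(const char* s, std::size_t sz)
    {
        assert(s != nullptr);
        m_size = sz;
        if (sz < kInlineCapacity)
            m_data = m_inline;            // short: reuse the buffer inside the object
        else
            m_data = new char[sz + 1];    // long: heap storage plus room for '\0'
        std::memcpy(m_data, s, sz);       // copy exactly sz characters
        m_data[sz] = '\0';                // write the end marker explicitly
    }

    ~SsoString()
    {
        if (m_data != m_inline)
            delete[] m_data;              // only long strings own heap memory
    }

    // Copying is disabled to keep the sketch short.
    SsoString(const SsoString&) = delete;
    SsoString& operator=(const SsoString&) = delete;

    bool is_short() const { return m_data == m_inline; }
    std::size_t size() const { return m_size; }
    const char* c_str() const { return m_data; }

private:
    static constexpr std::size_t kInlineCapacity = 16;
    char m_inline[kInlineCapacity];
    char* m_data;
    std::size_t m_size;
};
```

Calling SsoString("abcdef", 3) copies only "abc", so the source has no '\0' at position 3. That is the situation in which writing the end marker explicitly matters.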

For char, traits_type::length() is simply strlen:

template <>
struct _LIBCPP_TYPE_VIS_ONLY char_traits<char>
{
...
    static inline size_t length(const char_type* __s) {return strlen(__s);}
....
};
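On the question of whether there is a better method than looking at each character: a C string does not store its length, so any method has to scan up to the terminator, although a library strlen may well do that scan faster than a naive loop. The sketch below compares the question's hand-written scan with the library routes; manual_length and lengths_agree are hypothetical helper names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The "crude" scan from the question: advance until '\0' is found.
std::size_t manual_length(const char* s)
{
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;                             // the terminator is not counted
}

// Returns true when the hand-written scan, strlen, char_traits<char>::length
// and the size of a std::string built from s all report the same length.
bool lengths_agree(const char* s)
{
    const std::size_t n = manual_length(s);
    return n == std::strlen(s)
        && n == std::char_traits<char>::length(s)
        && n == std::string(s).size();
}
```

This also addresses the comment in the question's code: size() reports the character count without the terminating null character, so the extra ++length there is not needed.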

Of course, other standard library implementations may differ in the details, but they roughly follow my simplified example, only somewhat more obfuscated in order to cope with many character types and to reuse code.

Upvotes: 3
