Zack

Reputation: 473

Determining the Length of a String Literal

Given an array of pointers to string literals:

char *textMessages[] = {
    "Small text message",
    "Slightly larger text message",
    "A really large text message that "
    "is spread over multiple lines"
};

How does one determine the length of a particular string literal, say the third one? I have tried using the sizeof operator as follows:

int size = sizeof(textMessages[2]);

But the result seems to be the number of pointers in the array, rather than the length of the string literal.

Upvotes: 18

Views: 25507

Answers (7)

Lundin

Reputation: 213678

strlen is potentially slow and may be executed at run time, whereas sizeof("string_literal") - 1 is evaluated at compile time. The problem is how to use sizeof on the string literals pointed at by your pointer array: we can't, because applying sizeof to an array element only gives the size of a pointer.

Now assuming you want this as fast as possible and also done at compile-time for performance reasons... Everything in C is possible if you throw enough ugly macros at the problem. Here's such a solution that favours performance and maintainability at the cost of readability.

We can move the string initializer list out of the array and into a macro. For example by declaring so-called "X-macros", like this:

#define STRING_LIST(X)                     \
  X("Small text message")                  \
  X("Slightly larger text message")        \
  X("A really large text message that "    \
    "is spread over multiple lines")

This macro can now be reused for various purposes, by defining another macro and passing it as parameter "X" to the above list. For example the array declaration could be done as:

#define STRING_INIT_LIST(str) str,
char *textMessages[] = 
{
  STRING_LIST(STRING_INIT_LIST)    
};

And if we want a 1-to-1 corresponding look-up table containing the sizes of each string:

#define STRING_SIZES(str) (sizeof(str)-1),
const size_t sizes[] = 
{
  STRING_LIST(STRING_SIZES)
};

Complete example containing both a look-up table version as well as a directly compile-time processing version:

#include <stdio.h>

#define STRING_LIST(X)                     \
  X("Small text message")                  \
  X("Slightly larger text message")        \
  X("A really large text message that "    \
    "is spread over multiple lines")

int main (void)
{
  #define STRING_INIT_LIST(str) str,
  char *textMessages[] = 
  {
    STRING_LIST(STRING_INIT_LIST)    
  };
  
  #define STRING_SIZES(str) (sizeof(str)-1),
  const size_t sizes[] = 
  {
    STRING_LIST(STRING_SIZES)
  };

  puts("The strings are:");
  #define STRING_PRINT(str) printf(str ", size:%zu\n", sizeof(str)-1);
  STRING_LIST(STRING_PRINT)

  printf("\nOr if you will:\n");
  for(size_t i=0; i<sizeof(textMessages)/sizeof(*textMessages); i++)
  {
    printf("%s, size:%zu\n", textMessages[i], sizes[i]);
  }
}

Output:

The strings are:
Small text message, size:18
Slightly larger text message, size:28
A really large text message that is spread over multiple lines, size:62

Or if you will:
Small text message, size:18
Slightly larger text message, size:28
A really large text message that is spread over multiple lines, size:62

The machine code of this boils down to printing a bunch of strings and constants from memory, with no strlen calls at all.

Upvotes: 1

privatepolly

Reputation: 239

My suggestion would be to use strlen and turn on compiler optimizations.

For example, with gcc 4.7 on x86:

#include <string.h>
static const char *textMessages[3] = {
    "Small text message",
    "Slightly larger text message",
    "A really large text message that "
    "is spread over multiple lines"
};

size_t longmessagelen(void)
{
  return strlen(textMessages[2]);
}

After running make CFLAGS="-ggdb -O3" example.o:

$ gdb example.o
(gdb) disassemble longmessagelen
   0x00000000 <+0>: mov    $0x3e,%eax
   0x00000005 <+5>: ret

I.e. the compiler has replaced the call to strlen with the constant value 0x3e = 62.

Don't waste time performing optimizations that the compiler can do for you!

Upvotes: 23

nemo

Reputation: 57619

You could exploit the fact that compilers often place the string literals of an initializer consecutively in memory:

const char *messages[] = {
    "footer",
    "barter",
    "banger"
};

size_t sizeOfMessage1 = (messages[1] - messages[0]) / sizeof(char); // 7   (6 chars + '\0')

The size is determined from the boundaries of the elements: the distance between the beginning of the first string and the beginning of the second is the size of the first.

This includes the terminating \0. The solution, of course, only works properly with constant strings. If the strings were pointers, you would get the size of a pointer instead of the length of the string.

This is not guaranteed to work. Strictly speaking, subtracting pointers that do not point into the same array object is undefined behavior in C. If the strings are aligned, this may yield wrong sizes, and the compiler may introduce other caveats, such as merging identical strings. You also need at least two elements in your array, and the size of the last string cannot be obtained this way.

Upvotes: 1

Jens

Reputation: 72639

If you want the number computed at compile time (as opposed to at runtime with strlen) it is perfectly okay to use an expression like

sizeof "A really large text message that "
       "is spread over multiple lines";

You might want to use a macro to avoid repeating the long literal, though:

#define LONGLITERAL "A really large text message that " \
                    "is spread over multiple lines"

Note that the value returned by sizeof includes the terminating NUL, so it is one more than the value strlen returns.

Upvotes: 28

Abhineet

Reputation: 5389

strlen gives you the length of a string, whereas sizeof returns the size in bytes of the type of its operand.

strlen

sizeof

Upvotes: 1

Vikdor

Reputation: 24124

You should use the strlen() library function to get the length of a string. sizeof will give you the size of textMessages[2], which is a pointer, so the result is machine dependent (typically 4 or 8 bytes).

Upvotes: 0

Tudor

Reputation: 62439

strlen maybe?

size_t size = strlen(textMessages[2]);

Upvotes: 0
