Robert Wallner

Reputation: 43

char arrays in PROGMEM

In a program, I have many arrays of strings of different lengths, and each array is declared as an array of pointers to those strings, like:

static const char * num_tab[] = {"First", "Second", "Third"};
static const char * day_tab[] = {"Sunday", "Monday", "Tuesday"};
static const char * random_tab[] = {"Strings and arrays can have", "different", "lengths"};

Pointers to the strings are returned by simple functions such as:

const char * dayName(int index) {
  return day_tab[index];
}

On the AVR architecture, those strings need to be stored in program memory. I understand that the functions need to be changed to work on AVRs as well: they need to copy the string from program memory to a buffer in RAM and return a pointer to that instead.

How can I change the arrays' initialization to use PROGMEM, without the need to name each individual string?

The only way I found is to define each string with a name (and PROGMEM), and define an array of pointers initialized with pointers to those strings:

static const char d1[] PROGMEM = "First";
static const char d2[] PROGMEM = "Second";
static const char d3[] PROGMEM = "Third";

const char * const num_tab[] = {d1, d2, d3}; // the pointer table itself only needs PROGMEM if it is large

This works, but for large arrays of different sizes it turns a few lines of code into hundreds, which makes maintenance practically impossible. Also, adding or removing a value from an array requires renumbering all of the following items.
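For reference, the reader side of this named-string layout might look like the sketch below. It assumes avr-libc's <avr/pgmspace.h> (pgm_read_word, strncpy_P) on AVR and falls back to plain memory access elsewhere. The names day0..day2, TAB_ENTRY, COPY_STR and DAY_BUF_LEN are hypothetical and only serve the illustration.

```c
#include <stddef.h>
#include <string.h>

#ifdef __AVR__
  #include <avr/pgmspace.h>
  /* Read a 16-bit data pointer out of flash, then copy a flash string to RAM. */
  #define TAB_ENTRY(tab, i)     ((const char *)pgm_read_word(&(tab)[i]))
  #define COPY_STR(dst, src, n) strncpy_P((dst), (src), (n))
#else
  #define PROGMEM /* nothing to do off AVR */
  #define TAB_ENTRY(tab, i)     ((tab)[i])
  #define COPY_STR(dst, src, n) strncpy((dst), (src), (n))
#endif

/* Hypothetical named strings, one per table entry. */
static const char day0[] PROGMEM = "Sunday";
static const char day1[] PROGMEM = "Monday";
static const char day2[] PROGMEM = "Tuesday";

static const char *const day_tab[] PROGMEM = { day0, day1, day2 };

#define DAY_BUF_LEN 16 /* assumed to exceed the longest string */

/* Copies the requested string into a static RAM buffer and returns it.
   Returns NULL for an out-of-range index. */
const char *dayName(int index)
{
    static char buf[DAY_BUF_LEN];

    if (index < 0 || (size_t)index >= sizeof day_tab / sizeof day_tab[0])
        return NULL;

    COPY_STR(buf, TAB_ENTRY(day_tab, index), sizeof buf - 1);
    buf[sizeof buf - 1] = '\0'; /* strncpy does not guarantee termination */
    return buf;
}
```

Because the buffer is static, each call overwrites the result of the previous one, so a caller that needs two names at once has to copy the first before asking for the second.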

Upvotes: 4

Views: 492

Answers (2)

Lundin

Reputation: 213266

The AVR PROGMEM fix can be hidden behind a macro like:

#ifdef __AVR__
  #define MEM PROGMEM
#else
  #define MEM /* dummy macro */
#endif

Regarding performance and allocation:

The advantage of having a pointer-based look-up table as in the first example is execution speed - you'll be able to grab each string quickly with no run-time overhead.

The downside is that the pointers themselves must be allocated too, which costs extra flash or, in the worst case, both RAM and flash if the pointer table is copied from ROM to RAM during start-up.

The following only applies if you really need to save flash above everything else, or if you want to centralize code maintenance in a single list. You can then cook up something evil-looking using "X macros", optionally with a name for each string:

#define STR_LIST(X)     \
  X(d1, "first")        \
  X(d2, "second")       \
  X(d3, "third")        \

You can then allocate this stuff adjacently in the same big array, like this:

static const char STRINGS[] MEM =
{
  #define STR_ALLOC(name, str) str "\0"
  STR_LIST(STR_ALLOC)
};

This uses string concatenation for convenience, but with manually added null terminators since string concatenation would otherwise remove them. It expands to:

"first" "\0" "second" "\0" "third" "\0"

And gets concatenated to:

 "first\0second\0third\0"

(We'll get an extra null terminator at the end, but that might be neat for "sentinel value" purposes.)
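As a rough sketch of how that sentinel could be used, the function below walks the packed array to find the offset of the n-th string by index. The names READ_CHAR and str_offset are hypothetical, and the AVR branch assumes avr-libc's pgm_read_byte.

```c
#include <stddef.h>

#ifdef __AVR__
  #include <avr/pgmspace.h>
  #define MEM PROGMEM
  #define READ_CHAR(p) ((char)pgm_read_byte(p))
#else
  #define MEM /* dummy macro */
  #define READ_CHAR(p) (*(p))
#endif

#define STR_LIST(X)     \
  X(d1, "first")        \
  X(d2, "second")       \
  X(d3, "third")

#define STR_ALLOC(name, str) str "\0"
static const char STRINGS[] MEM = { STR_LIST(STR_ALLOC) };

/* Returns the offset of the n-th string (counting from 0) inside STRINGS,
   or -1 if n is past the end. An empty string marks the end of the list. */
int str_offset(unsigned n)
{
    size_t pos = 0;

    while (READ_CHAR(&STRINGS[pos]) != '\0') { /* not at the sentinel yet */
        if (n == 0)
            return (int)pos;
        while (READ_CHAR(&STRINGS[pos]) != '\0') /* skip the current string */
            pos++;
        pos++; /* skip its terminator */
        n--;
    }
    return -1;
}
```

This trades speed for flash: the lookup is linear in the number of bytes before the requested string, and an empty string in the list would be mistaken for the sentinel.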


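Various dirty hacks for "named" access at run-time (optimizing for flash size over speed), as a full code example. It is written as a hosted demo; on AVR, reads from STRINGS would need to go through pgm_read_byte() or similar: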

#define STR_LIST(X)     \
  X(d1, "first")        \
  X(d2, "second")       \
  X(d3, "third")        \

#ifdef __AVR__
  #define MEM PROGMEM
#else
  #define MEM /* dummy macro */
#endif

static const char STRINGS[] MEM =
{
  #define STR_ALLOC(name, str) str "\0"
  STR_LIST(STR_ALLOC)
};

typedef enum
{
  #define STR_ENUM(name, str) STR_ENUM_##name,
  STR_LIST(STR_ENUM)
  
  STR_ENUM_N
} str_enum_t;

static str_enum_t key;
#define STR_COUNT(name, str) +(STR_ENUM_##name<key ? sizeof(str) : 0)

#define STR_GET_POS(name) (key=STR_ENUM_##name, STR_LIST(STR_COUNT))

#include <stdio.h>
int main (void)
{
  puts("Memory dump, | marks null terminators:");
  for(size_t i=0; i<sizeof(STRINGS); i++)
  {
    printf("%c", STRINGS[i]=='\0' ? '|' : STRINGS[i]);
  }
  puts("");puts("");

  puts("What the strings are named:");
  #define STR_PRINT_NAME(name, str) printf("%s: %s\n", #name, str);
  STR_LIST(STR_PRINT_NAME)
  puts("");

  puts("Where the strings are found:");
  printf("%s %2zu ", "d1", STR_GET_POS(d1));
  printf("%s\n", &STRINGS[STR_GET_POS(d1)]);
  printf("%s %2zu ", "d2", STR_GET_POS(d2));
  printf("%s\n", &STRINGS[STR_GET_POS(d2)]);
  printf("%s %2zu ", "d3", STR_GET_POS(d3));
  printf("%s\n", &STRINGS[STR_GET_POS(d3)]);
}

Output:

Memory dump, | marks null terminators:
first|second|third||

What the strings are named:
d1: first
d2: second
d3: third

Where the strings are found:
d1  0 first
d2  6 second
d3 13 third

Upvotes: 0

emacs drives me nuts

Reputation: 3888

Since your question is tagged "C", GNU C with named address spaces as per ISO/IEC DTR 18037 may be a way to go. Compile with -std=gnu99 or higher:

#define F(X) ((const __flash char[]) { X })

static const __flash char *const __flash nums[] =
{
    F("first"), F("second"), F("third")
};

#include <stdio.h>

void print_num (int id)
{
    printf ("num = %S\n", nums[id]);
}

The generated code for the nums[] array is:

    .section    .progmem.data,"a",@progbits
    .type   nums, @object
    .size   nums, 6
nums:
    .word   __compound_literal.0
    .word   __compound_literal.1
    .word   __compound_literal.2
    .type   __compound_literal.2, @object
    .size   __compound_literal.2, 6
__compound_literal.2:
    .string "third"
    .type   __compound_literal.1, @object
    .size   __compound_literal.1, 7
__compound_literal.1:
    .string "second"
    .type   __compound_literal.0, @object
    .size   __compound_literal.0, 6
__compound_literal.0:
    .string "first"

The code that prints the string uses LPM to read nums[id], because the pointer table itself is located in progmem. The string it points to is in progmem as well and is printed using the %S format specifier:

print_num:
    lsl r24  ;  33  [c=8 l=2]  *ashlhi3_const/1
    rol r25
    movw r30,r24     ;  26  [c=4 l=1]  *movhi/0
    subi r30,lo8(-(nums))    ;  8   [c=8 l=2]  *addhi3/1
    sbci r31,hi8(-(nums))
    lpm r24,Z+   ;  27  [c=8 l=2]  *movhi/2
    lpm r25,Z+
    push r25         ;  11  [c=4 l=1]  pushqi1/0
    push r24         ;  13  [c=4 l=1]  pushqi1/0
    ldi r24,lo8(.LC0)    ;  14  [c=4 l=2]  *movhi/4
    ldi r25,hi8(.LC0)
    push r25         ;  16  [c=4 l=1]  pushqi1/0
    push r24         ;  19  [c=4 l=1]  pushqi1/0
    call printf  ;  20  [c=16 l=2]  call_value_insn/1
     ; SP += 4   ;  21  [c=4 l=4]  *addhi3_sp
    pop __tmp_reg__
    pop __tmp_reg__
    pop __tmp_reg__
    pop __tmp_reg__
    ret      ;  31  [c=0 l=1]  return

Note: You can port this to archs without __flash by means of:

#ifndef __FLASH
#define __flash /* empty */
#endif

The macro __FLASH is a builtin macro in avr-gcc and only defined when address-space __flash is present.

The print modifier %S prints a string located in progmem / flash when using avr-libc. On compliant platforms %S stands for a wide string, so when porting to such a platform you have to use %s instead.
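One way to keep both variants in a single source file is to select the format specifier with the same __FLASH check. This is only a sketch: FSTR_FMT and format_num are hypothetical names, and snprintf is used instead of printf so the result can be inspected.

```c
#include <stdio.h>

#ifdef __FLASH
  #define FSTR_FMT "%S" /* avr-libc: the argument is a string in flash */
#else
  #define __flash       /* empty: no named address space on this target */
  #define FSTR_FMT "%s" /* ordinary string in RAM */
#endif

#define F(X) ((const __flash char[]) { X })

static const __flash char *const __flash nums[] =
{
    F("first"), F("second"), F("third")
};

/* Writes "num = <string>" into buf and returns snprintf's result.
   The caller is responsible for passing a valid id. */
int format_num(char *buf, size_t len, int id)
{
    return snprintf(buf, len, "num = " FSTR_FMT, nums[id]);
}
```

The adjacent string literals "num = " and FSTR_FMT are concatenated by the compiler, so the format string ends up as "num = %S" with avr-gcc and "num = %s" elsewhere.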

Upvotes: 3
