Rila

Reputation: 197

C build string char by char with known MAX length

I'm trying to add characters to a string one by one. I have something like this:

void doline(char *line, char *buffer, char** tokens){
}

and I am calling it like this:

char *line = malloc(1025 * sizeof(char *));
fgets(line, 1024, stdin);
int linelength = strlen(line);
if (line[linelength - 1] == '\n'){
    line[linelength - 1] = '\0';
}

char ** tokens = (char **) malloc(strlen(line) * sizeof(char *));
char *emptybuffer = malloc(strlen(line) * sizeof(char *));

doline(line, emptybuffer, tokens);

So doline will go through line, tokenize it based on various conditions, and place fragments of it into tokens. I am building the temporary string in the variable buffer. To do this, I need to go through line character by character.

I am currently doing:

buffer[strlen(buffer)] = line[i];

And then at the end of the loop:

*buffer++ = '\0';

But this is the result:

printf("Working on line: '%s' %d\n", line, strlen(line));

Outputs: Working on line: 'test' 4

But by the end of the function the buffer is:

*buffer++ = '\0';
printf("Buffer at the very end: '%s' %d\n", buffer, strlen(buffer));

Outputs: Buffer at the very end: 'test' 7

So the output is showing that the string is getting messed up. What's the best way to build this string character by character? Are my string manipulations correct?

Any help would be much appreciated!

Thanks!

Upvotes: 0

Views: 2583

Answers (1)

Sangeeth Saravanaraj

Reputation: 16627

There were some basic problems, so I rewrote the program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define str_len 180

void tokenize(char *str, char **tokens)
{
    int length = 0;   /* current position in str */
    int index = 0;    /* length of the word seen so far */
    int i = 0;        /* next free slot in tokens */
    int str_i;
    int tok_i;

    for (;;) {
        if (str[length] == ' ' || str[length] == '\0') {
            /* a space or the end of the string closes the current word;
               index == 0 means repeated spaces, so there is nothing to copy */
            if (index > 0) {
                tokens[i] = malloc(sizeof(char) * (index + 1)); /* +1 for '\0' */
                if (tokens[i] == NULL)
                    break;

                tok_i = 0;
                for (str_i = length - index; str_i < length; str_i++) {
                    tokens[i][tok_i] = str[str_i];
                    tok_i++;
                }

                tokens[i][tok_i] = '\0';
                i++;
                index = 0;
            }
            if (str[length] == '\0')
                break;
        } else {
            index++;
        }
        length++;
    }

    /* a NULL entry marks the end of the token list */
    tokens[i] = NULL;
}

int main(void)
{
    char *str = malloc(str_len * sizeof(char));
    char **tokens = malloc(100 * sizeof(char *));
    int i = 0;

    if (str == NULL || tokens == NULL)
        return 1;

    /* fgets() never writes past str_len bytes, unlike gets() */
    if (fgets(str, str_len, stdin) == NULL)
        return 1;
    str[strcspn(str, "\n")] = '\0';   /* drop the newline fgets() keeps */

    printf("input string: %s\n", str);
    tokenize(str, tokens);

    while (tokens[i] != NULL) {
        printf("%d - %s \n", i, tokens[i]);
        i++;
    }

    /* free every token, starting again from the first one */
    for (i = 0; tokens[i] != NULL; i++)
        free(tokens[i]);
    free(tokens);
    free(str);

    return 0;
}

It is compiled and executed as follows:

$ gcc -ggdb -Wall prog.c 
$ ./a.out 
this is a test string... hello world!! 
input string: this is a test string... hello world!! 
0 - this  
1 - is  
2 - a  
3 - test  
4 - string...  
5 - hello  
6 - world!!  
$ 
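
As for the character-by-character buffer in the question: malloc() does not zero memory, so calling strlen(buffer) on a freshly allocated buffer is not reliable, and *buffer++ = '\0' writes the terminator at the start of the buffer and then moves the pointer forward. One way around both is to keep an explicit length counter. The following is only a minimal sketch; append_char and copy_word are hypothetical helper names, not functions from the question, and the capacity parameter is an assumption about how the caller sized the buffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the question): append one character to
 * buf, which currently holds *len characters and has room for cap bytes
 * in total. The string stays NUL-terminated after every call.
 * Returns 1 on success, 0 if the character would not fit.
 */
int append_char(char *buf, size_t *len, size_t cap, char c)
{
    if (*len + 1 >= cap)      /* keep one byte for the terminator */
        return 0;
    buf[*len] = c;
    (*len)++;
    buf[*len] = '\0';
    return 1;
}

/*
 * Copy src into buf one character at a time, stopping at the first
 * space, at the end of src, or when buf is full.
 * Returns the number of characters copied.
 */
size_t copy_word(const char *src, char *buf, size_t cap)
{
    size_t len = 0;

    if (cap == 0)
        return 0;
    buf[0] = '\0';            /* start from a valid empty string */
    while (*src != '\0' && *src != ' ') {
        if (!append_char(buf, &len, cap, *src))
            break;
        src++;
    }
    return len;
}
```

With a buffer declared as char buf[8], calling copy_word("test line", buf, sizeof buf) leaves "test" in buf and returns 4. The buffer pointer itself never moves, so it can still be printed or freed afterwards.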

There are a few basic assumptions:

  1. The length of the incoming string is assumed to be a constant (str_len). It can instead be read dynamically; see "How to read a line from the console in C?".

  2. The length of the tokens array is also assumed to be a constant (100 entries). This can also be changed. I will leave that to you to find out how!
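
For the first assumption, a line of unknown length can be read one character at a time into a buffer that grows on demand. This is a rough sketch rather than the approach from the linked question; read_line is a hypothetical name, and the starting capacity of 16 and the doubling strategy are arbitrary choices.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: read one line from fp into a heap buffer that
 * doubles in size whenever it fills up. The newline is not stored.
 * Returns NULL on allocation failure, or if end of file is reached
 * before any character is read. The caller frees the result.
 */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {          /* need room for c and '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {        /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

In main, str = read_line(stdin) could then stand in for the fixed-size malloc and the read call, with the same free(str) at the end.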

Hope this helps!

Upvotes: 1
