snoopy91

Reputation: 347

Implementing a split string by delimiter function in C

I'm attempting to write a function in C which accepts a pointer to contiguous characters ending in '\0' - that is, a string - and a single constant character delimiter, and then outputs a pointer to contiguous pointers, each of which points to a new string. These new strings correspond to the input string broken at each delimiter character and then properly terminated. In fewer words, I want to dynamically build an array of strings.

To do this I plan to use malloc() to allocate the memory I need as I go. The "parent array" will be sizeof(char *) * (count + 2) bytes long, to accommodate a pointer to the first character of each delimited substring, plus a terminator. Likewise, each "child array" will be sizeof(char) * (j + 1) bytes long to accommodate all the characters of each substring, again plus a terminator.

My code so far is this.

#include <stdio.h>
#include <stdlib.h>

char *split(char *string, const char delimiter);

int main(int argc, char *argv[]) {
    char *x = split(argv[1], '.');
    while (*x) {
        printf("%d\n", *x);
    }
    return 0;
}

char *split(char *string, const char delimiter) {
    int length, count, i, j = 0;
    while(*(string++)) {
        if (*string == delimiter) count++;
        length++;
    }
    string -= length;
    char *array = (char *)malloc(sizeof(char *) * (length + 1));
    for(i, j = 0; i < (count + 1); i++) {
        while(*(string++) != delimiter) j++;
        string -= j;
        *array = (char *)malloc(sizeof(char) * (j + 1));
        while(*(string++) != delimiter) *(*array++) = *(string++);
        **array = '\0';
        string++;
        array += sizeof(char *);
    }
    *array = '\0';
    array -= (sizeof(char *) * (length + 1));
    return array;  
}

My question is why is the compiler spitting out the following errors?

split2.c: In function ‘split’:
split2.c:25: warning: assignment makes integer from pointer without a cast
split2.c:26: error: invalid type argument of ‘unary *’ (have ‘int’)
split2.c:27: error: invalid type argument of ‘unary *’ (have ‘int’)

My guess is that when the memory for the "parent array" is allocated, the compiler expects that int values, not char * will be stored there. If this is the case, how do I properly correct my code?

I am aware there are far easier ways to do this sort of thing using string.h; my motivation for writing this code is to learn better how pointers work in C.

Many thanks in advance!

Upvotes: 2

Views: 7131

Answers (3)

raj raj

Reputation: 1922

I think you want array as a double pointer, char **array.

char **array = (char **)malloc(sizeof(char *) * (length + 1));

As your logic says, you want an array of char *, each one pointing to a string, so array should be a double pointer. If you make this change, also change the return type to char **.

If you would like to use double pointers, try this:

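#include <stdlib.h>
#include <string.h> // for memcpy

char **split(char *string, const char delimiter) {
    int length = 0, count = 0, i = 0, j = 0;
    // count the delimiters by index, so string itself is not moved
    while(string[length]) {
        if (string[length] == delimiter) count++;
        length++;
    }
    // count + 1 substrings, plus one slot for the terminating NULL
    char **array = (char **)malloc(sizeof(char *) * (count + 2));
    char **base = array;
    for(i = 0; i < (count + 1); i++) {
        j = 0;
        // stop at the delimiter or at the end of the input string
        while(string[j] != delimiter && string[j] != '\0') j++;
        j++; // room for the terminator
        *array = (char *)malloc(sizeof(char) * j);
        memcpy(*array, string, (j-1));
        (*array)[j-1] = '\0';
        string += j; // skip the substring and its delimiter
        array++;
    }
    *array = NULL; // terminate the array of pointers
    return base;
}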

Free this array later, like:

i = 0;
while(base[i]) {
    free(base[i]);
    i++;
}
free(base);
base = NULL;
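
On the calling side, the same NULL-terminated layout can be walked and released with two small helpers. This is only a sketch: print_parts and free_parts are made-up names, not part of the code above, and they assume the array ends with a NULL pointer as split() arranges.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print each substring on its own line and
 * return how many were printed. Expects a NULL-terminated array. */
size_t print_parts(char **parts) {
    assert(parts != NULL);
    size_t n = 0;
    while (parts[n] != NULL) {
        printf("%s\n", parts[n]); /* %s prints the whole string, not one char */
        n++;
    }
    return n;
}

/* Hypothetical helper: free every substring, then the parent array. */
void free_parts(char **parts) {
    if (parts == NULL) return;
    for (size_t i = 0; parts[i] != NULL; i++) {
        free(parts[i]);
    }
    free(parts);
}
```

In main you would then check argc, call split(argv[1], '.'), pass the result to print_parts, and finish with free_parts.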

Upvotes: 5

Shar

Reputation: 455

char *array = (char *)malloc(sizeof(char *) * (length + 1));

should be

char **array = (char **)malloc(sizeof(char *) * (length + 1));

and

*array = (char *)malloc(sizeof(char) * (j + 1));

should be

array[i] = (char *)malloc(sizeof(char) * (j + 1));

You seem to be a beginner, so I suggest you prefer array[i] over *array and other pointer manipulation; it is simpler at the beginning.
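
For example, the whole function could be written with indexing only. This is a rough sketch rather than a drop-in fix: split_indexed is a made-up name, the const on the parameter and the allocation checks are my additions, and it allocates count + 2 pointers (one per substring plus a terminating NULL).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: split `string` at each `delimiter`, using array[i] indexing
 * throughout. Returns a NULL-terminated array of newly allocated
 * strings, or NULL if an allocation fails. */
char **split_indexed(const char *string, char delimiter) {
    assert(string != NULL);

    size_t count = 0; /* number of delimiters found */
    for (size_t k = 0; string[k] != '\0'; k++) {
        if (string[k] == delimiter) count++;
    }

    /* count + 1 substrings, plus one slot for the terminating NULL */
    char **array = malloc(sizeof(char *) * (count + 2));
    if (array == NULL) return NULL;

    size_t start = 0; /* index where the current substring begins */
    for (size_t i = 0; i < count + 1; i++) {
        size_t j = 0; /* length of the current substring */
        while (string[start + j] != delimiter && string[start + j] != '\0') j++;

        array[i] = malloc(sizeof(char) * (j + 1));
        if (array[i] == NULL) {
            /* undo the allocations made so far */
            while (i > 0) free(array[--i]);
            free(array);
            return NULL;
        }
        memcpy(array[i], string + start, j);
        array[i][j] = '\0';
        start += j + 1; /* skip past the delimiter */
    }
    array[count + 1] = NULL;
    return array;
}
```

Because nothing is ever incremented except plain indices, there is no need to "rewind" string or array before returning.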

Upvotes: 0

legends2k

Reputation: 33004

    *array = (char *)malloc(sizeof(char) * (j + 1));

should be

    array = (char *)malloc(sizeof(char) * (j + 1));  // malloc returns a pointer, no need to dereference here

and then this

    while(*(string++) != delimiter) *(*array++) = *(string++);

should be

    while(*(string++) != delimiter) *array++ = *(string++); // dereferencing once is enough

and finally this

    **array = '\0';

should be

    *array = '\0'; // same as above

The reason for all of the above changes is the same: array is a pointer, not a pointer to a pointer.
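
To make the difference between the two levels concrete, here is a small sketch. The function pointer_levels_demo and the names flat and table are made up for illustration and are not part of your code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: returns 1 if both pointer levels behave as the
 * comments describe, 0 if an allocation fails. */
int pointer_levels_demo(void) {
    char *flat = malloc(sizeof(char) * 3);     /* *flat has type char */
    char **table = malloc(sizeof(char *) * 2); /* *table has type char * */
    if (flat == NULL || table == NULL) {
        free(flat);
        free(table);
        return 0;
    }

    /* the size of the pointed-to object shows what one dereference yields */
    assert(sizeof *flat == sizeof(char));
    assert(sizeof *table == sizeof(char *));

    *flat = 'h'; /* one dereference reaches a character */
    flat[1] = 'i';
    flat[2] = '\0';

    *table = flat; /* one dereference reaches a char * slot */
    table[1] = NULL;

    /* with char **, two dereferences are needed to reach a character */
    int ok = (**table == 'h') && (table[0][1] == 'i');

    free(flat);
    free(table);
    return ok;
}
```

With a declaration of char *array, an expression such as **array tries to dereference a plain char, which is what the "invalid type argument of unary *" message is complaining about.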

Additionally, in your code the loop index i is never initialized (neither are length and count; only j is), so reading it is undefined behaviour. Either initialize the variables in the declaration like

int length = 0, count = 0, i = 0, j = 0;

or, for i and j, in the loop initialization like

for(i = 0, j = 0; i < (count + 1); i++) {

Hope this helps!

Upvotes: 2
