Reputation: 185
I am trying to create a function that takes a string
, splits it into words
and return an array
with the words
in it. I am not allowed to use any pre-made functions other than malloc
within the splitting
function. Finally I have to set my function in this form char **ft_split_whitespaces(char *str)
My current output looks like that:
d this is me
s is me
s me
r
Expected output:
Hello
World
This
Is
Me
my full code is in the following codes:
#include <stdio.h>
#include <stdlib.h>
int count_words(char *str)
{
int i;
int word;
i = 0;
word = 1;
while(str[i]!='\0')
{
if(str[i]==' ' || str[i]=='\n' || str[i]=='\t'
|| str[i]=='\f' || str[i]=='\r' || str[i]=='\v')
word++;
i++;
}
return (word);
}
char **ft_split_whitespaces(char *str)
{
int index;
int size;
int index2;
char **arr;
index = 0;
index2 = 0;
size = count_words(str);
arr = (char **)malloc(size * sizeof(char));
if (arr == NULL)
return ((char **)NULL);
while (str[index])
{
if(str[index] == ' ')
{
index++;
value++;
index2++;
}
else
*(arr+index2) = (char*) malloc(index * sizeof(char));
*(arr+index2) = &str[index];
index++;
}
**arr = '\0';
return (arr);
}
int main()
{
char a[] = "Hello World This Is Me";
char **arr;
int i;
int ctr = count_words(a);
arr = ft_split_whitespaces(a);
for(i=0;i < ctr;i++)
printf("%s\n",arr[i]);
return 0;
}
Upvotes: 3
Views: 897
Reputation: 3465
You have quite a few errors in your program:
arr = (char **)malloc(size * sizeof(char));
is not right since
arr
is of type char**
. You should use sizeof(char*)
or better
(sizeof(*arr))
since sizeof(char)
is usually not equal to
sizeof(char*)
for modern systems.
You don't have braces {}
around your else
statement in
ft_split_whitespaces
which you probably intended. So your
conditional logic breaks.
You are allocating a new char[]
for every non--whitespace
character in the while
loop. You should only allocate one for
every new word and then just fill in the characters in that array.
*(arr+index2) = &str[index];
This doesn't do what you think it
does. It just points the string at *(arr+index2)
to str
offset
by index
. You either need to copy each character individually or
do a memcpy()
(which you probably can't use in the question). This
explains why your answer just provides offsets into the main string and
not the actual tokens.
**arr = '\0';
You will lose whatever you store in the 0th
index
of arr
. You need to individually append a \0
to each string in
arr
.
*(arr+index2) = (char*) malloc(index * sizeof(char));
You will end up
allocating progressively increasing size of char
arrays at because
you are using index
for the count of characters, which keeps on
increasing. You need to figure out the correct length of each token in
the string and allocate appropriately.
Also why *(arr + index2)
? Why not use the much easier to read arr[index2]
?
Further clarifications:
Consider str = "abc de"
You'll start with
*(arr + 0) = (char*) malloc(0 * sizeof(char));
//ptr from malloc(0) shouldn't be dereferenced and is mostly pointless (no pun), probably NULL
*(arr + 0) = &str[0];
Here str[0] = 'a'
and is a location somehwhere in memory, so on doing &str[0]
, you'll store that address in *(arr + 0)
Now in the next iteration, you'll have
*(arr + 0) = (char*) malloc(1 * sizeof(char));
*(arr + 0) = &str[1];
This time you replace the earlier malloc'd array at the same index2
again with a different address. In the next iterations *(arr + 0) = (char*) malloc(2 * sizeof(char));
. You end up resetting the same *(arr + index2)
position till you encounter a whitespace after which you do the same thing again for the next word. So don't allocate arrays for every index
value but only if and when required. Also, this shows that you'll keep on increasing the size passed to malloc
with the increasing value of index
which is what #6 indicated.
Coming to &str[index]
.
You are setting (arr + index2)
i.e. a char*
(pointer to char
) to another char*
. In C, setting a pointer to another pointer doesn't copy the contents of the second pointer to the first, but only makes both of them point to the same memory location. So when you set something like *(arr + 1) = &str[4]
, it's just a pointer into the original string at index = 4
. If you try to print this *(arr + 1)
you'll just get a substring from index = 4
to the end of the string, not the word you're trying to obtain.
**arr = '\0'
is just dereferencing the pointer at *arr
and setting its value to \0
. So imagine if you had *(arr + 0) = "hello\0"
, you'll set it to "\0ello\0"
. If you're ever iterating over this string, you'll never end up traversing beyond the first '\0'
character. Hence you lose whatever *arr
was earlier pointing to.
Also, *(arr + i)
and arr[i]
are exactly equivalent and make for much better readability. It better conveys that arr
is an array and arr[i]
is dereferencing the i
th element.
Upvotes: 3