Reputation: 187
Here's my code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct sentence {
char *words[8];
struct sentence *next;
};
void read_sentence(struct sentence *head, char *words) {
struct sentence *temp, *pointer;
temp = malloc(sizeof *temp);
pointer = head;
int i = 0;
char *delimiter = " ";
char *word = strtok(words, delimiter);
while (word != NULL) {
temp -> words[i++] = word;
word = strtok(NULL, delimiter);
}
while (pointer -> next) {
pointer = pointer -> next;
}
pointer -> next = temp;
}
struct sentence *split_sentences(char *buf) {
struct sentence *head;
head = malloc(sizeof *head);
head -> next = NULL;
char *delimiter = ".";
char *splitted = strtok(buf, delimiter);
while (splitted != NULL) {
read_sentence(head, splitted);
splitted = strtok(NULL, delimiter);
}
return head;
}
int main(int argc, char const *argv[]) {
struct sentence *iter = split_sentences("foo bar. baz qux");
}
This code parses an input string ("foo bar. baz qux") and builds a linked list of sentences using struct sentence. Each node should hold references to the words in its sentence, along with a reference to the next sentence.
Here's the Valgrind output:
Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
Command: ./a.out
Process terminating with default action of signal 11 (SIGSEGV)
Bad permissions for mapped region at address 0x400741
at 0x4EDAB9D: strtok_r (strtok_r.c:72)
by 0x400635: split_sentences (in /home/C/a.out)
by 0x4006A0: main (in /home/C/a.out)
It looks like there's an issue with the nested strtok?
Upvotes: 1
Views: 150
Reputation: 12698
You cannot nest strtok() calls: it keeps internal state that tracks its position in the string passed as its first parameter, and starting a second tokenization overwrites that state.
If you need strtok() at several loop levels, first run one loop over the original string to collect a list of pointers to the extracted parts, then iterate over that list and call strtok(3) on each of those strings. Repeat for every further level: collect another array of pointers, then iterate over it to split the input further.
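The two-pass idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `collect_parts`, `count_words`, and the `MAX_PARTS` limit are hypothetical names invented for this example, and the buffer passed in must be modifiable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARTS 8 /* hypothetical limit, chosen only for this sketch */

/* Pass helper: split buf on any character in delims and store up to max
 * token pointers in parts. The strtok sequence started here ends before
 * the function returns, so no other strtok sequence interrupts it.
 * Returns the number of pointers stored. */
size_t collect_parts(char *buf, const char *delims, char *parts[], size_t max)
{
    assert(buf != NULL && delims != NULL && parts != NULL);
    size_t n = 0;
    for (char *p = strtok(buf, delims); p != NULL && n < max;
         p = strtok(NULL, delims)) {
        parts[n++] = p;
    }
    return n;
}

/* First collect all sentences, then split each sentence into words.
 * Returns the total number of words found. */
size_t count_words(char *buf)
{
    char *sentences[MAX_PARTS];
    char *words[MAX_PARTS];
    size_t total = 0;

    /* Pass 1: the sentence-level tokenization finishes completely. */
    size_t ns = collect_parts(buf, ".", sentences, MAX_PARTS);

    /* Pass 2: each sentence gets its own, non-overlapping strtok sequence. */
    for (size_t i = 0; i < ns; i++) {
        total += collect_parts(sentences[i], " ", words, MAX_PARTS);
    }
    return total;
}
```

Calling `count_words` on a modifiable array such as `char text[] = "foo bar. baz qux";` never has two strtok sequences in progress at once, which is the point of the approach.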
Upvotes: 0
Reputation: 17438
strtok is not reentrant because it stores some state information in a static char * variable. (Think about it: when the first parameter is NULL, it continues where it left off on the original first parameter, so it needs to store the position where it left off somewhere.)
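To make the effect of that single saved position concrete, here is a small sketch (the function name `count_outer_iterations_nested` is made up for this illustration) that nests two strtok loops the way the question does and counts how often the outer loop runs.

```c
#include <assert.h>
#include <string.h>

/* Counts how many times the outer "sentence" loop runs when an inner
 * strtok sequence is started inside it. buf must be modifiable. */
int count_outer_iterations_nested(char *buf)
{
    assert(buf != NULL);
    int outer = 0;
    for (char *s = strtok(buf, "."); s != NULL; s = strtok(NULL, ".")) {
        outer++;
        /* Starting a new sequence on s overwrites the position that the
         * outer sequence had saved. */
        for (char *w = strtok(s, " "); w != NULL; w = strtok(NULL, " ")) {
            /* each word would be consumed here */
        }
        /* The next strtok(NULL, ".") resumes from the end of the inner
         * string, not from the rest of buf. */
    }
    return outer;
}
```

For a modifiable copy of "foo bar. baz qux" the function should return 1 rather than 2: once the inner sequence reaches the end of "foo bar", the outer call has nothing left to continue from, so the second sentence is silently skipped.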
POSIX.1-2001 and later define a reentrant alternative to strtok called strtok_r, which stores its state information in storage provided by the caller. strtok_r is not part of the C standard, but should be available on a POSIX.1-2001 compatible system.
Looking at your Valgrind output, it mentions strtok_r, so presumably the C library implementation of strtok is using strtok_r internally on your system. Therefore, you should be able to use it in your program.
void read_sentence(struct sentence *head, char *words) {
struct sentence *temp, *pointer;
temp = malloc(sizeof *temp);
temp -> next = NULL; /* malloc does not zero memory; terminate the list explicitly */
pointer = head;
int i = 0;
char *saveptr;
char *delimiter = " ";
char *word = strtok_r(words, delimiter, &saveptr);
while (word != NULL) {
temp -> words[i++] = word;
word = strtok_r(NULL, delimiter, &saveptr);
}
while (pointer -> next) {
pointer = pointer -> next;
}
pointer -> next = temp;
}
struct sentence *split_sentences(char *buf) {
struct sentence *head;
head = malloc(sizeof *head);
head -> next = NULL;
char *saveptr;
char *delimiter = ".";
char *splitted = strtok_r(buf, delimiter, &saveptr);
while (splitted != NULL) {
read_sentence(head, splitted);
splitted = strtok_r(NULL, delimiter, &saveptr);
}
return head;
}
Both strtok and strtok_r modify the buffer containing the string being split. Therefore, you cannot use them on a string literal: string literals are stored in an anonymous array of char that must not be modified, and attempting to modify it is undefined behavior. You need to change your main function to pass a modifiable array of char to split_sentences.
int main(int argc, char const *argv[]) {
char sentences[] = "foo bar. baz qux";
struct sentence *iter = split_sentences(sentences);
}
Upvotes: 1
Reputation: 18410
strtok() modifies the string. From the manpage:
These functions modify their first argument.
These functions cannot be used on constant strings.
But here
split_sentences("foo bar. baz qux");
you do exactly that. Try the same on a mutable string buffer, for example with
split_sentences(strdup("foo bar. baz qux"));
Furthermore, you might indeed need to use strtok_r(), because you are interleaving strtok() calls on two different buffers. The interleaving by itself will not lead to a segmentation fault, but it yields incorrect results.
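A rough sketch combining both points (a private mutable copy plus separate strtok_r states) might look like the following. The function name `count_words_in_copy` is hypothetical, and strdup and strtok_r are assumed to be available, as they are on POSIX systems. One thing to keep in mind with strdup is that every token pointer refers into the copy, so the copy may only be freed once the tokens are no longer needed.

```c
#define _POSIX_C_SOURCE 200809L /* requests strdup and strtok_r on strict compilers */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts the words in text without modifying text itself.
 * Returns -1 if the copy cannot be allocated. */
int count_words_in_copy(const char *text)
{
    assert(text != NULL);

    /* Tokenizing writes NUL bytes into the buffer, so work on a copy. */
    char *copy = strdup(text);
    if (copy == NULL) {
        return -1;
    }

    int count = 0;
    char *sentence_state; /* saved position of the outer sequence */
    char *word_state;     /* saved position of the inner sequence */

    for (char *s = strtok_r(copy, ".", &sentence_state); s != NULL;
         s = strtok_r(NULL, ".", &sentence_state)) {
        for (char *w = strtok_r(s, " ", &word_state); w != NULL;
             w = strtok_r(NULL, " ", &word_state)) {
            count++;
        }
    }

    /* All tokens pointed into copy; release it only after the last use. */
    free(copy);
    return count;
}
```

Because the function takes a const char *, it can be called directly with a string literal such as "foo bar. baz qux"; only the duplicate is ever modified.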
Upvotes: 4