SS'

Reputation: 857

Splitting a string by a delimiter in C

Given a path

/level1/level2/level3/level4

I want to split this string so that I can retrieve each individual entry,
i.e. "level1", "level2", "level3", "level4".

My first idea was to use strtok, but apparently most people recommend against using this function. What is another approach that lets me pass in a string (char* path) and get each entry, split at "/"?

Upvotes: 2

Views: 1065

Answers (2)

Schwern

Reputation: 164879

Splitting Unix paths takes more than just splitting on /. These all refer to the same path:

  • /foo/bar/baz/
  • /foo/bar/baz
  • /foo//bar/baz

As with many complex tasks, it's best not to do it yourself, but to use existing functions. In this case there are the POSIX dirname and basename functions.

  • dirname returns the parent path in a filepath
  • basename returns the last portion of a filepath

Using these together, you can split Unix paths.

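#include <stdio.h>
#include <string.h>
#include <libgen.h>

int main(void) {
    // basename() and dirname() may modify their argument, so use a
    // writable array rather than a pointer to a string literal.
    char filepath[] = "/foo/bar//baz/";

    char *fp = filepath;
    // Peel components off the end until only the root ("/") or the
    // current directory (".") remains.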
    while( strcmp(fp, "/") != 0 && strcmp(fp, ".") != 0 ) {
        char *base = basename(fp);
        puts(base);

        fp = dirname(fp);
    }

    // Differentiate between /foo/bar and foo/bar
    if( strcmp(fp, "/") == 0 ) {
        puts(fp);
    }
}

// baz
// bar
// foo
// /

This is not the most efficient approach, since it makes multiple passes through the string, but it is correct. Note that the components come out in reverse order, last one first.

Upvotes: 4

dbush

Reputation: 223992

strtok is actually the preferred way to tokenize a string such as this. You just need to be aware that:

  • The original string is modified
  • The function keeps static state between calls, so it's not thread-safe and you can't interleave parsing of two separate strings.

If you don't want the original string modified, make a copy using strdup and work on the copy, then copy the results as needed. If you need to worry about multiple threads or interleaved usage, use strtok_r instead, which takes an additional state parameter.
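
As a rough illustration of the strdup plus strtok_r combination described above, here is a minimal sketch. The helper name split_path, its parameters, and its return convention are hypothetical and chosen only for this example; strdup and strtok_r are POSIX functions rather than ISO C, so this assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strdup() and strtok_r() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `path` on '/' without modifying it.
 * Stores up to `max` newly allocated component strings in `out`.
 * Returns the number of components stored, or -1 if allocation fails.
 * The caller is responsible for free()ing each stored string. */
int split_path(const char *path, char **out, size_t max)
{
    assert(path != NULL && out != NULL);

    /* strtok_r() writes NUL bytes into its input, so work on a copy. */
    char *copy = strdup(path);
    if (copy == NULL) {
        return -1;
    }

    size_t n = 0;
    char *state = NULL;  /* per-call parsing state, instead of strtok's static one */
    for (char *tok = strtok_r(copy, "/", &state);
         tok != NULL && n < max;
         tok = strtok_r(NULL, "/", &state)) {
        out[n] = strdup(tok);  /* copy the token so it outlives `copy` */
        if (out[n] == NULL) {
            while (n > 0) {
                free(out[--n]);
            }
            free(copy);
            return -1;
        }
        n++;
    }

    free(copy);
    return (int)n;
}
```

Called with the path from the question, /level1/level2/level3/level4, and room for at least four entries, this should store "level1" through "level4" in order. Because strtok_r treats a run of delimiters as a single separator, empty components (from a leading, trailing, or doubled /) are skipped, which also means the result alone cannot tell /foo/bar apart from foo/bar.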

Upvotes: 5
