JavascriptLoser
JavascriptLoser

Reputation: 1962

How to check if a string exists in an array of character arrays

I'm looking for a way to check if a specific string exists in a large array of strings. The array is multi-dimensional: all_strings[strings][chars];. So essentially, this array is an array of character arrays. Each character array ends in '\0'

Given another array of characters, I need to check to see if those characters are already in all_strings, kind of similar to the python in keyword.

I'm not really sure how to go about this at all, I know that strcmp might help but I'm not sure how I could implement it.

Upvotes: 3

Views: 12825

Answers (3)

user539810
user539810

Reputation:

You could simply check if a string exists in an array of strings. A better solution might be to actually return the string:

/*
 * haystack: The array of strings to search.
 * needle: The string to find.
 * max: The number of strings to search in "haystack".
 */
char *
string_find(char **haystack, char *needle, size_t max)
{
    char **end = haystack + max;
    for (; haystack != end; ++haystack)
        if (strcmp(*haystack, needle) == 0)
            return *haystack;
    return NULL;
}

If you're wanting the behavior of a set, where all strings in the array are unique, then you can use it that way:

typedef struct set_strings {
    char **s_arr;
    size_t count;
    size_t max;
} StringSet;
.
.
.
int
StringSet_add(StringSet *set, const char *str)
{
    // If string exists already, the add operation is "successful".
    if (string_find(set->s_arr, str, set->count))
        return 1;

    // Add string to set and return success if possible.
    /*
     * Code to add string to StringSet would go here.
     */
    return 1;
}

If you want to actually do something with the string, you can use it that way too:

/*
 * Reverse the characters of a string.
 *
 * str: The string to reverse.
 * n: The number of characters to reverse.
 */
void
reverse_str(char *str, size_t n)
{
    char c;
    char *end;

    for (end = str + n; str < --end; ++str) {
        c = *str;
        *str = *end;
        *end = c;
    }
}
.
.
.
    char *found = string_find(words, word, word_count);
    if (found)
        reverse_str(found, strlen(found));

As a general-purpose algorithm, this is reasonably useful and even can be applied to other data types as necessary (some re-working would be required of course). As pointed out by undefined behaviour's answer, it won't be fast on large amounts of strings, but it is good enough for something simple and small.

If you need something faster, the recommendations given in that answer are good. If you're coding something yourself, and you're able to keep things sorted, it's a great idea to do that. This allows you to use a much better search algorithm than a linear search. The standard bsearch is great, but if you want something suitable for fast insertion, you'd probably want a search routine that would provide you with the position to insert a new item to avoid searching for the position after bsearch returns NULL. In other words, why search twice when you can search once and accomplish the same thing?

Upvotes: 1

lurker
lurker

Reputation: 58244

Use a for loop. C doesn't have a built-in like Python's in:

int i;

for ( i = 0; i < NUM_STRINGS; i++ )
    if ( strcmp(all_strings[i], my_other_string) == 0 )
        break;

// Here, i is the index of the matched string in all_strings.
//   If i == NUM_STRINGS, then the string wasn't found

If you want it to act like Python's in, you could make it a function:

// Assumes C99
#include <string.h>
#include <stdbool.h>

bool string_in(char *my_str, char *string_list[], size_t num_strings)
{
    for ( int i = 0; i < num_strings; i++ )
        if (strcmp(my_str, string_list[i]) == 0 )
            return true;

    return false;
}

Upvotes: 3

autistic
autistic

Reputation: 15642

As lurker suggested, the naive method is to simply loop on the array of strings calling strcmp. His string_in function is unfortunately broken due to a misunderstanding regarding sizeof(string_list), and should probably look like this:

#include <string.h>
int string_in(char *needle, char **haystack, size_t haystack_size) {
    for (size_t x = 0; x < haystack_size; x++) {
         if (strcmp(needle, haystack[x]) == 0) {
             return 1;
         }
    }
    return 0;
}

This is fairly inefficient, however. It'll do if you're only going to use it once in a while, particularly on a small collection of strings, but if you're looking for an efficient way to perform the search again and again, changing the search query for each search, the two options I would consider are:

  • If all_strings is relatively static, you could sort your array like so: qsort(all_strings, strings, chars, strcmp);... Then when you want to determine whether a word is present, you can use bsearch to execute a binary search like so: char *result = bsearch(search_query, all_strings, strings, chars, strcmp);. Note that when all_strings changes, you'll need to sort it again.
  • If all_strings changes too often, you'll probably benefit from using some other data structure such as a trie or a hash table.

Upvotes: 5

Related Questions