zzca

Reputation: 13

Strip leading and trailing non-alphabetical characters

char *process(char *string) {
    char *newWord = malloc(strlen(string) * sizeof(char) + 1);
    if (newWord == NULL) {
        fprintf(stderr, "Memory error.\n");
        exit(1);
    }

    char *sptr = string;
    char *nptr = newWord;
    char *lastLetter = newWord;

    // Skip leading non-alphabetical characters
    while (!isalpha(*sptr) && *sptr != '\0') {
        sptr++;
    }
    // Deal with empty string
    if (*sptr == '\0') {
        *newWord = '\0';
    }
    else {
        // Process all letters and keep track of last letter to remove trailing special characters
        while (*sptr != '\0') {
            if (isalpha(*sptr)) {
                *nptr = tolower(*sptr);
                lastLetter = nptr;
                nptr++;
            }
            sptr++;
        }
        // Remove trailing special characters by null-terminating after the last letter seen.
        *(lastLetter + 1) = '\0';
    }
    return newWord;
}

I have this function that returns a word after trimming leading and trailing non-alphabetical characters. My problem is that it also removes non-alphabetical characters in the middle of a word, and I can't figure out what I need to change to stop that. For example:

word-word should return word-word, keeping the '-'.

Other examples are words like didn't and don't: the function removes the apostrophe. Any help?

Upvotes: 1

Views: 42

Answers (1)

Adrian Mole

Reputation: 51864

You can do this more efficiently by making use of:

(a) The strdup() function (specified by POSIX and part of standard C as of C23), which effectively does a malloc and strcpy in one fell swoop.

(b) 'Backward iteration' of the string, replacing non-alpha characters with a nul character until an alpha is found (stopping as soon as that happens).

#include <ctype.h>
#include <stdlib.h>
#include <string.h>

char* process(char* string)
{
    // First, we can 'forward iterate' until we find an ALPHA character ...
    // (The cast to unsigned char avoids undefined behavior for negative char values.)
    char* fp = string;
    while (*fp && !isalpha((unsigned char)*fp)) ++fp;
    // If we have found the NUL terminator, we have an empty string left...
    if (!*fp) return NULL; // Nothing left!

    // We don't need to check again for a valid (alpha) character: there WILL be at least one!
    char* result = strdup(fp); // Allocate memory and copy current string!
    if (!result) return NULL; // Allocation failed (also reported as NULL)
    // Now, we can 'backward iterate' until we get to an ALPHA...
    char* bp = result + strlen(result) - 1;
    while (!isalpha((unsigned char)*bp)) *bp-- = '\0'; // Replace with null character and THEN decrement

    // Finally, convert to lowercase:
    for (fp = result; *fp; ++fp) *fp = (char)tolower((unsigned char)*fp);
    return result;
}

Replacing the trailing non-alpha characters with nul characters potentially 'wastes' memory (the returned buffer will likely be longer than the string it contains), but it is simple. If that is a problem, one could make a further strdup call on the 'result' string and then free the original buffer.
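If the oversized buffer matters, another option is to locate the last letter before allocating, so that only the needed bytes are requested. The following is a minimal sketch of that idea, not a replacement for the function above; the name process_exact is hypothetical, and it assumes the same convention of returning NULL when no letter is found (it also returns NULL if malloc fails).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of process(): trims leading and trailing non-letters,
   lowercases what remains, and allocates exactly the bytes needed.
   Returns NULL if the input contains no letter or if allocation fails. */
char *process_exact(const char *string)
{
    assert(string != NULL);

    /* Find the first letter. */
    const char *first = string;
    while (*first && !isalpha((unsigned char)*first)) ++first;
    if (!*first) return NULL; /* No letters at all */

    /* Find the last letter by scanning back from the end.
       At least one letter exists, so this loop stops at or after 'first'. */
    const char *last = first + strlen(first) - 1;
    while (!isalpha((unsigned char)*last)) --last;

    /* Copy only the span [first, last], lowercasing as we go. */
    size_t len = (size_t)(last - first) + 1;
    char *result = malloc(len + 1);
    if (result == NULL) return NULL;

    for (size_t i = 0; i < len; ++i)
        result[i] = (char)tolower((unsigned char)first[i]);
    result[len] = '\0';
    return result;
}
```

As with process, the caller owns the returned buffer and should free it. Characters between the first and last letter (such as '-' or an apostrophe) are copied unchanged, because tolower leaves non-letters as they are.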

Here is a short main that you can use to test the above function:

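#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char test[256];
    printf("Enter a string: ");
    // Limit input to the buffer size and stop if nothing could be read
    if (scanf("%255s", test) != 1) return 1;
    char* answer = process(test);
    if (answer) {
        printf("Processed string: %s\n", answer);
        free(answer);
    }
    else {
        printf("Nothing left after processing!\n");
    }
    return 0;
}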

Please feel free to ask for any further clarification and/or explanation.

Upvotes: 1
